Brian-Elbel-s-Research-Projects / project-overview

A summary of current and past research projects in the Elbel lab
Clustering SE and robust SE in `plm` library #65

Closed eriliawu closed 1 year ago

EmilHafeez commented 1 year ago

I believe this can be closed until peer review for Paper 1 returns comments; the discussion involved shifting to felm or a similar package.

The problem is that the commands to extract robust SEs from a plm object do not support weighted plm models, which we of course use. Lloyd has a tentative workaround using an lm object instead, though the discussion also considered converting to felm() or similar.

eriliawu commented 1 year ago

Implementing fixed effects and clustered standard errors with weights is possible with the felm function from the lfe library.

See link here.

As the link above explains, the code is loosely like this: `felm(y ~ x1 + x2 | fixedeffect1 + fixedeffect2 | Ivvariable | clustervar1 + clustervar2, data = data)`. Replace a section with 0 if it is not needed.

In the felm documentation, `weights` is listed as an argument.
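To make the four-part formula concrete, here is a minimal sketch on simulated data. The data frame `toy` and all of its column names (`unit`, `period`, `x1`, `x2`, `w`, `y`) are made up for illustration and are not from our analysis; only `felm()` and its `weights` argument come from lfe.

```r
library(lfe)

set.seed(1)
n_units <- 30    # hypothetical number of panel units
n_periods <- 8   # hypothetical number of time periods
toy <- data.frame(
  unit   = factor(rep(seq_len(n_units), each = n_periods)),
  period = factor(rep(seq_len(n_periods), times = n_units)),
  x1     = rnorm(n_units * n_periods),
  x2     = rnorm(n_units * n_periods),
  w      = runif(n_units * n_periods, 0.5, 2)  # hypothetical regression weights
)
toy$y <- 1.5 * toy$x1 - 0.5 * toy$x2 + rnorm(nrow(toy))

# covariates | fixed effects | instruments (0 = none) | cluster variable
fit <- felm(y ~ x1 + x2 | unit + period | 0 | unit,
            data = toy, weights = toy$w)

# With a cluster variable in the formula, the summary reports clustered SEs
summary(fit)$coefficients
```

The weights are passed as a vector (`toy$w`), which mirrors how the weights are supplied in the sample code later in this thread.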

EmilHafeez commented 1 year ago

Marking this as closed and keeping it as a reference for when/if Paper 1 (or any current implementation) requires it.

lloydheng commented 1 year ago

The second comment by Grant here compares the msummary output for an felm object with robust SEs specified against the output of an lm object passed through coeftest. They are identical.

EmilHafeez commented 1 year ago

So, sample code below for getting either the robust standard errors or the clustered standard errors.

```r
library(dplyr)
library(plm)
library(lfe)

# Subset to the current matched group (i is defined by the surrounding code)
matched_by_location_analysis_df_analytic <- matched_by_location_analysis_df %>%
  filter(matching_identifier == i)

# Set "-3" as the reference level of relative2.factor
matched_by_location_analysis_df_analytic <- within(matched_by_location_analysis_df_analytic, relative2.factor <- relevel(relative2.factor, ref = "-3"))

# Original weighted plm model
mod.factor <- plm(formula = calorie ~ treat * relative2.factor + as.factor(month),
                  data = matched_by_location_analysis_df_analytic,
                  index = "regression_index", weights = (weight1 * weight2), model = "within")

# Clustered SEs: cluster on regression_index (fourth part of the formula)
test.mod1 <- felm(formula = calorie ~ treat * relative2.factor | regression_index | 0 | regression_index,
                  data = matched_by_location_analysis_df_analytic,
                  weights = (matched_by_location_analysis_df_analytic$weight1 * matched_by_location_analysis_df_analytic$weight2))

# No cluster variable: robust SEs are available via summary(test.mod2, robust = TRUE)
test.mod2 <- felm(formula = calorie ~ treat * relative2.factor | regression_index,
                  data = matched_by_location_analysis_df_analytic,
                  weights = (matched_by_location_analysis_df_analytic$weight1 * matched_by_location_analysis_df_analytic$weight2))
```

Note that in the subsequent tidying chunk, you will need to remove the `treat` row, as felm() outputs an empty row for it. Consider adding something like `filter(!grepl("^treat$", month))` to your pipe. The caret and `$` anchor the start and end of the string, so the pattern matches only the exact string `treat` and you avoid removing all the `treat:relative2.factor##` rows you want to keep.
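A quick base R illustration of why the anchors matter. The vector `term` below is a hypothetical set of row labels, not our actual tidied output.

```r
# Hypothetical term labels, mimicking the rows of a tidied felm output
term <- c("treat", "relative2.factor-2",
          "treat:relative2.factor-2", "treat:relative2.factor4")

# Unanchored pattern: matches every label that contains "treat",
# so the interaction rows are dropped as well
unanchored <- term[!grepl("treat", term)]

# Anchored pattern: matches only the label that is exactly "treat"
anchored <- term[!grepl("^treat$", term)]

print(unanchored)
print(anchored)
```

Only the anchored version keeps the interaction rows.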

Also note that when running significance tests for the DiD plots, linearHypothesis() will throw an error about multicollinearity in the mod.factor felm object. This is a known issue; you can suppress the error with the option `singular.ok = TRUE`, as in the below:

```r
for (i in 4:23) {
  tmp$p[i - 1] <- linearHypothesis(mod.factor, paste0(presum, " = 6*treat:relative2.factor", i), singular.ok = TRUE)[2, 4]
}
```
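For anyone who wants to see the `singular.ok` behaviour in isolation, here is a small sketch using an ordinary lm object from simulated data rather than our model. The data frame `d` and its columns are hypothetical; `x3` is built to be perfectly collinear so that its coefficient is aliased.

```r
library(car)

set.seed(2)
# Hypothetical data: x3 is an exact linear combination of x1 and x2,
# so lm() reports its coefficient as NA (aliased)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$x3 <- d$x1 + d$x2
d$y <- 1 + 2 * d$x1 + 2 * d$x2 + rnorm(100)
fit <- lm(y ~ x1 + x2 + x3, data = d)

# Without singular.ok = TRUE the call stops because of the aliased coefficient;
# the error is captured here instead of halting the script
strict <- tryCatch(linearHypothesis(fit, "x1 = x2"), error = function(e) e)

# With singular.ok = TRUE the test runs on the estimable coefficients
res <- linearHypothesis(fit, "x1 = x2", singular.ok = TRUE)
p_value <- res[2, "Pr(>F)"]
```

The p-value is picked by column name here rather than by position, since the layout of the result table can differ between model classes.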