Clustering SE and robust SE in `plm` library

EmilHafeez commented 1 year ago

I believe this can be closed until peer review for Paper 1 returns comments. Discussion involved shifting to felm or a similar package.

The problem is that the commands to extract robust SEs from the plm object do not support weighted plm models, which we of course use. Lloyd has a tentative workaround using an lm object instead, though discussion also considered converting to felm() or similar.

eriliawu commented 1 year ago

Implementing fixed effects and clustered standard errors with weights is possible with the felm() function from the lfe library.

See link here.

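As the link above explains, the call looks roughly like this: `felm(y ~ x1 + x2 | fixedeffect1 + fixedeffect2 | IVvariable | clustervar1 + clustervar2, data = data)`. The four parts of the formula are the covariates, the fixed effects, the instrumental-variable specification, and the cluster variables. Replace a part with 0 if it is not needed.

Per the felm documentation, `weights` is an available argument.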
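To make the four-part formula concrete, here is a minimal sketch on simulated data. The data frame `toy` and all of its column names are made up for illustration; only `felm()`, its `weights` argument, and the formula layout come from the lfe package.

```r
library(lfe)

set.seed(1)
n <- 200
# Simulated data; every name here is hypothetical
toy <- data.frame(
  y     = rnorm(n),
  x1    = rnorm(n),
  x2    = rnorm(n),
  store = factor(sample(1:20, n, replace = TRUE)),  # first fixed effect
  month = factor(sample(1:12, n, replace = TRUE)),  # second fixed effect
  w     = runif(n, 0.5, 2)                          # positive weights
)

# Formula parts: covariates | fixed effects | IV (0 = none) | cluster variable(s)
fit <- felm(y ~ x1 + x2 | store + month | 0 | store,
            data = toy, weights = toy$w)

# With a cluster part in the formula, summary() reports clustered SEs
summary(fit)
```

The weights are passed as a plain vector (`toy$w`) rather than a bare column name, which matches how the weights are supplied in the felm calls later in this thread.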

EmilHafeez commented 1 year ago

Marking this as closed, and keeping it as a reference for when/if Paper 1 (or any current implementation) requires it.

lloydheng commented 1 year ago

The second comment by Grant here compares the msummary output of an felm object with robust SEs specified against the output of an lm object passed through coeftest. They are identical.

EmilHafeez commented 1 year ago

So, sample code is below for getting either robust standard errors or clustered standard errors.

- The solution for getting robust standard errors is to convert the plm estimation command into an felm estimation command. This is straightforward, though the weight vector has to be specified slightly differently (with full `data$column` references) so that the weights can be found. If you summary() the resulting felm object directly, it will look like the plm() output; to get the robust SEs you need to tidy.felm() the object with the option se.type = "robust". This is the test.mod2 approach in the code below.
- The solution for getting CLUSTER robust standard errors is to convert the plm estimation command into an felm estimation command with a slightly different setup, where the cluster variable goes in the fourth part of the formula. See the example. If you summary() the resulting felm object directly, the results will look different from the plm() output. This is the test.mod1 approach in the code below.
- In both cases, filtering the data to the correct months (and other subsets) beforehand rather than inside the command is preferable, because it prevents an error reporting that the lengths of the data argument and the weight argument differ.
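As a rough illustration of the points above, the following sketch subsets first, builds the weight vector from the subset, and then pulls robust and clustered SEs out of two felm fits. The data are simulated and all names (`toy`, `id`, `w1`, `w2`, and so on) are hypothetical; it also assumes a version of broom whose felm tidier accepts `se.type`.

```r
library(lfe)
library(broom)

set.seed(42)
n <- 300
# Simulated stand-in for the analytic data; every name is made up
toy <- data.frame(
  calorie = rnorm(n, mean = 1000, sd = 150),
  x       = rnorm(n),
  id      = factor(sample(1:30, n, replace = TRUE)),
  month   = sample(1:12, n, replace = TRUE),
  w1      = runif(n, 0.5, 1.5),
  w2      = runif(n, 0.5, 1.5)
)

# Subset first, so the data and the weight vector have the same length
toy_sub <- droplevels(toy[toy$month <= 6, ])
w <- toy_sub$w1 * toy_sub$w2

# No cluster part: robust SEs are requested when tidying
fit_robust <- felm(calorie ~ x | id, data = toy_sub, weights = w)
robust_tbl <- tidy(fit_robust, se.type = "robust")

# Cluster part given: tidy() defaults to the clustered SEs of the model
fit_cluster <- felm(calorie ~ x | id | 0 | id, data = toy_sub, weights = w)
cluster_tbl <- tidy(fit_cluster)

robust_tbl
cluster_tbl
```

Building `w` from `toy_sub` rather than from `toy` is the point of the filtering advice: the weight vector is created after the subset, so its length always matches the rows passed to `data`.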

    library(dplyr)  # %>% and filter()
    library(plm)    # plm()
    library(lfe)    # felm()

    # Subset before estimating; i comes from the enclosing loop (not shown)
    matched_by_location_analysis_df_analytic <- matched_by_location_analysis_df %>%
      filter(matching_identifier == i)

    # Use relative month -3 as the reference level
    matched_by_location_analysis_df_analytic <- within(matched_by_location_analysis_df_analytic,
                                                       relative2.factor <- relevel(relative2.factor, ref = "-3"))

    # Original weighted plm model
    mod.factor <- plm(formula = calorie ~ treat * relative2.factor + as.factor(month),
                      data = matched_by_location_analysis_df_analytic,
                      index = "regression_index", weights = (weight1 * weight2), model = "within")

    # test.mod1: felm with SEs clustered on regression_index (fourth formula part)
    test.mod1 <- felm(formula = calorie ~ treat * relative2.factor | regression_index | 0 | regression_index,
                      data = matched_by_location_analysis_df_analytic,
                      weights = (matched_by_location_analysis_df_analytic$weight1 * matched_by_location_analysis_df_analytic$weight2))

    # test.mod2: felm without a cluster part; tidy with se.type = "robust" for robust SEs
    test.mod2 <- felm(formula = calorie ~ treat * relative2.factor | regression_index,
                      data = matched_by_location_analysis_df_analytic,
                      weights = (matched_by_location_analysis_df_analytic$weight1 * matched_by_location_analysis_df_analytic$weight2))

*** Note that in the subsequent tidying chunk, you will need to remove the treat row, as felm() outputs an empty row for it. Consider adding something like `filter(!grepl("^treat$", month))` to your pipe. The caret anchors the beginning of the string and the $ anchors the end, so only the bare treat row is dropped and all the treat:relative2.factor## rows you want to keep are left alone.
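A small sketch of why the anchors matter, using a made-up tidied table. The column name `term` and the values are hypothetical; in our pipe the column being filtered is called `month`.

```r
# Hypothetical tidied output; names and numbers are for illustration only
tidied <- data.frame(
  term     = c("treat", "treat:relative2.factor-2", "treat:relative2.factor1"),
  estimate = c(NaN, 12.5, -3.1),
  stringsAsFactors = FALSE
)

grepl("treat", tidied$term)    # TRUE TRUE TRUE: unanchored, would drop every row
grepl("^treat$", tidied$term)  # TRUE FALSE FALSE: matches only the bare term

# Keep everything except the bare treat row
kept <- tidied[!grepl("^treat$", tidied$term), ]
kept$term
```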

*** Also note that when running significance tests for the DiD plots, the linearHypothesis() test will throw an error for multicollinearity in the mod.factor felm object. This is a known issue, and you can suppress the error with the option singular.ok = TRUE, as in the code below.

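    # presum (defined elsewhere) is the left-hand side of each hypothesis; [2, 4] is used as the p-value
    for (i in 4:23) {
      tmp$p[i - 1] <- linearHypothesis(mod.factor, paste0(presum, " = 6*treat:relative2.factor", i),
                                       singular.ok = TRUE)[2, 4]
    }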
