jepusto / clubSandwich

Cluster-robust (sandwich) variance estimators with small-sample corrections
http://jepusto.github.io/clubSandwich/
47 stars 8 forks source link

Problem when using a fixest::feols-object #89

Open GustavFlorinAagren opened 1 year ago

GustavFlorinAagren commented 1 year ago

Hi! This seems like a great tool I should use in a project I'm working on. Though, when I try it out with a model estimated with fixest::feols the results get weird.

I followed the tutorial (found on https://cran.r-project.org/web/packages/clubSandwich/vignettes/panel-data-CRVE.html) and first, put: coef_test(twfe_naive, vcov = "CR1", cluster = AJZ$Region, test = "naive-t")[1:14,]

and get: Coef. Estimate SE t-stat d.f. (naive-t) p-val (naive-t). Sig. Year::2000:treat 0.00466 7.19e-15 6.48e+11 6 <0.001 Year::2001:treat -0.02283 7.21e-15 -3.16e+12 6 <0.001 Year::2002:treat 0.00995 6.99e-15 1.42e+12 6 <0.001 Year::2003:treat 0.05794 6.45e-15 8.99e+12 6 <0.001 Year::2004:treat 0.03210 6.64e-15 4.83e+12 6 <0.001 Year::2005:treat 0.01422 7.43e-15 1.91e+12 6 <0.001 Year::2007:treat 0.02797 5.15e-15 5.43e+12 6 <0.001 Year::2008:treat 0.07875 7.37e-15 1.07e+13 6 <0.001 Year::2009:treat 0.02238 6.97e-15 3.21e+12 6 <0.001 Year::2010:treat 0.12855 7.13e-15 1.80e+13 6 <0.001 Year::2011:treat 0.16441 7.37e-15 2.23e+13 6 <0.001 Year::2012:treat 0.08878 6.55e-15 1.35e+13 6 <0.001 Year::2013:treat 0.13912 6.78e-15 2.05e+13 6 <0.001 Year::2014:treat 0.12761 6.88e-15 1.85e+13 6 <0.001

But when I then tried the next step: coef_test(twfe_naive, vcov = "CR2", cluster = AJZ$Region, test = "Satterthwaite")[1:14,]

I get: Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt) Sig. Year::2000:treat 0.00466 0 Inf NaN NA NA Year::2001:treat -0.02283 0 -Inf NaN NA NA Year::2002:treat 0.00995 0 Inf NaN NA NA Year::2003:treat 0.05794 0 Inf NaN NA NA Year::2004:treat 0.03210 0 Inf NaN NA NA Year::2005:treat 0.01422 0 Inf NaN NA NA Year::2007:treat 0.02797 0 Inf NaN NA NA Year::2008:treat 0.07875 0 Inf NaN NA NA Year::2009:treat 0.02238 0 Inf NaN NA NA Year::2010:treat 0.12855 0 Inf NaN NA NA Year::2011:treat 0.16441 0 Inf NaN NA NA Year::2012:treat 0.08878 0 Inf NaN NA NA Year::2013:treat 0.13912 0 Inf NaN NA NA Year::2014:treat 0.12761 0 Inf NaN NA NA

Why do I get this kind of result? Is it because it's a fixest::feols-model? Or could it be due to the data? More than happy to get help in getting to know what is wrong.

jepusto commented 1 year ago

Unfortunately, the package does not currently support fixest (see #61). Doing so requires providing some S3 methods for feols objects, and the default methods fail for CR2-type vcov matrices (due to lack of a targetVariance() method).

Would you be able to provide a minimal working example of the issue you've encountered? If so, it would help me to debug and perhaps I can get fixest::feols working in short order.

Alternately (or additionally), the package does support panel data models estimated with the plm package, so you could try that instead.

GustavFlorinAagren commented 1 year ago

When you say working example, do you mean data and code for the project?

jepusto commented 1 year ago

Here's some background on MWE. It would not require the actual data and code for your project. It would be fine to use artificial data or an example dataset from the fixest package. The idea is to provide code that is a) as simple as possible but b) still demonstrates the issue you've identified.

GustavFlorinAagren commented 1 year ago

I see. Well then interestingly this call (with the fixest example data called base_did) doesn't render the same problem: data(base_did)

est_did <- feols(y ~ x1 + i(period, treat, 5) | id + period, base_did) est_did

coef_test(est_did, vcov = "CR1", cluster = base_did$id, test = "naive-t")[1:10,]

The result of the naive approach is: Coef. Estimate SE t-stat d.f. (naive-t) p-val (naive-t) Sig. x1 0.973 0.0453 21.484 107 < 0.001 period::1:treat -1.403 1.6718 -0.839 107 0.40321
period::2:treat -1.248 1.6492 -0.756 107 0.45106
period::3:treat -0.273 1.7299 -0.158 107 0.87481
period::4:treat -1.796 1.6139 -1.113 107 0.26836
period::6:treat 0.784 1.4843 0.528 107 0.59825
period::7:treat 3.599 1.4571 2.470 107 0.01510
period::8:treat 3.812 1.6642 2.290 107 0.02396 * period::9:treat 4.731 1.7559 2.695 107 0.00819 * period::10:treat 6.606 1.7742 3.724 107 < 0.001

coef_test(est_did, vcov = "CR2", cluster = base_did$id, test = "Satterthwaite")[1:10,]

The result of the last call is: Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt) Sig. x1 0.973 0.0456 21.330 9.32 <0.001 *** period::1:treat -1.403 1.6812 -0.835 1.33 0.527
period::2:treat -1.248 1.6588 -0.752 1.33 0.562
period::3:treat -0.273 1.7403 -0.157 1.33 0.896
period::4:treat -1.796 1.6236 -1.106 1.33 0.429
period::6:treat 0.784 1.4931 0.525 1.33 0.673
period::7:treat 3.599 1.4656 2.456 1.33 0.193
period::8:treat 3.812 1.6732 2.278 1.33 0.210
period::9:treat 4.731 1.7656 2.680 1.33 0.174
period::10:treat 6.606 1.7848 3.701 1.33 0.117

So these results don't seem to have the same problem as mine, do they? With this I mean that there doesn't seem to be a problem caused by fixest::feols. The problem must be coming from my data I suppose.

GustavFlorinAagren commented 1 year ago

Realizing that the example I created had an x1-variable first on the RHS before the i(), I tried the same using my data on a model with an additional variable in front.

coef_test(twfe, vcov = "CR1", cluster = AJZ$Region, test = "naive-t")[1:15,]

gives me:

                    Coef. Estimate      SE  t-stat d.f. (naive-t) p-val (naive-t)     Sig.

top1_logwealth 0.62777 0.74221 0.8458 6 0.4301 Year::2000:treat 0.12770 0.14547 0.8778 6 0.4138 Year::2001:treat 0.06459 0.10335 0.6250 6 0.5550 Year::2002:treat 0.05977 0.05890 1.0148 6 0.3494 Year::2003:treat 0.03822 0.02332 1.6389 6 0.1524 Year::2004:treat 0.02799 0.00486 5.7568 6 0.0012 Year::2005:treat 0.03153 0.02047 1.5404 6 0.1744 ** Year::2007:treat -0.01111 0.04621 -0.2405 6 0.8179 Year::2008:treat -0.00203 0.09551 -0.0212 6 0.9837 Year::2009:treat 0.00271 0.02325 0.1165 6 0.9111 Year::2010:treat 0.02857 0.11821 0.2417 6 0.8171 Year::2011:treat 0.06810 0.11386 0.5981 6 0.5716 Year::2012:treat 0.01094 0.09203 0.1189 6 0.9092 Year::2013:treat 0.03407 0.12420 0.2743 6 0.7930 Year::2014:treat 0.01361 0.13478 0.1010 6 0.9229

and coef_test(twfe, vcov = "CR2", cluster = AJZ$Region, test = "Satterthwaite")[1:15,]

gives me:

                        Coef. Estimate      SE  t-stat d.f. (Satt) p-val (Satt) Sig.

top1_logwealth 0.62777 0.78554 0.7992 1 0.571
Year::2000:treat 0.12770 0.15396 0.8294 1 0.559
Year::2001:treat 0.06459 0.10938 0.5905 1 0.660
Year::2002:treat 0.05977 0.06234 0.9588 1 0.513
Year::2003:treat 0.03822 0.02468 1.5485 1 0.365
Year::2004:treat 0.02799 0.00515 5.4393 1 0.116
Year::2005:treat 0.03153 0.02167 1.4554 1 0.383
Year::2007:treat -0.01111 0.04890 -0.2272 1 0.858
Year::2008:treat -0.00203 0.10108 -0.0201 1 0.987
Year::2009:treat 0.00271 0.02461 0.1100 1 0.930
Year::2010:treat 0.02857 0.12511 0.2283 1 0.857
Year::2011:treat 0.06810 0.12051 0.5651 1 0.673
Year::2012:treat 0.01094 0.09740 0.1123 1 0.929
Year::2013:treat 0.03407 0.13145 0.2592 1 0.839
Year::2014:treat 0.01361 0.14265 0.0954 1 0.939

So using this call: twfe <- fixest::feols(logwealth ~ top1_logwealth + i(Year, treat, ref = 2006) | Region + Year, data = AJZ)

instead of: twfe_results <- fixest::feols(logwealth ~ i(Year, treat, ref = 2006) | Region + Year, cluster = "Region", data = AJZ)

gives me a result without any NAs as in the first post in this thread. Strange. Do you have any idea how this can be?

jepusto commented 1 year ago

Not sure. Again, it would be helpful to have a working example that I can run and that demonstrates the issue. Is there a way to produce the error using the base_did data?

GustavFlorinAagren commented 1 year ago

Using the base_did data and the call I used when I got the NAs (in the first post) does not produce the same error. I guess it must have to do with the data I'm using in some way then.

GustavFlorinAagren commented 1 year ago

Here is part of the data and part of the code I use. This will produce the same error when trying to estimate the standard errors with CR2 and Satterthwaite correction (the second coef_test). AJZ_example.csv

rm(list=ls()) library(tidyverse) library(tidyr) library(dplyr) library(did) library(haven) library(fixest) library(Rglpk) library(clubSandwich)

Import data

AJZ <- read.csv("/Users/gustavaagren/Documents/NEK/Project/Data/AJZ_example.csv", sep = ";", col.names = c('Region','Year', 'onshore_wealth', 'offshore_share'))

As.numeric

AJZ$onshore_wealth <- as.numeric(AJZ$onshore_wealth)

Create new variable of offshore_wealth

AJZ <- AJZ %>% mutate(offshore_wealth = (onshore_wealth/(1-offshore_share))*offshore_share )

Create new variable of log(offshore_wealth)

AJZ <- AJZ %>% mutate(logwealth = log(offshore_wealth) )

New variable for assigning region 2 as treatment-group

AJZ$treat <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

non-weighted model

Run the TWFE spec

twfe_unweighted <- fixest::feols(logwealth ~ i(Year, treat, ref = 2006) | Region + Year, cluster = "Region", data = AJZ) summary(twfe_unweighted)

coef_test with naive approach

coef_test(twfe_unweighted, vcov = "CR1", cluster = AJZ$Region, test = "naive-t")[1:14,]

coef_test conservative adjusted approach

coef_test(twfe_unweighted, vcov = "CR2", cluster = AJZ$Region, test = "Satterthwaite")[1:14,]

This is the output from the second coef_test: Coef. Estimate SE t-stat d.f. (Satt) p-val (Satt) Sig. Year::2000:treat 0.00466 0 Inf NaN NA NA Year::2001:treat -0.02283 0 -Inf NaN NA NA Year::2002:treat 0.00995 0 Inf NaN NA NA Year::2003:treat 0.05794 0 Inf NaN NA NA Year::2004:treat 0.03210 0 Inf NaN NA NA Year::2005:treat 0.01422 0 Inf NaN NA NA Year::2007:treat 0.02797 0 Inf NaN NA NA Year::2008:treat 0.07875 0 Inf NaN NA NA Year::2009:treat 0.02238 0 Inf NaN NA NA Year::2010:treat 0.12855 0 Inf NaN NA NA Year::2011:treat 0.16441 0 Inf NaN NA NA Year::2012:treat 0.08878 0 Inf NaN NA NA Year::2013:treat 0.13912 0 Inf NaN NA NA Year::2014:treat 0.12761 0 Inf NaN NA NA

GustavFlorinAagren commented 1 year ago

Hi @jepusto !

Can I ask if you have taken a look at the data and the output I posted?

jepusto commented 1 year ago

Not yet--I'll get to it tomorrow afternoon.

GustavFlorinAagren commented 1 year ago

Thanks for your fast reply. Fantastic that you want to look into it. Thanks!