sergiocorreia / ivreghdfe

Run IV/2SLS with many levels of fixed effects (i.e. ivreg2+reghdfe)
MIT License

standard errors different for ivreg2 and ivreghdfe #21

Closed krhw closed 5 years ago

krhw commented 5 years ago

Hello,

Would you be able to explain the source of the difference between the standard errors in ivreghdfe and ivreg2? Thanks.

Running the same regression with ivreghdfe and ivreg2 yields standard errors that are larger with ivreghdfe:

ivreghdfe outcome (tr = iv), absorb(i.year i.country_num) cluster(country_num)
(MWFE estimator converged in 2 iterations)

IV (2SLS) estimation
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on country_num

Number of clusters (country_num) = 52 Number of obs = 1300
F( 1, 51) = 5.39
Prob > F = 0.0243
Total (centered) SS = 43.31870105 Centered R2 = -0.0120
Total (uncentered) SS = 43.31870105 Uncentered R2 = -0.0120
Residual SS = 43.83852081 Root MSE = .1855

             |               Robust
     outcome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          tr |   .2087836   .0899259     2.32   0.024     .0282499    .3893173
Underidentification test (Kleibergen-Paap rk LM statistic): 16.181
Chi-sq(1) P-val = 0.0001
Weak identification test (Cragg-Donald Wald F statistic): 1449.754
(Kleibergen-Paap rk Wald F statistic): 108.133
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
Instrumented: tr
Excluded instruments: iv
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
        year |        25           0          25     |
 country_num |        52          52           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

ivreg2 outcome (tr = iv) i.year i.country_num, partial(i.year i.country_num) cluster(country_num)

IV (2SLS) estimation
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on country_num

Number of clusters (country_num) = 52 Number of obs = 1300
F( 1, 51) = 5.17
Prob > F = 0.0272
Total (centered) SS = 43.31870105 Centered R2 = -0.0120
Total (uncentered) SS = 43.31870105 Uncentered R2 = -0.0120
Residual SS = 43.83852081 Root MSE = .1836

             |               Robust
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          tr |   .2087836   .0881959     2.37   0.018     .0359229    .3816443
Underidentification test (Kleibergen-Paap rk LM statistic): 16.181
Chi-sq(1) P-val = 0.0001
Weak identification test (Cragg-Donald Wald F statistic): 1391.718
(Kleibergen-Paap rk Wald F statistic): 103.804
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
Instrumented: tr
Excluded instruments: iv
Partialled-out: 1991.year 1992.year 1993.year 1994.year 1995.year
1996.year 1997.year 1998.year 1999.year 2000.year
2001.year 2002.year 2003.year 2004.year 2005.year
2006.year 2007.year 2008.year 2009.year 2010.year
2011.year 2012.year 2013.year 2014.year 2.country_num
5.country_num 7.country_num 8.country_num 10.country_num
11.country_num 13.country_num 14.country_num
16.country_num 17.country_num 18.country_num
19.country_num 20.country_num 21.country_num
22.country_num 23.country_num 24.country_num
25.country_num 26.country_num 27.country_num
29.country_num 30.country_num 31.country_num
32.country_num 34.country_num 35.country_num
37.country_num 38.country_num 39.country_num
41.country_num 42.country_num 43.country_num
45.country_num 46.country_num 47.country_num
48.country_num 50.country_num 51.country_num
52.country_num 53.country_num 54.country_num
55.country_num 57.country_num 58.country_num
59.country_num 60.country_num 61.country_num
62.country_num 63.country_num 64.country_num
65.country_num _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
sergiocorreia commented 5 years ago

The first thing that comes to mind is that you are not using the small option of ivreg2, which does matter when you have a lot of fixed effects. For instance:

sysuse auto
ivreg2 price (weight=gear) i.turn , partial(i.turn)
ivreg2 price (weight=gear) i.turn , partial(i.turn) small
ivreghdfe price (weight=gear), a(turn)

Notice how ivreg2,small matches ivreghdfe, but ivreg2 doesn't.

Beyond that, smaller differences could be due to how the degrees of freedom are computed, as ivreghdfe does a few more collinearity checks between the FEs than other packages.
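As a rough check on the numbers in the first post, the gap between the two reported standard errors can be compared with the usual finite-sample factor for cluster-robust variances, sqrt((N-1)/(N-K) * M/(M-1)). This is only a sketch: it assumes that factor is the one applied, and it assumes K = 26 (the 25 year coefficients listed under "Absorbed degrees of freedom" plus the single regressor tr, with the country FEs not counted because they are nested within the clusters). All other figures are copied from the output above.

```stata
* Figures copied from the ivreghdfe and ivreg2 output in the first post
scalar N = 1300
scalar M = 52
scalar se_ivreghdfe = .0899259
scalar se_ivreg2    = .0881959

* Assumption: K = 25 absorbed year coefficients + 1 regressor (tr)
scalar K = 26

* Usual finite-sample factor for cluster-robust standard errors
scalar adj = sqrt((N - 1)/(N - K) * M/(M - 1))

display "ratio of reported SEs: " %9.6f se_ivreghdfe/se_ivreg2
display "finite-sample factor:  " %9.6f adj

* The two numbers should agree up to rounding in the reported SEs
assert abs(se_ivreghdfe/se_ivreg2 - adj) < 1e-4
```

If the two displayed numbers agree, the difference in this example is consistent with a pure degrees-of-freedom adjustment rather than a difference in the estimator itself.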

krhw commented 5 years ago

Thanks - the standard errors do match in my previous example after adding the small option. I imagine that this difference would vanish asymptotically, when the number of observations is much larger than the number of FE?

sergiocorreia commented 5 years ago

Yes. But the problem is that with multi-way fixed effects, the ratio K/N (number of parameters over number of observations) usually doesn't go to zero asymptotically, so the small option should be applied.
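To illustrate why K/N need not vanish, here is a small sketch that assumes a balanced country-by-year panel with T = 25 years (as in the example above) and counts only the fixed-effect coefficients, K = countries + T - 1. The numbers of countries other than 52 are hypothetical.

```stata
* Assumption: balanced panel, T = 25 years, K counts FE coefficients only
local T = 25
foreach n of numlist 52 520 5200 {
    local N = `n' * `T'
    local K = `n' + `T' - 1
    display "countries = `n'   N = `N'   K = `K'   K/N = " %6.4f `K'/`N'
}
```

Under these assumptions, adding countries increases N and K together, so K/N approaches 1/T (0.04 here) instead of zero.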

krhw commented 5 years ago

Great. Thanks very much for the details.

nelfunkel commented 5 years ago

Hi Sergio,

I tried to replicate this, but this time with clustered standard errors. For example:

sysuse auto
ivreg2 price (weight=gear) i.turn , cluster(turn) partial(i.turn)
ivreg2 price (weight=gear) i.turn , cluster(turn) partial(i.turn) small
ivreghdfe price (weight=gear), cluster(turn) a(turn)

Do you know why the standard errors differ?