Open tatyanaderyugina opened 6 years ago
Hi Tatyana,
F-statistics also do not get reported, which is clearly problematic for IV
I'm not sure I follow. On the first stage of an IV people often report F-Stats, but of the instruments (not the entire set of covariates), and these should still be computable unless you have many instruments or almost no cluster units.
I get a warning because the number of clusters is smaller than the number of fixed effects
I just tried a toy example with Stata's default dataset, and it seems that the issue is not that the number of clusters is smaller than the number of FEs, but that it's smaller than the number of included regressors:
sysuse auto
* 14 FEs, one regressor, cluster of 2 = runs fine
ivreghdfe price (gear=length), a(turn) cluster(foreign)
* 14 FEs, two regressors, cluster of 2 = warning
ivreghdfe price weight (gear=length), a(turn) cluster(foreign)
It does say that the use of the "partial" option may help fix this problem
Indeed. But as you pointed out, you can only partial out included regressors, not FEs. For instance, in the previous example, we can partial out the -weight- variable to remove the warning:
* If we partial out -weight- we don't get a warning anymore
ivreghdfe price weight (gear=length), a(turn) cluster(foreign) partial(weight)
The older version of reghdfe on ssc used to report F-stats for IV regressions in these cases, so perhaps it wouldn't be difficult to bring them back?
That's true. If you run reghdfe with the -old- option, the program would actually run the SSC version:
reghdfe price weight (gear=length), a(turn) cluster(foreign) old
However, in the example above I still get a warning with the SSC version. Finally, I remember the new version improved some corner cases regarding when the F stat was shown (or not shown), so I would stick with the new version and just use partial().
Best, Sergio
Thank you for the reply, Sergio! But literally all the non-instrument controls are absorbed, so there's nothing left to partial out. And the number of instruments, while large, is still smaller than the number of clusters.
Perhaps more of a question than an issue. I'm running regressions of the form
ivreghdfe Y (X = Z) , absorb(i.F) cluster(C)
The number of fixed effects is very large (thus the use of reghdfe) and I get a warning because the number of clusters is smaller than the number of fixed effects. F-statistics also do not get reported, which is clearly problematic for IV. It does say that the use of the "partial" option may help fix this problem. However, the way partial() works in ivreg2 is that the user has to generate the variables to be partialed out, i.e., I cannot write "partial(i.F)" (I tried, ivreghdfe breaks down). But if I generate the indicators (let's call the set _F*), and run
ivreghdfe Y (X = Z) , absorb(i.F) partial(_F*) cluster(C)
it doesn't seem like there's any way reghdfe will recognize that i.F and _F* are the same variable. Thus, it seems like if I want F-statistics in this case, I just need to use ivreg2 and suffer through the slowness?
The older version of reghdfe on ssc used to report F-stats for IV regressions in these cases, so perhaps it wouldn't be difficult to bring them back?
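For the case described above, where every non-instrument control is absorbed, one possible middle ground between ivreghdfe and a full ivreg2 run with generated indicators is to residualize the variables on the fixed effects first and then run ivreg2 on the residuals. The following is only a sketch under assumptions, not something confirmed in this thread: Y, X, Z, F and C are the placeholder names from the post, the res_ prefix is made up for illustration, and all variables are assumed to be non-missing on the same sample. It is not guaranteed to produce the F-statistics in every corner case.

```stata
* Sketch only: Y, X, Z, F, C are placeholders from the post; res_* names are hypothetical.
local instruments Z            // list every instrument name here, without wildcards

* Step 1: residualize the outcome, the endogenous regressor, and each instrument
* on the fixed effects. reghdfe does the absorbing, so no indicators are generated.
foreach v of varlist Y X `instruments' {
    quietly reghdfe `v', absorb(F) residuals(res_`v')
}

* Build the list of residualized instruments.
local res_instruments
foreach z of local instruments {
    local res_instruments `res_instruments' res_`z'
}

* Step 2: IV on the residualized variables, asking ivreg2 for first-stage statistics.
ivreg2 res_Y (res_X = `res_instruments'), cluster(C) noconstant ffirst
```

Because ivreg2 does not know that fixed effects were absorbed in step 1, its degrees-of-freedom corrections can differ from what ivreghdfe would apply, so the standard errors and F-statistics from this sketch are best treated as a cross-check. ivreg2's dofminus() and sdofminus() options exist for this kind of adjustment, but whether and how to set them here is a judgment call.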