sergiocorreia / ivreghdfe

Run IV/2SLS with many levels of fixed effects (i.e. ivreg2+reghdfe)
MIT License

[BUG] Incorrect fixed effects with `cluster()` #51

Open mcaceresb opened 7 months ago

mcaceresb commented 7 months ago
clear
sysuse auto, clear
ivreghdfe price (mpg = turn), absorb(a1=rep78) cluster(rep78)
ivreghdfe price (mpg = turn), absorb(a2=rep78) cluster(rep78)
gen diff = reldif(a1, a2)
which ivreghdfe
sum diff, d

gives

/home/mauricio/ado/plus/i/ivreghdfe.ado
*! ivreghdfe 1.1.3  04Jan2023 (bugfix for github issue #48)
*! ivreghdfe 1.1.2  29Sep2022 (bugfix for github issue #44)
*! ivreghdfe 1.1.1  14Dec2021 (experimental -margins- support)
*! ivreghdfe 1.1.0  25Feb2021
*! ivreg2 4.1.11  22Nov2019
*! authors cfb & mes
*! see end of file for version comments

                            diff
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%     .0145922              0       Obs                  74
25%     .0145922              0       Sum of Wgt.          74

50%     .0188667                      Mean           .0251712
                        Largest       Std. Dev.      .0174165
75%     .0292981       .0621538
90%     .0621538       .0621538       Variance       .0003033
95%     .0621538       .0621538       Skewness       1.148167
99%     .0621538       .0621538       Kurtosis       3.469954
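
For reference when reading the summary above: Stata documents `reldif(x, y)` as the scaled difference below, so two identical sets of saved FEs would give `diff = 0` in every observation rather than the nonzero percentiles shown.

$$
\operatorname{reldif}(x, y) = \frac{|x - y|}{|y| + 1}
$$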

mcaceresb commented 7 months ago

FYI, the result is different on each run, and I'm not sure what's causing the sort order of the saved FEs to change. This also changes xbd and resid, if they are used.
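
One way to check whether the saved FEs are attached to the wrong observations is to test that each one is constant within every level of `rep78`, which a correctly aligned fixed effect has to be. This is only a diagnostic sketch, meant to be run after the two `ivreghdfe` calls above; `a1_varies` and `a2_varies` are hypothetical names that do not come from the thread.

```stata
* Diagnostic sketch: run after the two ivreghdfe calls, so a1 and a2 exist.
* bysort re-sorts the data, so do this only once estimation is finished.
* a1_varies / a2_varies are hypothetical variable names for illustration.

* Within each rep78 level, sort by the saved FE and compare first vs. last
* value; any difference means the FE is not constant within that level.
bysort rep78 (a1): gen byte a1_varies = a1[1] != a1[_N]
bysort rep78 (a2): gen byte a2_varies = a2[1] != a2[_N]

* All zeros would mean both saved FEs are constant within each rep78 level.
tab a1_varies a2_varies if !missing(rep78)
```

A 1 in either flag would point to misaligned values rather than a pure level shift, such as an omitted constant.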

mcaceresb commented 7 months ago

@sergiocorreia Actually, the saved FEs are just wrong, not only different across runs (sorry for the multiple messages):

clear
sysuse auto, clear
ivreghdfe price (mpg = turn), absorb(a1=rep78) cluster(rep78)
ivreghdfe price (mpg = turn), absorb(a2=rep78) cluster(rep78)
reghdfe mpg turn, absorb(rep78) resid
predict mpghat, xbd
reghdfe price mpghat, absorb(a3=rep78) cluster(rep78)
gen diff1 = reldif(a1, a3)
gen diff2 = reldif(a2, a3)
sum diff?
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       diff1 |         74    93.00652    119.3723          0   300.4831
       diff2 |         74    89.67282    112.5629          0   285.0808

sergiocorreia commented 5 months ago

For some reason, the problem goes away when the data are pre-sorted by the absorbed variable before the run:

sysuse auto, clear
sort rep78

ivreghdfe price turn, absorb(a1=rep78) cluster(rep78)
ivreghdfe price turn, absorb(a2=rep78) cluster(rep78)
reghdfe price turn, absorb(a3=rep78)  cluster(rep78)
replace a3 = a3 + _b[_cons]

gen diff1 = reldif(a1, a3)
gen diff2 = reldif(a2, a3)
sum diff?

But I need to dig a bit more into why the sort order gets broken.
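
The example above uses a non-IV specification. A minimal sketch of the same pre-sort applied to the IV specification from the original report follows; whether the workaround also removes the discrepancy there is not confirmed in this thread, so treat it as something to try rather than a verified fix.

```stata
* Sketch: pre-sort workaround applied to the original IV specification.
* Assumption: sorting by the absorbed variable helps here as it does above.
sysuse auto, clear
sort rep78

ivreghdfe price (mpg = turn), absorb(a1=rep78) cluster(rep78)
ivreghdfe price (mpg = turn), absorb(a2=rep78) cluster(rep78)

* If the saved FEs are stable across runs, diff should be 0 throughout.
gen diff = reldif(a1, a2)
sum diff, d
```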