Closed xiaobaaaa closed 5 months ago
It turns out that computing the right degrees of freedom with high-dimensional fixed effects is a hard problem (see the reghdfe documentation on this). PRs that bring this package up to par with reghdfe here are welcome!
That being said, for low-dimensional fixed effects, as in your example, you can obtain the correct standard errors by using categorical variables in the formula (instead of using fe).
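To illustrate the workaround above, here is a minimal sketch on synthetic data. The column names (`id_cat`, `year_cat`), the helper `pick`, and the data itself are made up for illustration. It assumes that string columns are treated as categorical by the formula machinery, and that `coef`, `coefnames`, `stderror`, and `dof_residual` are available after `using FixedEffectModels`.

```julia
using DataFrames, FixedEffectModels, Random

# Synthetic panel (hypothetical data, not the nlswork dataset)
Random.seed!(1)
n = 200
df = DataFrame(id = rand(1:5, n), year = rand(1:4, n), x = randn(n))
df.y = 0.5 .* df.x .+ df.id .+ 0.1 .* df.year .+ randn(n)

# String copies of the group variables, so that they enter the formula
# as categorical (dummy-coded) regressors rather than as numbers
df.id_cat = string.(df.id)
df.year_cat = string.(df.year)

# Absorbed fixed effects versus explicit categorical regressors
m_fe  = reg(df, @formula(y ~ x + fe(id) + fe(year)))
m_cat = reg(df, @formula(y ~ x + id_cat + year_cat))

# Helper (hypothetical): pick the entry of `v` that belongs to coefficient "x"
pick(m, v) = v[findfirst(==("x"), coefnames(m))]

b_fe,  b_cat  = pick(m_fe, coef(m_fe)),     pick(m_cat, coef(m_cat))
se_fe, se_cat = pick(m_fe, stderror(m_fe)), pick(m_cat, stderror(m_cat))

println("coef on x:      fe = ", b_fe,  "  categorical = ", b_cat)
println("std. error:     fe = ", se_fe, "  categorical = ", se_cat)
println("residual dof:   fe = ", dof_residual(m_fe), "  categorical = ", dof_residual(m_cat))
```

Comparing the printed standard errors and residual degrees of freedom of the two models is one way to check whether the fe version counts the absorbed parameters the same way as the explicit dummies.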
I can confirm that I observed the same behavior when running a regression with country*year fixed effects, as described in the opening post.
Moreover, it seems that when running a regression programmatically, the notation fe(d1)*fe(d2) does not work and fails with the following error, whereas fe(d1)&fe(d2) does work:
ERROR: MethodError: no method matching *(::FixedEffectModels.FixedEffectTerm, ::FixedEffectModels.FixedEffectTerm)
Closest candidates are:
*(::Any, ::Any, ::Any, ::Any...) at C:\Users\kantorov\.julia\juliaup\julia-1.7.3+0~x64\share\julia\base\operators.jl:655
*(::Union{MathOptInterface.ScalarAffineFunction{T}, MathOptInterface.ScalarQuadraticFunction{T}, MathOptInterface.VectorAffineFunction{T}, MathOptInterface.VectorQuadraticFunction{T}}, ::T) where T at C:\Users\kantorov\.julia\packages\MathOptInterface\kCmJV\src\Utilities\functions.jl:3270
*(::SpecialFunctions.SimplePoly, ::Any) at C:\Users\kantorov\.julia\packages\SpecialFunctions\jqvAz\src\expint.jl:8
...
Stacktrace:
[1] top-level scope
@ l:\Localbitcoin\Code_Ilja\main.jl:124
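The error above says that no `*` method exists for two `FixedEffectTerm` objects. A minimal sketch of building the interaction programmatically with `&` instead might look as follows. The data and column names (`d1`, `d2`, `x`, `y`) are hypothetical, and the sketch assumes `term` and `fe` accept symbols as in the package README.

```julia
using DataFrames, FixedEffectModels, Random

# Synthetic data (hypothetical): two grouping variables and one regressor
Random.seed!(2)
n = 200
df = DataFrame(d1 = rand(1:4, n), d2 = rand(1:5, n), x = randn(n))
df.y = 0.3 .* df.x .+ df.d1 .* df.d2 ./ 10 .+ randn(n)

# Programmatic formula: `&` builds the interaction of the two fixed effects.
# `&` binds tighter than `+`, so no extra parentheses are needed.
f = term(:y) ~ term(:x) + fe(:d1) & fe(:d2)
m = reg(df, f)

println(coefnames(m))
println(coef(m))
```

If the main effects are also wanted, they can be spelled out explicitly as `fe(:d1) + fe(:d2) + fe(:d1) & fe(:d2)` rather than relying on `*`.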
@IljaK91 Please open a separate issue with a fully reproducible example.
The degrees of freedom and Std.Error are not calculated correctly when there are collinear fixed effects.
Thanks for the great package! I have been using this package for quite some time now and its high performance has solved many of my problems. I found some bugs during use.
I use the nlswork dataset for the tests. For Stata:
For Julia:
The standard errors of reg2 and reg3 are the same, but the standard errors of reg1 differ slightly. For age, for example, the standard errors in reg1, reg2, and reg3 are 3.93088, 3.92897, and 3.92897, respectively.
In Stata, however, all results are the same in reghdfe and reg, and the year and occ_code fixed effects are omitted due to collinearity. The discrepancy in Julia arises because reg2 and reg3 have different degrees of freedom than reg1 (171 for reg1, and 198 for reg2 and reg3). Maybe the degrees of freedom are not calculated correctly when there are collinear fixed effects.
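As a rough illustration of why a different parameter count alone can shift the standard errors slightly, the sketch below uses the classical OLS scaling, in which the standard error is proportional to 1/sqrt(n - k). The sample size `n_hyp` is a made-up value, and reading 171 and 198 as the number of parameters counted by each model is an assumption, not something the numbers above confirm.

```julia
# Ratio of two classical OLS standard errors that differ only in the
# number of parameters counted (k_a versus k_b), holding everything else fixed:
# SE_b / SE_a = sqrt((n - k_a) / (n - k_b))
se_ratio(n, k_a, k_b) = sqrt((n - k_a) / (n - k_b))

n_hyp = 28_000                      # hypothetical sample size (assumption)
implied = se_ratio(n_hyp, 171, 198) # effect of counting 27 extra parameters

# Ratio of the standard errors for age reported above
observed = 3.93088 / 3.92897

println("implied relative change:  ", implied - 1)
println("observed relative change: ", observed - 1)
```

With a sample of a few tens of thousands of observations, a difference of 27 counted parameters changes the standard errors only in the fourth significant digit, which is the size of the gap reported above.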
BTW, FixedEffectModels.jl also dropped the year and occ_code fixed effects, and the coefficients are all the same in reg1, reg2, and reg3.
I have also tested with other datasets and found similar problems. What worries me is that it is still unclear to me whether the degrees of freedom and standard errors are calculated incorrectly when several fixed effects are collinear, but not completely collinear.