jtilly closed this issue 3 years ago
Thanks Jan! With your implementation, it shouldn't be hard to add to the package. I should be able to code up the clustered SE as well.
When I had to implement clustered SEs, I found it convenient to transform the sandwich form A' B A, where B(i, j) is one if i and j belong to the same cluster and zero otherwise, into (C' A)' (C' A), where C is the one-hot-encoded representation of the clusters. There may be better implementations out there though. :)
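A minimal numpy sketch of that transform (all names here are illustrative, not from the package): since B = C C' when C is the one-hot cluster matrix, the sandwich A' B A equals (C' A)' (C' A), which avoids ever building the n-by-n matrix B.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 6, 2, 3
A = rng.normal(size=(n, p))           # an arbitrary n x p "bread" matrix
clusters = np.array([0, 0, 1, 1, 2, 2])

# One-hot cluster matrix C: C[i, g] = 1 if observation i is in cluster g.
C = np.eye(k)[clusters]

# B[i, j] = 1 if i and j share a cluster, i.e. B = C @ C.T (n x n, avoidable).
B = C @ C.T

direct = A.T @ B @ A                  # sandwich form A' B A
via_onehot = (C.T @ A).T @ (C.T @ A)  # equivalent (C' A)' (C' A)

assert np.allclose(direct, via_onehot)
```

The second form only ever materializes the k-by-p matrix C' A, which is much cheaper when the number of clusters k is small relative to n.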
@lbittarello @jtilly, just an FYI: when implementing this for qc.glm, I found a typo in the non-robust finite-sample correction above (the typo is not in the robust version).
The line `vcov /= sum_weights - estimator.n_parameters` should be `vcov *= sum_weights / (sum_weights - estimator.n_parameters)`.
It's fixed in PR #383, but if you are still using the above snippet it might be good to fix.
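To make the difference concrete, here is a tiny numerical check with made-up values (the numbers are illustrative only): the old line and the corrected line differ by a factor of `sum_weights`, which is why the non-robust SEs came out wrong.

```python
# Hypothetical values, chosen only to illustrate the two corrections.
sum_weights = 100.0
n_parameters = 5
vcov = 2.0

wrong = vcov / (sum_weights - n_parameters)                 # old: vcov /= ...
right = vcov * sum_weights / (sum_weights - n_parameters)   # corrected line

# The two results differ by exactly a factor of sum_weights.
assert abs(right / wrong - sum_weights) < 1e-9
```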
I guess that's why they say: Always use robust standard errors 😁
Thanks for the PR and thanks for letting us know!
The variance estimator takes the form (1/n) · inv(H) · [(G' G) / n] · inv(H'), where H is the expected Hessian, G is the score matrix, and (G' G) / n is the OP or BHHH estimator. The function `score_matrix` returns Ĝ ≡ G / n instead of G, though, so we compute Ĝ' Ĝ = (G' G) / n², which absorbs the leading 1/n. That's why we adjust the robust estimator by n / (n − 1). In the case of the classic estimator, however, we use inv(H), so we're still missing the leading 1/n, which is why we adjust it by 1 / (n − 1).
Anyway, that's my reading of Cameron and Trivedi (2005, §5.5.2).
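A quick numerical sanity check of the scaling argument above (the Hessian here is just a stand-in positive-definite matrix, not anything from the package): with Ĝ = G / n, the product inv(H) Ĝ' Ĝ inv(H') already equals the target (1/n) · inv(H) · [(G' G) / n] · inv(H'), i.e. the leading 1/n is absorbed.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
G = rng.normal(size=(n, p))           # score matrix
H = G.T @ G / n + np.eye(p)           # stand-in positive-definite "Hessian"

G_hat = G / n                         # what score_matrix is said to return
Hinv = np.linalg.inv(H)

# Target: (1/n) * inv(H) @ [(G'G)/n] @ inv(H')
target = (1 / n) * Hinv @ (G.T @ G / n) @ Hinv.T

# Using Ĝ'Ĝ = (G'G)/n² absorbs the leading 1/n:
via_hat = Hinv @ (G_hat.T @ G_hat) @ Hinv.T

assert np.allclose(target, via_hat)
```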
We tested both implementations against statsmodels. Perhaps we made a mistake and set the tolerance so high as to miss it?
Finally got some time to play around with this and arrive at a satisfying answer. Basically, Luca's code is correct. The problem happened when I tried to integrate the code into our current codebase, which normalizes the sample weights to simplify some computations (the normalized sample_weights always sum to 1). I needed to multiply by sum_weights to readjust for this change. I checked, and both our implementations give the same thing (and they are equal or extremely close to statsmodels).
Another nice-to-have feature would be to add standard errors. It would be great to have plain vanilla and robust standard errors and maybe even allow for clustering. We already have an implementation that includes plain vanilla and robust standard errors:
Courtesy of @lbittarello
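For reference, here is a hedged sketch of what plain-vanilla vs. robust SEs could look like under the conventions discussed above (classic: inv(H) scaled by 1/(n − 1); robust: the sandwich with Ĝ = G / n scaled by n/(n − 1)). The function names and signatures are hypothetical, not the package's actual API.

```python
import numpy as np

def vcov_sketch(G, H, robust=False):
    """Illustrative covariance estimate from score matrix G (n x p)
    and expected Hessian H (p x p). Not the package's real interface."""
    n = G.shape[0]
    Hinv = np.linalg.inv(H)
    if robust:
        G_hat = G / n                      # score_matrix-style scaling
        vcov = Hinv @ (G_hat.T @ G_hat) @ Hinv.T
        vcov *= n / (n - 1)                # finite-sample adjustment
    else:
        vcov = Hinv / (n - 1)              # inv(H) still needs the leading 1/n
    return vcov

def standard_errors(G, H, robust=False):
    """Square roots of the diagonal of the chosen covariance estimate."""
    return np.sqrt(np.diag(vcov_sketch(G, H, robust=robust)))
```

Usage would then look like `standard_errors(G, H, robust=True)` for the sandwich SEs, with clustering handled by aggregating the score rows within clusters first, as in the one-hot trick above.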