Open josef-pkt opened 1 year ago
We need helper functions, at least for Newey/GMM and Murphy and Topel.
There should be some overlap with statsmodels.stats._diagnostic_other, e.g. conditional_moment_test_generic.
Actually, I'm not sure we really need to use it. The advantage is mainly computational, when we have a large number of moment conditions. We can just use the appropriate submatrix/block of the joint cov_params instead of using the partitioned matrix inverse (one large matrix inverse instead of many computations with smaller matrices).
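A minimal numpy sketch of that equivalence, assuming exactly identified stacked moment conditions with a block lower-triangular Jacobian (the first-step moments do not depend on the second-step parameters). The function names, the random inputs, and the use of the uncentered outer product for the moment covariance are illustrative assumptions, not existing statsmodels API.

```python
import numpy as np


def cov_joint(moms, jac):
    """Joint sandwich cov_params, G^{-1} S G^{-T} / n (hypothetical helper).

    moms : (n, k) per-observation moment conditions at the estimate
    jac : (k, k) average Jacobian of the stacked moment conditions
    """
    n = moms.shape[0]
    s = moms.T @ moms / n  # uncentered covariance of the moment conditions
    g_inv = np.linalg.inv(jac)
    return g_inv @ s @ g_inv.T / n


def cov_second_step_blocks(moms, jac, k1):
    """Second-step cov_params from partitioned blocks only (hypothetical helper).

    Assumes jac[:k1, k1:] == 0, i.e. the two-step (sequential) structure.
    """
    n = moms.shape[0]
    s = moms.T @ moms / n
    s11, s12 = s[:k1, :k1], s[:k1, k1:]
    s21, s22 = s[k1:, :k1], s[k1:, k1:]
    g11, g21, g22 = jac[:k1, :k1], jac[k1:, :k1], jac[k1:, k1:]
    a = g21 @ np.linalg.inv(g11)  # effect of first-step estimation error
    middle = s22 - a @ s12 - s21 @ a.T + a @ s11 @ a.T
    g22_inv = np.linalg.inv(g22)
    return g22_inv @ middle @ g22_inv.T / n


# Made-up inputs, only to compare the two computations numerically.
rng = np.random.default_rng(0)
n_obs, k1, k2 = 200, 2, 3
moms = rng.standard_normal((n_obs, k1 + k2))
jac = rng.standard_normal((k1 + k2, k1 + k2)) + 5 * np.eye(k1 + k2)
jac[:k1, k1:] = 0  # first-step moments do not depend on second-step params

cov_full = cov_joint(moms, jac)
cov_block = cov_second_step_blocks(moms, jac, k1)
print(np.max(np.abs(cov_full[k1:, k1:] - cov_block)))
```

The joint version needs one inverse of the full Jacobian, while the block version only inverts the two diagonal blocks. Both give the same second-step covariance up to floating point error.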
Mainly parking an issue and references. I was just skimming parts.
In treatment effect we use GMM for cov_params which is a robust cov_type. If we want nonrobust cov_type, then we need to skip some of the robust computations.
Based on my skimming, there are different versions of the two-stage cov_params, either nonrobust (assuming correct specification) or robust (sandwiches that allow for some misspecification).
I don't know what (our) heckman uses.
Hole, Arne Risa. “Calculating Murphy–Topel Variance Estimates in Stata: A Simplified Procedure.” The Stata Journal 6, no. 4 (November 1, 2006): 521–29. https://doi.org/10.1177/1536867X0600600405.
Palmer, Tom M, Michael V Holmes, Brendan J Keating, and Nuala A Sheehan. “Correcting the Standard Errors of 2-Stage Residual Inclusion Estimators for Mendelian Randomization Studies.” American Journal of Epidemiology 186, no. 9 (November 1, 2017): 1104–14. https://doi.org/10.1093/aje/kwx175.
Terza, Joseph V. “Simpler Standard Errors for Two-Stage Optimization Estimators.” The Stata Journal 16, no. 2 (June 1, 2016): 368–85. https://doi.org/10.1177/1536867X1601600206.
Newey, Whitney K. “A Method of Moments Interpretation of Sequential Estimators.” Economics Letters 14, no. 2 (January 1, 1984): 201–6. https://doi.org/10.1016/0165-1765(84)90083-1. (Note: formula for method of moments, i.e. exactly identified GMM, using sandwiches for all parts.)
(not clear to me yet what we need)
(*) update: Murphy and Topel, section 5.1 on two-step MLE: equ. (29) assumes and specifies the information matrix equality, and R_i is the name for both sides. In the following, they use R_i and so do not specify whether OPG or hessian is used in their final formula, equ. (34). (AFAICS, in equ. (33) R still refers to the negative hessian, i.e. second derivatives, and omega in equ. (30) and (31) is cov(score), i.e. using R for OPG.)
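To make the OPG versus hessian distinction concrete, here is a small sketch for a single-step Poisson MLE on simulated data. It is not the Murphy and Topel formula. It only illustrates that, under correct specification, the average outer product of the scores and the average negative hessian estimate the same matrix, so hessian, OPG and sandwich cov_params are close. The model, sample size and variable names are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(12345)
n = 50000
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(x @ beta_true))

# Newton iterations for the Poisson MLE, started at an intercept-only fit.
beta = np.array([np.log(y.mean()), 0.0])
for _ in range(50):
    mu = np.exp(x @ beta)
    score = x.T @ (y - mu)            # gradient of the loglikelihood
    hess = -(x * mu[:, None]).T @ x   # second derivatives of the loglikelihood
    step = np.linalg.solve(hess, score)
    beta = beta - step
    if np.max(np.abs(step)) < 1e-10:
        break

mu = np.exp(x @ beta)
score_obs = x * (y - mu)[:, None]         # per-observation scores
opg = score_obs.T @ score_obs / n         # average outer product of gradient
neg_hess = (x * mu[:, None]).T @ x / n    # average negative hessian

# Three cov_params versions for the same estimate.
neg_hess_inv = np.linalg.inv(neg_hess)
cov_hessian = neg_hess_inv / n                       # nonrobust, hessian based
cov_opg = np.linalg.inv(opg) / n                     # nonrobust, OPG based
cov_sandwich = neg_hess_inv @ opg @ neg_hess_inv / n  # robust

print(np.sqrt(np.diag(cov_hessian)))
print(np.sqrt(np.diag(cov_opg)))
print(np.sqrt(np.diag(cov_sandwich)))
```

With a misspecified model (e.g. overdispersed counts) the OPG and the negative hessian would no longer agree, and only the sandwich version would remain valid. That is why it matters which of the two a formula written in terms of a single R_i actually refers to.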