ENH: cov_type for nonlinear two-stage models

Mainly parking an issue and references. I have only skimmed parts of the literature.

In the treatment effect models we use GMM for cov_params, which is a robust cov_type. If we want a nonrobust cov_type, then we need to skip some of the robust computations.

Based on my skimming there are different versions of the two-stage cov_params, either nonrobust (correct specification) or robust (sandwiches for some misspecification):

- Murphy-Topel seems to use OPG (*)
- another version uses the Hessian
- the Terza article simplifies the computation directly by exploiting the information matrix equality
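
To make the OPG / Hessian / sandwich distinction concrete, here is a minimal single-stage sketch, assuming a correctly specified Poisson MLE on simulated data and using only numpy. The variable names are illustrative, and this is not what statsmodels or any of the cited papers implement. Under correct specification the three covariance estimates should be close to each other, which is the information matrix equality at work.

```python
import numpy as np

# Simulated data from a correctly specified Poisson model (assumption for illustration)
rng = np.random.default_rng(0)
n = 5000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(x @ beta_true))

# Newton iterations for the Poisson MLE
beta = np.zeros(2)
for _ in range(50):
    mu = np.exp(x @ beta)
    score = x.T @ (y - mu)                # summed score
    hessian = -(x * mu[:, None]).T @ x    # summed second derivatives
    step = np.linalg.solve(hessian, score)
    beta = beta - step
    if np.max(np.abs(step)) < 1e-10:
        break

# Ingredients evaluated at the estimate
mu = np.exp(x @ beta)
score_obs = x * (y - mu)[:, None]         # per-observation scores, shape (n, 2)
hessian = -(x * mu[:, None]).T @ x
opg = score_obs.T @ score_obs             # outer product of gradients

hess_inv = np.linalg.inv(-hessian)
cov_hessian = hess_inv                    # nonrobust, Hessian based
cov_opg = np.linalg.inv(opg)              # nonrobust, OPG based
cov_sandwich = hess_inv @ opg @ hess_inv  # robust sandwich

for name, cov in [("hessian", cov_hessian), ("opg", cov_opg),
                  ("sandwich", cov_sandwich)]:
    print(name, np.sqrt(np.diag(cov)))
```

With misspecification (for example overdispersed counts) the two nonrobust versions would no longer agree with the sandwich, which is the case the robust cov_type is meant to cover.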

I don't know what (our) heckman uses.

Hole, Arne Risa. “Calculating Murphy–Topel Variance Estimates in Stata: A Simplified Procedure.” The Stata Journal 6, no. 4 (November 1, 2006): 521–29. https://doi.org/10.1177/1536867X0600600405.

Palmer, Tom M, Michael V Holmes, Brendan J Keating, and Nuala A Sheehan. “Correcting the Standard Errors of 2-Stage Residual Inclusion Estimators for Mendelian Randomization Studies.” American Journal of Epidemiology 186, no. 9 (November 1, 2017): 1104–14. https://doi.org/10.1093/aje/kwx175.

Terza, Joseph V. “Simpler Standard Errors for Two-Stage Optimization Estimators.” The Stata Journal 16, no. 2 (June 1, 2016): 368–85. https://doi.org/10.1177/1536867X1601600206.

Newey, Whitney K. “A Method of Moments Interpretation of Sequential Estimators.” Economics Letters 14, no. 2 (January 1, 1984): 201–6. https://doi.org/10.1016/0165-1765(84)90083-1. Gives the formula for method of moments, i.e. exactly identified GMM, using sandwiches for all parts.
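
A rough sketch of the stacked-moment computation, assuming the two stages are written as moment conditions g1(theta1) and g2(theta1, theta2), so that the joint covariance is G^{-1} S G^{-T} with a block lower-triangular G. The function `two_step_cov` and its arguments are hypothetical names for illustration, not statsmodels API. The `robust=False` branch is one possible way to impose the information matrix equality (Hessian-based, cross term kept as an outer product) and is an assumption, not a formula taken from the references.

```python
import numpy as np


def two_step_cov(g1, g2, G11, G21, G22, robust=True):
    """Covariance of the second-stage parameters from stacked moments.

    g1 : (n, k1) per-observation first-stage moments (e.g. scores)
    g2 : (n, k2) per-observation second-stage moments
    G11 : (k1, k1) summed derivative of g1 w.r.t. theta1
    G21 : (k2, k1) summed derivative of g2 w.r.t. theta1
    G22 : (k2, k2) summed derivative of g2 w.r.t. theta2
    """
    S11 = g1.T @ g1
    S21 = g2.T @ g1
    S22 = g2.T @ g2
    if not robust:
        # Assumption: both stages are correctly specified MLEs, so that
        # cov(score) = -Hessian within each stage.
        S11 = -G11
        S22 = -G22
    # theta2 block of G^{-1} S G^{-T} for block lower-triangular G
    B = G21 @ np.linalg.inv(G11)
    middle = S22 - B @ S21.T - S21 @ B.T + B @ S11 @ B.T
    G22_inv = np.linalg.inv(G22)
    return G22_inv @ middle @ G22_inv.T
```

If G21 is zero, the first stage does not affect the second-stage estimate and the result reduces to the usual single-stage sandwich (robust) or inverse negative Hessian (nonrobust).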

(not clear to me yet what we need)

(*) Update: Murphy and Topel, section 5.1 (two-step MLE), eq. (29) assumes and specifies the information matrix equality; R_i is the name for both forms. In what follows they use R_i and so do not specify whether OPG or Hessian is used in their final formula, eq. (34). (AFAICS, in eq. (33) R still refers to the negative Hessian, i.e. second derivatives, and omega in eqs. (30) and (31) is cov(score), i.e. using R for OPG.)
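
For reference, the information matrix equality and the two sample analogues in question can be written as below for a generic per-observation log-likelihood. The symbols $\ell_i$, $s_i$, $H_i$ and the hatted estimators are notation assumed here for illustration and may not match Murphy and Topel's.

```latex
\begin{aligned}
s_i(\theta) &= \frac{\partial \ell_i(\theta)}{\partial \theta}, \qquad
H_i(\theta) = \frac{\partial^2 \ell_i(\theta)}{\partial \theta \, \partial \theta'}, \\
R &= \mathrm{E}\left[ s_i s_i' \right] = -\mathrm{E}\left[ H_i \right]
\quad \text{(under correct specification)}, \\
\hat R_{\mathrm{OPG}} &= \frac{1}{n} \sum_{i=1}^{n} s_i(\hat\theta) \, s_i(\hat\theta)', \qquad
\hat R_{\mathrm{Hessian}} = -\frac{1}{n} \sum_{i=1}^{n} H_i(\hat\theta).
\end{aligned}
```

Both estimators are consistent for the same R when the model is correctly specified, which is why a formula stated in terms of R alone leaves the choice open.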

statsmodels/statsmodels issue #8803