statsmodels / statsmodels

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
BSD 3-Clause "New" or "Revised" License

ENH: chisquare diagnostic test for models #3904

Open josef-pkt opened 7 years ago

josef-pkt commented 7 years ago

Partition the (endog, exog) space and compute cell counts/proportions, then use a chi-square test as hypothesis test. The asymptotic covariance for the chi-square statistic is not "trivial".

Similar to the Hosmer-Lemeshow test, or the Pearson chi-square test with MLE based on grouped data, but with more general theory, i.e. it applies to arbitrary types of endog and exog (continuous or discrete).

The JoE article looks good, I haven't read the Econometrica article

Andrews, Donald W. K. 1988a. “Chi-Square Diagnostic Tests for Econometric Models: Theory.” Econometrica 56 (6): 1419–53. doi:10.2307/1913105.

———. 1988b. “Chi-Square Diagnostic Tests for Econometric Models.” Journal of Econometrics 37 (1): 135–56. doi:10.1016/0304-4076(88)90079-6.

Manjón, M., and O. Martínez. 2014. “The Chi-Squared Goodness-of-Fit Test for Count-Data Models.” Stata Journal 14 (4): 798–816. (No access to this, but that's where I got the Andrews reference from.)

Slides about the Stata implementation: https://www.stata.com/meeting/spain12/abstracts/materials/Majon_Martinez.pdf

Update: article available at https://journals.sagepub.com/doi/pdf/10.1177/1536867X1401400406

One application: specification test for count models based on observed versus predicted frequencies, but the Andrews JOE article has many example applications for distributional tests. The advantage is that it is very general: it allows for any asymptotically normal, sqrt(n)-consistent estimator, and it allows a wide range of methods for data-dependent partitioning of (y, x). That is, it would cover MLE, QMLE, M-estimators and GMM, as long as we can get predictive counts from the model. (Aside: there is also a draft version of GMM estimation of distributions for grouped data.)
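A minimal sketch of the observed versus predicted frequency comparison for a count model, assuming a Poisson model. The helper name `count_frequencies` and the simulated data are hypothetical, and the true mean stands in for fitted values. This only builds the two frequency vectors; it is not the Andrews test, which also needs the covariance that accounts for estimated parameters.

```python
import numpy as np
from scipy import stats


def count_frequencies(y, mu, max_count):
    """Observed and Poisson-predicted cell proportions (hypothetical helper).

    Cells are the counts 0..max_count plus one tail cell for all larger
    counts, so both returned vectors sum to one.
    """
    y = np.asarray(y)
    mu = np.asarray(mu, dtype=float)
    cells = np.arange(max_count + 1)
    # observed proportion of each count value
    observed = (y[:, None] == cells).mean(axis=0)
    # predicted probability of each count, averaged over observations
    predicted = stats.poisson.pmf(cells, mu[:, None]).mean(axis=0)
    # tail cell: everything above max_count
    observed = np.append(observed, (y > max_count).mean())
    predicted = np.append(predicted, stats.poisson.sf(max_count, mu).mean())
    return observed, predicted


rng = np.random.default_rng(0)
x = rng.normal(size=500)
mu = np.exp(0.5 + 0.3 * x)  # assumption: true mean used in place of fitted mean
y = rng.poisson(mu)
obs, pred = count_frequencies(y, mu, max_count=5)
print(np.round(obs - pred, 3))
```

The differences `obs - pred` are the raw ingredients of the test statistic; a data-dependent partition would replace the fixed `cells` here.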

#3887 follow-up, open issues when we have more count models to choose from, including diagnostics

#1288 binned gof test for Monte Carlo

#3897 Vuong test comparing non-nested models

https://github.com/statsmodels/statsmodels/issues/2041#issuecomment-249098368 Hosmer-Lemeshow related

josef-pkt commented 7 years ago

One thing that's not clear to me: the df for the chi-square test is rank(covariance of test variable/frequencies), see JOE article page 139 and footnote 1. If the estimator is asymptotically equivalent to an estimator that minimizes the chi-square test statistic, then df is reduced by the number of parameters (that should already be reflected in the rank).
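A small sketch of "df = rank of the covariance", assuming the simplest case of fully specified multinomial probabilities with no estimated parameters. The function name `chisquare_rank` is hypothetical. The covariance of the cell frequencies is singular, so the quadratic form uses a generalized inverse and the rank gives the df; in this case the result should coincide with the usual Pearson statistic.

```python
import numpy as np
from scipy import stats


def chisquare_rank(diff, cov, nobs):
    """Quadratic form nobs * diff' cov^+ diff with df = rank(cov)."""
    diff = np.asarray(diff, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Moore-Penrose inverse handles the singular covariance
    stat = nobs * diff @ np.linalg.pinv(cov) @ diff
    df = np.linalg.matrix_rank(cov)
    pvalue = stats.chi2.sf(stat, df)
    return stat, df, pvalue


p = np.array([0.2, 0.3, 0.5])        # hypothesized cell probabilities
counts = np.array([30, 25, 45])      # made-up observed counts
nobs = counts.sum()
cov = np.diag(p) - np.outer(p, p)    # multinomial covariance, rank J - 1

stat, df, pvalue = chisquare_rank(counts / nobs - p, cov, nobs)
pearson = ((counts - nobs * p) ** 2 / (nobs * p)).sum()
print(stat, pearson, df)
```

With estimated parameters the covariance changes and its rank can drop further, which is the open question in the comment above.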

A property of GLM with canonical link is that the ~aggregate frequency~ (edit: correction) conditional expectation/mean corresponds to the observed ~frequencies~ mean for categorical dummies. In this case the difference between observed and predicted frequency is zero. This should show up in the distribution of the test statistic, maybe it affects only the rank and df. Edit: Cameron and Trivedi count book, 2nd ed., p. 194, bottom of page: rank(V) < J - 1 in binary and multinomial models.

Example: a saturated model with only categorical predictors has Pearson chi-square equal to zero in GLM/LEF. I guess: the number of parameters is equal to the number of cells and df = 0. We only get a non-zero test statistic and df > 0 if we have more cells than parameters. That would be similar to the fact that we cannot use the GMM J-test for specification testing if we don't have overidentifying restrictions.
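To illustrate the canonical link property, here is a sketch using a plain NumPy Newton fit of a Poisson model with log link (not the statsmodels GLM code; `fit_poisson` and the simulated data are hypothetical). Because the score is X'(y - mu) = 0 at the MLE, the observed and predicted means agree within each dummy category, even with an additional continuous regressor.

```python
import numpy as np


def fit_poisson(y, X, maxiter=50, tol=1e-10):
    """Poisson MLE with log link by Newton-Raphson (illustrative only)."""
    # starting values from least squares on log(y + 0.5)
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(maxiter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        hess = (X * mu[:, None]).T @ X   # negative Hessian
        step = np.linalg.solve(hess, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


rng = np.random.default_rng(12345)
nobs = 300
group = rng.integers(0, 3, size=nobs)
dummies = np.eye(3)[group]               # full set of category dummies
x = rng.normal(size=nobs)
X = np.column_stack([dummies, x])
y = rng.poisson(np.exp(dummies @ np.array([0.2, 0.6, 1.0]) + 0.3 * x))

beta = fit_poisson(y, X)
mu_hat = np.exp(X @ beta)

# observed and predicted mean within each category
obs_mean = np.array([y[group == g].mean() for g in range(3)])
pred_mean = np.array([mu_hat[group == g].mean() for g in range(3)])
print(obs_mean - pred_mean)
```

A test that partitions only on these categories and compares means would therefore have nothing left to detect, which is consistent with the reduced rank and df discussed above.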

josef-pkt commented 4 years ago

I got the Manjón, M., and O. Martínez. 2014 Stata Journal article. It's gof for count models. Their package and function should be good for unit tests.

They use a CMT with an OPG-like variance estimate. OPG or artificial regression for it will be easier to implement in some models where we don't have cov for prediction and/or the full hessian for the expanded moment conditions, e.g. ordinal OrderedModel. Main references besides the original Andrews articles are the books by Cameron and Trivedi and a working paper by Greene about excess zeros.

Theoretical aside: they mention that using a standard gof test on binned frequencies ignores that parameters are estimated.

Quick check of Greene's working paper (I downloaded the SSRN version in 2017): section 3.1.2 Specification Testing mentions testing for excess zeros, referencing Mullahy 1986.

I don't remember what I had already read and used. The zero inflation specification test in discrete._diagnostics_count mentions Jansakul and Hinde 2009. The second function has _brock as postfix in its name, no reference there. Unit tests for those are not verified with another package.

josef-pkt commented 4 years ago

I already have the OPG with auxiliary regression version similar to Manjón, M., and O. Martínez. 2014 in discrete._diagnostics_count.test_chisquare_prob

The docstring doesn't include references, I only had the slides of MM and the original Andrews article AFAIR.

That module was added as part of count models in #3908. I wrote it mainly to evaluate and get some checks on the newly implemented count models.

Unit tests for it are only regression tests.
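For readers unfamiliar with the OPG auxiliary regression form of the conditional moment test, a rough standalone sketch for a Poisson model follows. It is written in plain NumPy/SciPy and is not the code of `test_chisquare_prob`; `fit_poisson` and `chisquare_opg_poisson` are hypothetical names. The idea: regress a vector of ones on the per-observation moment conditions (count indicators minus predicted probabilities) and the score, then use n times the uncentered R-squared as the chi-square statistic.

```python
import numpy as np
from scipy import stats


def fit_poisson(y, X, maxiter=50, tol=1e-10):
    """Poisson MLE with log link by Newton-Raphson (illustrative only)."""
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(maxiter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve((X * mu[:, None]).T @ X, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def chisquare_opg_poisson(y, X, beta, max_count):
    """OPG version of the CM test on count frequencies 0..max_count."""
    mu = np.exp(X @ beta)
    cells = np.arange(max_count + 1)
    # moment conditions: indicator of each count minus predicted probability;
    # the tail cell (y > max_count) is left out to avoid collinearity
    indicators = (y[:, None] == cells).astype(float)
    moments = indicators - stats.poisson.pmf(cells, mu[:, None])
    # per-observation score of the Poisson log-likelihood
    scores = (y - mu)[:, None] * X
    H = np.column_stack([moments, scores])
    ones = np.ones(len(y))
    coef = np.linalg.lstsq(H, ones, rcond=None)[0]
    fitted = H @ coef
    # n * uncentered R-squared equals the explained sum of squares here,
    # because the dependent variable is a vector of ones
    stat = fitted @ fitted
    df = moments.shape[1]
    return stat, df, stats.chi2.sf(stat, df)


rng = np.random.default_rng(987)
nobs = 400
x = rng.normal(size=nobs)
X = np.column_stack([np.ones(nobs), x])
y = rng.poisson(np.exp(0.5 + 0.3 * x))

beta = fit_poisson(y, X)
stat, df, pvalue = chisquare_opg_poisson(y, X, beta, max_count=4)
print(stat, df, pvalue)
```

The attraction of this form is that it only needs `score_obs`-type quantities and predicted probabilities, not the Hessian of the expanded moment conditions. The OPG variance is known to be less reliable in small samples, so results from such a sketch should be checked against a reference implementation.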

It looks like it applies or can be extended to any model with integer valued endog. It needs extensions if we want to use it for binned continuous endog, replacing endog = arange by interval membership.
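A sketch of what "interval membership" could look like for binned continuous endog, assuming a normal linear model with known parameters for the example. The helper `interval_moments` and its `cdf` callback are hypothetical, not an existing statsmodels interface. The indicator `endog == arange` is replaced by membership in the intervals defined by interior cut points, and the predicted probabilities come from differences of the model cdf.

```python
import numpy as np
from scipy import stats


def interval_moments(y, bin_edges, cdf):
    """Interval indicators minus predicted interval probabilities.

    bin_edges are interior cut points e_0 < ... < e_m; the cells are
    (-inf, e_0], (e_0, e_1], ..., (e_m, inf).
    cdf(q) returns the model cdf at the points q for every observation,
    with shape (nobs, len(q)).
    """
    y = np.asarray(y, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    k_cells = len(edges) + 1
    # index of the cell that contains each observation
    idx = np.searchsorted(edges, y, side="left")
    indicators = np.eye(k_cells)[idx]
    # pad the cdf with 0 and 1 for the two open-ended cells
    padded = np.column_stack([np.zeros(len(y)), cdf(edges), np.ones(len(y))])
    probs = np.diff(padded, axis=1)
    return indicators - probs, probs


rng = np.random.default_rng(2468)
nobs = 200
x = rng.normal(size=nobs)
loc = 1.0 + 0.5 * x                      # assumed conditional mean
y = loc + rng.normal(size=nobs)


def model_cdf(q):
    # normal cdf with observation-specific mean and unit scale
    return stats.norm.cdf(q[None, :], loc=loc[:, None], scale=1.0)


edges = np.array([-0.5, 0.5, 1.5, 2.5])
moments, probs = interval_moments(y, edges, model_cdf)
print(moments.mean(axis=0))
```

The columns of `moments` sum to zero across cells for every observation, so one cell has to be dropped before using them in an auxiliary regression.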

josef-pkt commented 3 years ago

choice of number and location of bins for binned tests like chisquare

Rolke, Wolfgang, and Cristian Gutierrez Gongora. 2020. “A Chi-Square Goodness-of-Fit Test for Continuous Distributions against a Known Alternative.” Computational Statistics, May. https://doi.org/10.1007/s00180-020-00997-x.

I didn't look at the article, but it has a good reference list. I found it because it is a recent citing article for

Williams, C. Arthur. 1950. “On the Choice of the Number and Width of Classes for the Chi-Square Test of Goodness of Fit.” Journal of the American Statistical Association 45 (249): 77–86. https://doi.org/10.2307/2280429.