GLM with "foreign" variance function

josef-pkt commented 10 years ago

examples, unit tests

do we get the same results as with HetGLS if family is gaussian?

essentially do WLS, or feasible WLS, with GLM.

(GLM has a lot of mix-and-estimate options, and I don't think we have tried most of the "weird" combinations yet.)
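
To make the WLS connection concrete, here is a minimal numpy sketch (not statsmodels code) of a Gaussian-type fit with identity link where the weights are 1 / V(mu) and are recomputed from the fitted mean. The helper name `iterated_wls`, the simulated data and the choice V(mu) = mu^2 are assumptions for illustration only. With a constant variance function the loop reduces to plain OLS; with a mean-dependent one it is an iterated feasible WLS.

```python
import numpy as np

def iterated_wls(y, X, varfunc, n_iter=50):
    """Identity-link mean; weights 1 / varfunc(mu) are refreshed each pass."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting values
    for _ in range(n_iter):
        mu = X @ beta
        w = 1.0 / varfunc(mu)                        # weights depend on the fitted mean
        sw = np.sqrt(w)
        # WLS step, written as OLS on rows rescaled by sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# simulated data (assumption): standard deviation proportional to the mean
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 3.0, size=n)
X = np.column_stack([np.ones(n), x])
mu_true = 1.0 + 2.0 * x
y = mu_true + mu_true * rng.normal(scale=0.1, size=n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_const = iterated_wls(y, X, lambda mu: np.ones_like(mu))  # constant variance
beta_mu2 = iterated_wls(y, X, lambda mu: mu ** 2)             # "foreign" V(mu) = mu^2

# quasi-score X' (y - mu) / V(mu), which should vanish at the fixed point
mu_hat = X @ beta_mu2
score_mu2 = X.T @ ((y - mu_hat) / mu_hat ** 2)
print(beta_ols, beta_const, beta_mu2)
```

A unit test along these lines could compare the constant-variance case against OLS/WLS and check that the weighted estimating equation is solved in the mean-dependent case.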

josef-pkt commented 9 years ago

I'm adding 0.7 milestone as a reminder. I'm doubtful we'll get there for 0.7

The Cameron and Trivedi count data book has a good summary of GLM/LEF with a non-native variance function.

Another application: fractional or proportion response. I'm just looking at the variance function in PR #2030 for beta regression, and it would be possible to reuse some iterative GLM solver to provide starting values for the Newton or pseudo-Newton solvers when they don't work automatically.

larger issue: providing GLM/LEF models that can match the first two moments (mean and variance) of any other model, either to get more robust estimation by providing starting values, or by providing robust estimation that only depends on these moments instead of full specification (QMLE).

same examples again:

- Logit or Probit QMLE for fractional response versus Beta Regression
- Poisson with dispersion function versus all the special count distributions (Generalized Poisson, Generalized Negative Binomial, PIG, ..., but maybe not the two-part, zero-inflated and hurdle models)

josef-pkt commented 9 years ago

another case: Poisson with approximately binomial variance function to approximate log-binomial. see #2215

Garrett M. Fitzmaurice, Stuart R. Lipsitz, Alex Arriaga, Debajyoti Sinha, Caprice Greenberg, and Atul A. Gawande. Almost efficient estimation of relative risk regression. Biostat (2014) 15 (4): 745-756, first published online April 4, 2014. doi:10.1093/biostatistics/kxu012 http://biostatistics.oxfordjournals.org/content/15/4/745.abstract

related: prior weights in GEE #2090. The difference: prior weights are fixed, while weights defined by the variance function depend on the mean (or on the data in general).

josef-pkt commented 9 years ago

It actually already works to change the variance attribute of the family instance.

The following Poisson regression replicates the params table of Binomial with log-link

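from statsmodels.genmod.generalized_linear_model import GLM
from statsmodels.genmod import families as family
from statsmodels.genmod.families import varfuncs

# data2: DataFrame with columns lenses and carrot2, defined elsewhere
fam = family.Poisson()  # log link is the Poisson default
fam.variance = varfuncs.Binomial(n=1)  # swap in the binomial variance function
mod_binom_fake = GLM.from_formula("lenses ~ carrot2", data2, family=fam)
res_binom_fake = mod_binom_fake.fit()
res_binom_fake.params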
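
One way to see why the swap works: with a log link and V(mu) = mu (1 - mu), the quasi-score sum x_i (y_i - mu_i) / (1 - mu_i) is the same as the score of a Bernoulli likelihood with mu = exp(x'b). Below is a minimal numpy sketch of that argument, not the statsmodels internals. The function names, the simulated data and the starting values are assumptions for illustration.

```python
import numpy as np

def irls_log_link(y, X, varfunc, n_iter=100):
    """Fisher scoring for E[y] = exp(X @ beta) with a pluggable variance function."""
    mu = (y + y.mean()) / 2.0             # simple start, stays inside (0, 1) for binary y
    eta = np.log(mu)
    for _ in range(n_iter):
        w = mu ** 2 / varfunc(mu)         # (dmu/deta)^2 / V(mu); dmu/deta = mu for log link
        z = eta + (y - mu) / mu           # working response
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)[0]
        eta = X @ beta
        mu = np.exp(eta)
    return beta

def var_poisson(mu):
    return mu

def var_binomial(mu):
    p = np.clip(mu, 1e-10, 1 - 1e-10)     # guard in case mu reaches 1 during iterations
    return p * (1 - p)

# simulated binary response with a log-linear probability (assumption)
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
y = rng.binomial(1, np.exp(-2.0 + 1.0 * x)).astype(float)

beta_pois = irls_log_link(y, X, var_poisson)
beta_fake = irls_log_link(y, X, var_binomial)   # Poisson mean, binomial variance

mu_p = np.exp(X @ beta_pois)
mu_b = np.exp(X @ beta_fake)
score_poisson = X.T @ (y - mu_p)                  # Poisson score at the Poisson fit
score_logbinom = X.T @ ((y - mu_b) / (1 - mu_b))  # log-binomial score at the "fake" fit
print(beta_pois, beta_fake)
```

If the log-binomial score is numerically zero at `beta_fake`, the "fake" fit solves the same estimating equations as Binomial with log link, which is what the params comparison above relies on.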

josef-pkt commented 8 years ago

I'm reading a bit of the R stats documentation

R glm with family quasi is pretty generic: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/family.html. Variance functions can be specified either from a predefined set or as a user function (?)

this might be a good case for unit tests. IIRC Fitzmaurice et al in earlier comment used R with this.

Also, R glm has a weights argument whose description sounds like Stata's aweights, i.e. weights as inverse variance or standard deviation of the observations (example in the R docs: a case/observation is the average over a group, so it should have variance inversely proportional to group size). I didn't see frequency weights in R's glm documentation.

josef-pkt commented 6 years ago

Where is my script file or notebook?

cross-ref #3856, which fixes scale and var_weights in GLM. This should be closely related: var_weights are user-provided and fixed in estimation, and not directly a function of the data. The variance function in family only has access to mu, not to any exog.

statsmodels / statsmodels

GLM with "foreign" variance function #1777