Open jseabold opened 12 years ago
A request for RLM weights. It looks like you can compare to MATLAB (?).
I just read this a few days ago:
Carroll, Raymond J., and David Ruppert. "Robust estimation in heteroscedastic linear models." The Annals of Statistics (1982): 429-441.
There are also articles for RLM, M-estimation, with AR(1) and with spatial errors.
So far I don't know what (prior) heteroscedasticity weights would mean in discrete models, or in the same models when estimated with GLM.
To check what MATLAB has: a robust option in curvefit (http://www.mathworks.com/help/curvefit/least-squares-fitting.html#bq_5kr9-4) and robust regression without a weights option (wfun is our norms, M) (http://www.mathworks.com/help/stats/robustfit.html).
GLM: https://groups.google.com/d/msg/pystatsmodels/QtSH8T47pZg/KYwJCrxD3eYJ
Stata and SAS use weights for loglikeobs: w_i * loglike_i.
Stata poisson only mentions fweights and pweights (and iweights), but doesn't have aweights. Stata glm also has aweights, but it's not clear how they are used.
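To make the w_i * loglike_i convention concrete, here is a minimal pure-Python sketch for Poisson counts with given means. The function names (poisson_loglikeobs, weighted_loglike) and the data are made up for illustration and are not statsmodels, Stata, or SAS API. It only checks the frequency-weight interpretation: with integer weights, the weighted sum should equal the log-likelihood of the data set in which each row is repeated w_i times.

```python
import math


def poisson_loglikeobs(y, mu):
    """Per-observation Poisson log-likelihood for counts y with means mu."""
    return [yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu)]


def weighted_loglike(y, mu, weights):
    """Weighted log-likelihood: sum_i w_i * loglike_i."""
    return sum(w * ll for w, ll in zip(weights, poisson_loglikeobs(y, mu)))


# Hypothetical data: three distinct observations with integer frequency weights.
y = [0, 2, 5]
mu = [0.5, 1.5, 4.0]
fweights = [3, 1, 2]

# Expand the data by repeating each row w_i times.
y_expanded = [yi for yi, w in zip(y, fweights) for _ in range(w)]
mu_expanded = [mi for mi, w in zip(mu, fweights) for _ in range(w)]

ll_weighted = weighted_loglike(y, mu, fweights)
ll_expanded = sum(poisson_loglikeobs(y_expanded, mu_expanded))

# The two values agree up to floating point rounding.
print(math.isclose(ll_weighted, ll_expanded))
```

This equivalence only covers the log-likelihood value under the fweights reading. It says nothing about how pweights or aweights should change the covariance of the estimates, which is the unclear part above.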
More on robust: some papers use weighted likelihood to discount influential observations (x outliers). Trimmed MLE uses 0-1 weights for loglike to cut outliers (the same as subset selection in this case).
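A small sketch of the 0-1 weights idea, assuming a normal model with fixed, known mean and scale so that no optimization is needed. The helper names (normal_loglikeobs, trimming_weights) and the data are hypothetical. A real trimmed MLE would alternate between estimating the parameters and choosing the subset; this only shows one trimming step and that 0-1 weights give the same log-likelihood as dropping the observations.

```python
import math


def normal_loglikeobs(y, mean, scale):
    """Per-observation normal log-likelihood with fixed mean and scale."""
    const = -0.5 * math.log(2 * math.pi * scale ** 2)
    return [const - 0.5 * ((yi - mean) / scale) ** 2 for yi in y]


def trimming_weights(loglikeobs, n_trim):
    """Return 0 for the n_trim observations with the smallest loglike, else 1."""
    order = sorted(range(len(loglikeobs)), key=lambda i: loglikeobs[i])
    dropped = set(order[:n_trim])
    return [0 if i in dropped else 1 for i in range(len(loglikeobs))]


# Hypothetical data: the last observation is an outlier.
y = [1.1, 0.9, 1.0, 1.2, 8.0]
llobs = normal_loglikeobs(y, mean=1.0, scale=1.0)

weights = trimming_weights(llobs, n_trim=1)

# Weighted log-likelihood with 0-1 weights ...
ll_trimmed = sum(w * ll for w, ll in zip(weights, llobs))

# ... equals the log-likelihood on the selected subset.
y_subset = [yi for yi, w in zip(y, weights) if w == 1]
ll_subset = sum(normal_loglikeobs(y_subset, mean=1.0, scale=1.0))

print(weights, math.isclose(ll_trimmed, ll_subset))
```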
To the last point: importance weights for Poisson and GLM, a question on Stack Overflow: http://stackoverflow.com/questions/28951982/using-weightings-in-a-poisson-model-using-statsmodels-module
GEE has weights, #2090
A Stack Overflow question asking for weights in GLM or Logit to compensate for an imbalanced sample: http://stackoverflow.com/questions/31661552/statsmodels-python-weighted-glm
In its interpretation, this might be similar to inverse probability weights, #2443 and #2442.
also related: using the variance function in GLM to introduce weights and heteroscedasticity #1777
Another similar question on Stack Overflow (imbalanced sample in Logit): http://stackoverflow.com/questions/33605979/statsmodels-logistic-regression-class-imbalance (by now I have figured out case weights in GLM Binomial a bit better).
I'm opening an issue specific to rare events and unbalanced samples.
Make sure weights are correctly handled throughout the models. This includes GLM, RLM, ANOVA, and the discrete choice models. I think it also might make sense to have weights objects. It might also be interesting to see how far we can get with those provided by PySAL, but I haven't spoken with their developers since the summer. Many of their estimators are just duplicating ours. We should make it easy for them to use our code.