A general interface for (penalized) linear models

alexpghayes commented 6 years ago

More and more I find myself using a bizarre mixture of lm, glm, glmnet, sandwich (HC1 standard errors), and lme4/rstanarm (mixed models) when working with a set of models that are only slightly different. If I were less lazy, grpreg would also feature heavily in this mix for its group lasso implementation.
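For concreteness, here is a minimal base-R sketch of the HC1 standard errors mentioned above, computed by hand from an lm fit. The dataset (mtcars) and the model formula are illustrative assumptions, not taken from this thread. If the sandwich package is installed, `sandwich::vcovHC(fit, type = "HC1")` should give the same matrix.

```r
# Fit an ordinary linear model on a built-in dataset (illustrative choice).
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)   # n x k design matrix
e <- residuals(fit)      # OLS residuals, one per row of X
n <- nrow(X)
k <- ncol(X)

# HC1 sandwich estimator:
#   (X'X)^-1  X' diag(e^2) X  (X'X)^-1, scaled by n / (n - k).
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # row i of X is scaled by e_i, so this is X' diag(e^2) X
vcov_hc1 <- n / (n - k) * bread %*% meat %*% bread

# Compare classical and heteroskedasticity-robust standard errors.
se_classical <- sqrt(diag(vcov(fit)))
se_hc1       <- sqrt(diag(vcov_hc1))
cbind(classical = se_classical, HC1 = se_hc1)
```

Even this small case needs a second package or a hand-rolled formula on top of lm, which is the kind of friction described above.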

Additionally, my impression is that packages for penalized regression have widely varying interfaces. See for example:

glmnet
grpreg
penalized

I'm less interested in the precise interface for model fitting, since I already have some opinions on how that should be done, and more interested in the various helpers for probing fitted models. In particular, I think it would be useful to look at (conceptually) overloaded operators (i.e. plot methods that plot different things for the same model) as a place to potentially uncover new modelling verbs that have been explicitly differentiated from the standard set of methods for probing models in R.

Related: what I presume to be the standard set of methods for probing models in R:

print
summary
plot
coef
residuals
predict

As a reference:

methods(class="lm")
#>  [1] add1           alias          anova          case.names    
#>  [5] coerce         confint        cooks.distance deviance      
#>  [9] dfbeta         dfbetas        drop1          dummy.coef    
#> [13] effects        extractAIC     family         formula       
#> [17] hatvalues      influence      initialize     kappa         
#> [21] labels         logLik         model.frame    model.matrix  
#> [25] nobs           plot           predict        print         
#> [29] proj           qr             residuals      rstandard     
#> [33] rstudent       show           simulate       slotsFromS3   
#> [37] summary        variable.names vcov          
#> see '?methods' for accessing help and source code
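
As a quick illustration of the six standard verbs listed above, here is a minimal sketch that applies each of them to a single lm fit. The dataset and formula are illustrative assumptions.

```r
fit <- lm(mpg ~ wt, data = mtcars)

print(fit)                 # the call and the estimated coefficients
s <- summary(fit)          # coefficient table, R^2, residual standard error
b <- coef(fit)             # named vector of estimates
r <- residuals(fit)        # one residual per observation
p <- predict(fit, newdata = data.frame(wt = c(2.5, 3.5)))  # predictions for new rows

# plot(fit) would draw the usual diagnostic plots; it is left commented out
# here so the sketch runs without opening a graphics device.
# plot(fit)
```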

alexpghayes commented 6 years ago

fitted
logLik

alexpghayes commented 6 years ago

Things that would be nice to do: likelihood ratio tests for mixed models
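
A minimal sketch of what such a test looks like today. It uses nlme rather than lme4 or rstanarm, purely because nlme ships with standard R installations; the dataset (Orthodont) and the two models are illustrative assumptions.

```r
library(nlme)  # recommended package, used here in place of lme4

# Two nested random-intercept models. Both are fitted by maximum likelihood
# (not REML) so their likelihoods are comparable when the fixed effects differ.
m0 <- lme(distance ~ age,       random = ~ 1 | Subject,
          data = Orthodont, method = "ML")
m1 <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
          data = Orthodont, method = "ML")

# Likelihood ratio statistic and its chi-squared p-value, computed by hand.
lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
df_diff <- attr(logLik(m1), "df") - attr(logLik(m0), "df")
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

# anova() on the two fits reports the same comparison.
anova(m0, m1)
```

The by-hand version only relies on logLik() returning a value with a "df" attribute, which suggests that a generic likelihood ratio helper could be built on logLik alone.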

alexpghayes commented 6 years ago

Why don't nls and lm have the same interface? Would it make sense to have access to these both at once?
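
To make the difference concrete, here is a small sketch that fits the same straight line with both functions. The dataset, parameter names (a, b) and starting values are illustrative assumptions.

```r
# lm: coefficients are implied by the formula.
fit_lm <- lm(mpg ~ wt, data = mtcars)

# nls: parameters must be named in the formula and given starting values.
fit_nls <- nls(mpg ~ a + b * wt, data = mtcars, start = list(a = 30, b = -5))

# The estimates agree up to numerical tolerance; only the names differ.
est_lm  <- unname(coef(fit_lm))
est_nls <- unname(coef(fit_nls))
all.equal(est_lm, est_nls, tolerance = 1e-6)

# Once fitted, both objects respond to the same verbs.
new_cars <- data.frame(wt = c(2.5, 3.5))
predict(fit_lm,  newdata = new_cars)
predict(fit_nls, newdata = new_cars)
```

The fitted objects can be probed in the same way, so the mismatch is mostly on the fitting side: how parameters are declared and whether starting values are required.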

alexpghayes commented 6 years ago

Would be good to make a list of methods that models should have and define their behavior as a community standard.
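
One way to start is a checker that reports, for a given fitted object, where each verb on the list comes from. The sketch below is hypothetical: `standard_verbs`, `method_source` and `check_verbs` are names made up for illustration, and the verb list is just the one collected in this thread.

```r
# Hypothetical candidate list, taken from the verbs mentioned in this thread.
standard_verbs <- c("print", "summary", "plot", "coef", "residuals",
                    "predict", "fitted", "logLik", "vcov")

# For one generic, report which class in the object's class vector supplies
# the S3 method, "default" if only a default method exists, or NA if neither.
method_source <- function(generic, object) {
  for (cls in c(class(object), "default")) {
    if (!is.null(getS3method(generic, cls, optional = TRUE))) {
      return(cls)
    }
  }
  NA_character_
}

# Apply the lookup to every verb on the list; the result is a named
# character vector with one entry per verb.
check_verbs <- function(object, verbs = standard_verbs) {
  vapply(verbs, method_source, character(1), object = object)
}

check_verbs(lm(mpg ~ wt, data = mtcars))
```

Finding a default method only shows that dispatch will succeed, not that the result is meaningful for that model class, so a real standard would also need to define the expected behavior of each verb.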

alexpghayes commented 6 years ago

car::linearHypothesis
anova (type I)
anova (type II)
anova (type III)
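
As a reminder of why the type matters, here is a small base-R sketch. The anova() method for lm computes sequential (type I) sums of squares, so the result for a term depends on where it appears in the formula. The dataset and formula are illustrative assumptions. Types II and III are not available in base anova(); `car::Anova()` provides them if the car package is installed.

```r
# Same model, two term orders.
fit_a <- lm(mpg ~ wt + hp, data = mtcars)
fit_b <- lm(mpg ~ hp + wt, data = mtcars)

# Sequential sum of squares attributed to wt in each ordering.
ss_a <- anova(fit_a)["wt", "Sum Sq"]   # wt entered first
ss_b <- anova(fit_b)["wt", "Sum Sq"]   # wt entered after hp

c(wt_first = ss_a, wt_last = ss_b)
```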

tomwenseleers commented 1 year ago

Well, and vcov methods (calculated numerically if no closed-form formula exists) and a formula interface, both of which would be needed for packages like glmnet to be supported by emmeans or marginaleffects (well, I think there is a formula interface if you use glmnet via caret).
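
A minimal sketch of what "calculated numerically" could mean, using an unpenalized logistic regression as a stand-in so that the result can be checked against the closed-form vcov(). The dataset, model and the `neg_loglik` helper are illustrative assumptions. Whether such a matrix is meaningful for a penalized fit is a separate question that this sketch does not address.

```r
# Logistic regression stands in for a model whose vcov() we pretend is missing.
fit <- glm(am ~ wt, data = mtcars, family = binomial())
X <- model.matrix(fit)
y <- mtcars$am

# Negative log-likelihood as a function of the coefficient vector.
neg_loglik <- function(beta) {
  -sum(dbinom(y, size = 1, prob = plogis(drop(X %*% beta)), log = TRUE))
}

# Numerically differentiated Hessian at the estimate; its inverse approximates
# the variance-covariance matrix of the coefficients.
H <- optimHess(coef(fit), neg_loglik)
vcov_numeric <- solve(H)

# Compare with the closed-form result from glm.
all.equal(unname(vcov_numeric), unname(vcov(fit)), tolerance = 1e-3)
```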

alexpghayes / modelling-in-r

A general interface for (penalized) linear models #26