Closed alexpghayes closed 6 years ago
Related: the many nice utilities in existing penalized regression packages, and how they would fit into a reimagined `glm` with penalized regression niceties.

Is `fitted` redundant, since `predict` does the same thing when `newdata = NULL` in most contexts?
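A quick way to probe the `fitted` versus `predict` question is to compare the two on a linear and a logistic model. This is only a sketch using base R and the built-in `mtcars` data; the model formulas are arbitrary choices for illustration.

```r
# Compare fitted() with predict() when no newdata is supplied.
lm_fit  <- lm(mpg ~ wt, data = mtcars)
glm_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Linear model: the two calls return the same values.
lm_same <- isTRUE(all.equal(fitted(lm_fit), predict(lm_fit)))

# Logistic regression: predict() defaults to type = "link" (log-odds),
# while fitted() is on the response (probability) scale.
glm_same_default  <- isTRUE(all.equal(fitted(glm_fit), predict(glm_fit)))
glm_same_response <- isTRUE(all.equal(
  fitted(glm_fit),
  predict(glm_fit, type = "response")
))

print(c(
  lm           = lm_same,
  glm_default  = glm_same_default,
  glm_response = glm_same_response
))
```

For `lm` the two agree, but for `glm` they only agree once `type = "response"` is passed, which is one case where "most contexts" does not hold.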
Moving to main text
My thoughts up to this point have focused on a grammar of fitting models. I am increasingly interested in a grammar of interrogating models. In particular, I think that `broom` begins to provide a set of verbs that make it convenient to interrogate models programmatically. However, I think that there's a lot more to be done, especially to define the conceptual interactions you want to have with a fitted model object.
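For reference, the three core `broom` verbs look roughly like this. The sketch assumes the `broom` package is installed; the model and the `mtcars` data are placeholders chosen for illustration.

```r
library(broom)

fit <- lm(mpg ~ wt + hp, data = mtcars)

coefs     <- tidy(fit)     # one row per coefficient: term, estimate, std.error, ...
model_row <- glance(fit)   # one row per model: r.squared, AIC, BIC, ...
obs       <- augment(fit)  # one row per observation: .fitted, .resid, ...

print(coefs)
print(model_row)
print(head(obs))
```

Each verb returns a data frame at a different level (coefficient, model, observation), which is what makes the output easy to work with programmatically.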
As one example, consider the workflow of something like `astsa::sarima`, where you always get a massive amount of information about: (1) convergence, (2) correct specification via residual analysis (residual ACF, QQ plot, Ljung-Box p-values), (3) metrics like AIC, AICc, BIC, etc., and (4) a visual of the model fit. This is great to work with interactively since you immediately know if you fit a good model. On the other hand, predicting and programmatically interacting with `astsa::sarima` output is like pulling teeth because, to forecast for example, you have to recreate an `astsa::sarima.for` call with the same input.

As another example of the tension between these two approaches, consider the general landscape for linear modelling, including penalized regression packages. The output for `cv.glmnet` and `glm` methods is drastically different despite researchers being interested in the same information in many cases (the coefficients, for example). This suggests a number of different interrogation verbs may be necessary:

- [...] `rms`)
- a `run_all_of_the_diagnostics` metaverb to make interactive work convenient so you don't have to go back and forth a whole bunch
- a `pick_the_best(model1, model2, ..., modeln, metrics = "AICc")` utility that returns the best model object? Or is an intermediate comparison object needed? That seems more likely.

These may have both numerical and visual summaries associated with them.
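To make the "intermediate comparison object" idea concrete, here is a rough base R sketch. Everything in it is hypothetical: `aicc`, `compare_models`, and the `model_comparison` class are names invented for illustration, and `pick_the_best` here takes the comparison object rather than the models directly. It also assumes every model supports `logLik`, `nobs`, `AIC`, and `BIC`.

```r
# Hypothetical helper: small-sample corrected AIC for a fitted model.
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")  # number of estimated parameters
  n <- nobs(fit)                # number of observations used in the fit
  AIC(fit) + (2 * k * (k + 1)) / (n - k - 1)
}

# Hypothetical helper: an intermediate comparison object that keeps the
# fitted models alongside a table of metrics.
compare_models <- function(...) {
  models <- list(...)
  nms <- names(models)
  if (is.null(nms)) nms <- rep("", length(models))
  unnamed <- which(nms == "")
  nms[unnamed] <- paste0("model", unnamed)  # fill in missing names
  names(models) <- nms

  metrics <- data.frame(
    model = nms,
    AIC   = vapply(models, AIC, numeric(1)),
    AICc  = vapply(models, aicc, numeric(1)),
    BIC   = vapply(models, BIC, numeric(1)),
    row.names = NULL
  )
  structure(
    list(models = models, metrics = metrics),
    class = "model_comparison"
  )
}

# Hypothetical verb: return the model that minimizes the chosen metric.
pick_the_best <- function(comparison, metrics = "AICc") {
  best <- which.min(comparison$metrics[[metrics[1]]])
  comparison$models[[best]]
}

cmp <- compare_models(
  small = lm(mpg ~ wt, data = mtcars),
  large = lm(mpg ~ wt + hp + qsec, data = mtcars)
)
print(cmp$metrics)

best <- pick_the_best(cmp, metrics = "AICc")
print(formula(best))
```

Keeping the comparison object separate from the "pick" step means the same object could later grow `print` or `plot` methods for the numerical and visual summaries, without refitting anything.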