Closed (matthieugomez closed this issue 5 years ago)
This is a very interesting idea. I assume you're restricting this to tidy outputs, typically from regressions of some form, that provide `term`, `estimate`, etc. for each coefficient (many objects don't; see `glmnet_tidiers` as an example). Besides the dependent variable, it looks like stargazer extracts other information, such as what type of regression it is. It would probably have to take the model itself, either along with the tidied output or with some kind of `use_tidy` option.
I understand how it would interact with stargazer but not with xtable. In some sense, isn't the output of `tidy` already compatible with xtable (just without useful column names):
library(broom)
library(xtable)
mod <- lm(mpg ~ wt, mtcars)
xtable(tidy(mod))
I actually don't really know `xtable`, so I'm glad if it works out of the box!
Another reason broom would be a better input is that stargazer requires keeping the whole S3 object just to print a table, which is absurd (for instance, this SO question).
A third reason is that `broom` makes it easy to modify some statistics by hand (like df, etc.), or to add new ones if the columns of `glance` are printed as rows at the bottom of a table.
By default, however, `glance` should probably add the name of the dependent variable, the name of the model, and the number of observations (something like `n.obs`).
@matthieugomez I recently ran into an issue similar to the one you described above (i.e. running into memory problems when trying to export multiple regression objects via stargazer). In my case, a fairly elegant solution is to first convert the *lm object into a "coeftest" class using the lmtest package, as I describe here.
@dgrtwo A related issue that I noticed as a result: `broom`'s `tidy` function can take quite a while to run on large *lm objects. However, `lmtest`'s `coeftest` function works virtually instantaneously. The good news is that broom accepts coeftest objects, so I can still benefit from the increased speed by using: `my_lm_object %>% lmtest::coeftest %>% broom::tidy()`.
A minor downside to this approach is that the coeftest object doesn't preserve confidence intervals (although these can obviously be calculated easily by hand if you know the sample size / degrees of freedom). However, perhaps you can still use the coeftest code to speed things up?
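As a rough illustration of the "by hand" calculation mentioned above, here is a minimal sketch that rebuilds 95% confidence intervals from tidied coeftest output. It assumes a plain `lm` fit whose residual degrees of freedom are available via `df.residual()`; the `mtcars` model and the names `ct` and `crit` are only placeholders for illustration, not something from the thread.

```r
library(broom)
library(lmtest)

# Example model; any lm fit should work the same way
mod <- lm(mpg ~ wt, data = mtcars)

# Tidy the lightweight coeftest object instead of the full lm object
ct <- tidy(coeftest(mod))

# Assumption: t-based intervals using the residual degrees of freedom of the fit
crit <- qt(0.975, df.residual(mod))
ct$conf.low  <- ct$estimate - crit * ct$std.error
ct$conf.high <- ct$estimate + crit * ct$std.error

ct
```

For an ordinary `lm` fit these bounds should agree with `confint(mod)`; with a robust covariance matrix passed to `coeftest()` they would differ, which is often the point.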
Speeding up `tidy.lm()` seems like a reasonable goal. Do either of you have an example of a particularly cumbersome calculation we could work with?
I know `huxtable` uses broom as a backend, but I'm not really sure how to move forward with `stargazer` and `xtable`. It seems like this is really a feature request to those packages to deal nicely with `broom` output? If those packages need the model name, number of observations, etc. in `glance` output, I think that's a separate discussion. I'd be happy to entertain those as a feature request, but I don't think it's worth the work until it's become a limiting factor to moving forward somewhere else.
I think an interesting direction would be for xtable / stargazer to support the output of `tidy` and `glance`. For now, these packages don't use multiple dispatch and they support a relatively small set of statistical commands. But `tidy` and `glance` correspond exactly to the kind of operations needed to print results from different models consistently.
If this is implemented, for a given S3 object, writing a `tidy` and `glance` method would then directly make it compatible with xtable/stargazer.
I think the only element missing to print a table from `tidy` and `glance` is the name of the dependent variable.
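To make the proposal concrete, here is a minimal sketch of how a table could be assembled from `tidy` and `glance` output plus the dependent variable name. It is not how xtable or stargazer work; the layout, the `tab` object, and the choice of summary rows are hypothetical, and the dependent variable is assumed to be the left-hand side of the model formula.

```r
library(broom)

mod <- lm(mpg ~ wt, data = mtcars)

coefs <- tidy(mod)    # one row per coefficient
stats <- glance(mod)  # one row of model-level statistics

# Assumption: the response is the left-hand side of the model formula
dep_var <- deparse(formula(mod)[[2]])

# Hypothetical layout: coefficient rows on top, summary rows at the bottom
body <- data.frame(
  row = coefs$term,
  value = sprintf("%.3f (%.3f)", coefs$estimate, coefs$std.error),
  stringsAsFactors = FALSE
)
footer <- data.frame(
  row = c("Dependent variable", "Observations", "R-squared"),
  value = c(dep_var, nobs(mod), sprintf("%.3f", stats$r.squared)),
  stringsAsFactors = FALSE
)

tab <- rbind(body, footer)
tab
```

Because only `tidy()`, `glance()`, `formula()`, and `nobs()` are called, the same steps would in principle apply to any model class that provides those methods.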