tidymodels / broom

Convert statistical analysis objects from R into tidy format
https://broom.tidymodels.org

Show estimates for random effects in lme4::glmer and lme4::lmer #223

Closed GuiMarthe closed 6 years ago

GuiMarthe commented 7 years ago

Hey folks, I just saw that using broom with lme4 objects does not offer a way to inspect the estimates for random effects. Is this correct, or am I missing something?

If possible and necessary, I'd like to help develop these functions. What would I need to do? Is there some sort of developers guide for the broom package?
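For context, here is a minimal sketch of the kind of output being asked for. lme4 itself exposes the random-effect estimates (conditional modes) through `ranef()`, and its `as.data.frame()` method reshapes them into a long table. This uses only lme4 and its bundled `sleepstudy` data; it is an illustration of the desired shape, not a broom API.

```r
# Sketch only: needs the lme4 package; the block is skipped if it is missing.
if (requireNamespace("lme4", quietly = TRUE)) {
  # Random intercept and slope for each Subject
  fit <- lme4::lmer(Reaction ~ Days + (Days | Subject),
                    data = lme4::sleepstudy)

  # Conditional modes of the random effects, with conditional variances
  re <- lme4::ranef(fit, condVar = TRUE)

  # Long format: one row per grouping level and term
  re_long <- as.data.frame(re)
  print(head(re_long))
}
```

Each row of `re_long` describes one term for one level of the grouping factor, which is roughly what a `tidy()` method for random effects would be expected to return.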

bbolker commented 7 years ago

Sorry to respond to this so late.

There are a lot of outstanding requests for mixed model development. There's also a lot of ambiguity about exactly what should be done: see this list, especially #38 and #96.

I've been doing a lot of work on my fork (bbolker/broom). If you're interested I can add you as a collaborator.

dgrtwo commented 7 years ago

Hello @bbolker,

I'm sorry I have been very slow to respond to issues regarding mixed model tidying, and thank you for your excellent work on the topic.

There's a proposal I've been considering for a few weeks. broom has become bloated in terms of the number of packages it tidies (e.g. it has 49 in SUGGESTS in DESCRIPTION), making it very challenging for me to maintain, especially for ones like mixed models with which I am mostly unfamiliar from a statistical perspective.

I'd like to split off development of mixed-model-related functions (at least lme4_tidiers and nlme_tidiers, as well as all the work you've done in your fork) into a separate package, of which you would be the official maintainer. This is similar to what tidytext does for tidying methods related to text mining. The existing tidying methods in broom would then offer a deprecation warning (eventually an error) pointing towards your package.
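A rough sketch of how such a deprecation warning might look, using base R's `.Deprecated()`. The function name `tidy_merMod_legacy` and the message wording are hypothetical placeholders, not broom's actual code.

```r
# Hypothetical stand-in for an existing broom method; not the real implementation.
tidy_merMod_legacy <- function(x, ...) {
  .Deprecated(msg = paste(
    "Mixed-model tidiers are moving out of broom;",
    "please use the tidy() method from the split-off package instead."
  ))
  # During the transition the existing tidying logic would still run here.
  # An empty placeholder result is returned for illustration.
  data.frame(term = character(0), estimate = numeric(0))
}

# Callers get a warning but still receive a result.
result <- withCallingHandlers(
  tidy_merMod_legacy(NULL),
  warning = function(w) {
    message("warning raised: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Turning the warning into an error later would only require swapping `.Deprecated()` for `.Defunct()` or `stop()`.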

This would be greatly appreciated since it would keep me from needing to serve as an (underqualified) gatekeeper, and let you and other experts develop the best methods for tidying these types of models. As a name for this package, in deference to your previous conventions, I would humbly suggest bbbroom.

If this makes sense to you in theory, please let me know so we can talk next steps (I would open a new issue for it).

bbolker commented 7 years ago

Splitting off certainly makes sense.

I don't want to call it bbbroom (I wouldn't have named bbmle that way if I'd known it would still be being used 10 years later). Maybe mixedbroom ? or broommixed? or broommm? (If broom sub-packages are called broomXXX they'll be sorted in a sensible way in package lists ...) (broomMM would be good but I agree with Hadley's advice about keeping package names all-lowercase.)

Naming, and a few other issues I have thoughts on, can/should be better discussed in a new issue you open [for my own reminder: (1) consistency/interaction with 'base broom' methods like process_lm; (2) decision-making about conventions/options for extracting different components; (3) overlap/linkage between MCMC tidiers and specifically GLMM-related MCMC tidiers ...]

dgrtwo commented 7 years ago

@bbolker Excellent, I'll follow up with a new issue and look forward to starting the process.

Final naming decision will be 100% yours, but for inspiration I've opened it up to Twitter.

bbolker commented 7 years ago

bump. Is there a new issue? I'd like to settle on a name. Design questions can probably be discussed as broad issues on the repo for the new package.

nutterb commented 7 years ago

I have a preference for broom.mixed as a package name. In general, I think keeping with a naming convention of broom.* for any break off packages makes sense.

Addendum (otherwise known as I've been thinking about this too much the past few weeks)

I rather like the idea of breaking broom up into several packages. My recommendation for it would be to have, at least, the following:

The key advantages would be that broom could be a stable package with as few dependencies as possible that primarily exists to provide the method definition. With fewer dependencies, it might be easier to get other package authors to pull some of the content out of broom.misc and into the packages that create the objects being tidied.
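To make the "method definition" idea concrete, here is a minimal base-R sketch of the split, assuming a core package that owns only the S3 generic and a separate (hypothetical) extension package that supplies a method. The `tidy.lm` below is a simplified illustration, not broom's implementation.

```r
# In the lightweight core package: only the generic lives here.
tidy <- function(x, ...) UseMethod("tidy")

# In an extension package (hypothetical): a method for one model class.
tidy.lm <- function(x, ...) {
  coefs <- coef(summary(x))
  data.frame(
    term      = rownames(coefs),
    estimate  = coefs[, "Estimate"],
    std.error = coefs[, "Std. Error"],
    statistic = coefs[, "t value"],
    p.value   = coefs[, "Pr(>|t|)"],
    row.names = NULL
  )
}

# The generic dispatches on the class of the fitted model.
fit <- lm(mpg ~ wt, data = mtcars)
out <- tidy(fit)
print(out)
```

Because the core package would carry no modelling dependencies, any package that creates a model class could register its own `tidy` method against it.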

The obvious disadvantage is that this would murder backward compatibility.

An alternative might be to make a broom.base (which accomplishes the first bullet) and leave everything else in broom, with a Depends on broom.base. The disadvantage here is such a release would have to be coordinated with the break-off packages, or released before they are.

## Package for the primary `broom` package.
library(dplyr)
installed.packages() %>% 
  as.data.frame(stringsAsFactors = FALSE) %>% 
  filter(Priority %in% "base") %>% 
  select(Package, Priority)

bbolker commented 7 years ago

Some thoughts:

So after all that ... broom.mixed might be pretty good. @gavinsimpson: broom.gam or broom.mgcv ?

GuiMarthe commented 7 years ago

Please excuse my absence, work and college are not fun :smile: .

I think broom.mixed is a great name. However, I think the naming of "broom tidier spin-offs" should be a principled decision so that it generalizes to newer packages.

Why not settle for broom.<packagename>? I mean, the name broom.lme4 really conveys the idea of broom functions for the lme4 package.

bbolker commented 7 years ago

I agree that broom<whatever> packages should be named in a principled/generalized way.

At least in my opinion/according to my understanding, the new package is supposed to have a broader scope than lme4. There are currently tidiers for lme4, nlme, glmmADMB, glmmTMB, MCMCglmm, and brms, and there could be others - the point is that all mixed-model packages will have a similar set of issues and challenges to deal with.

I could cc: the discussion to maintainers of these packages as well (@paul-buerkner is the only one who I know is active on this repo).

nutterb commented 7 years ago

Seems like "principled/generalized" gets squishy pretty quick.

Attempting to create a broom.package for all of the packages that have tidiers would result in at least 60 new packages to add to CRAN. Maintaining that many packages could be a full time job for at least three people; I don't think it is feasible in the realm of "volunteer effort."

Another point to consider: if we were going to make a broom.package that only tidied package objects, I would find it preferable to first ask the maintainer if he or she were willing to adopt the tidiers directly into package. Putting the tidiers into a separate package probably ought to be Plan B (and I think @dgrtwo has made a similar statement in the past...yup, here). Even then, it would be preferable to put it into a package with other tidiers where it can be actively maintained.

However, in cases like mixed models, there are several model types that--as @bbolker has observed-- have similar challenges and will likely benefit from a shared code base.

GuiMarthe commented 7 years ago

Then, at least by the looks of it, we have a pretty hard case in favor of broom.mixed. So far, the list of packages that broom.mixed should support is:

Am I missing anything?

paul-buerkner commented 7 years ago

I think the rstanarm tidier is missing (?)

bbolker commented 7 years ago

possibly: R-INLA (although it might be a nightmare); MASS::glmmPQL (might need to be handled separately); ordinal; blme (may already be handled by lme4 tidiers?); gamm4 (in mixed.gam?); spaMM; coxme ...

But I don't think we necessarily need to decide now. We can collect the most common ones and then have an open issue for requests for package coverage.

hughjonesd commented 7 years ago

Excuse me for being late to this party. As a consumer of broom, I really hope that decentralization doesn't weaken the key value, which is to get information about regressions and tests in a predictable form. In particular, it is good to have predictable column names. So, I would prefer decentralization with a relatively strong base package, which provides constraints on what tidy and friends will return.

nutterb commented 7 years ago

In a decentralized model, it is really hard to enforce this, but we already have https://github.com/tidyverse/broom#conventions as a resource for how to name columns.
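One lightweight way a split-off package could guard against drift is a column-name check in its tests. The sketch below is hypothetical: `check_tidy_columns` is not part of broom, the allowed names are a subset of the documented conventions, and the data values are made up for illustration.

```r
# Column names taken from broom's documented conventions for tidy() output.
expected_cols <- c("term", "estimate", "std.error", "statistic",
                   "p.value", "conf.low", "conf.high")

# Hypothetical helper: fail if a tidier returns a non-conventional column.
check_tidy_columns <- function(df, allowed = expected_cols) {
  unknown <- setdiff(names(df), allowed)
  if (length(unknown) > 0) {
    stop("non-conventional column(s): ", paste(unknown, collapse = ", "))
  }
  invisible(df)
}

# Illustrative inputs (values are invented).
good <- data.frame(term = "wt", estimate = -5.3, std.error = 0.56)
bad  <- data.frame(term = "wt", coef = -5.3)

check_tidy_columns(good)  # passes silently
res <- tryCatch(check_tidy_columns(bad),
                error = function(e) conditionMessage(e))
print(res)
```

A check like this does not enforce anything across packages, but it lets each maintainer opt in to the shared conventions.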

bbolker commented 6 years ago

I believe the particular issue that started this thread is resolved now: see https://github.com/bbolker/broom.mixed/issues/1 ... I'd also encourage further discussion relating to broom.mixed to move to https://github.com/bbolker/broom.mixed/issues/ ...
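For readers landing here, a minimal sketch of how the original request can be met with broom.mixed, assuming both lme4 and broom.mixed are installed. The `effects = "ran_vals"` argument asks for the per-group random-effect estimates; check the package documentation for the options available in your version.

```r
# Sketch only: needs lme4 and broom.mixed; skipped if either is missing.
if (requireNamespace("lme4", quietly = TRUE) &&
    requireNamespace("broom.mixed", quietly = TRUE)) {
  fit <- lme4::lmer(Reaction ~ Days + (Days | Subject),
                    data = lme4::sleepstudy)

  # One row per grouping level and random-effect term
  ran_vals <- broom.mixed::tidy(fit, effects = "ran_vals")
  print(head(ran_vals))
}
```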

alexpghayes commented 6 years ago

Closing in favor of the issues linked above if the issue hasn't already been resolved.

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.