Closed bbolker closed 6 years ago
I played around quickly with some of the glmmTMB tidiers using some code written under an older version (v. 0.0.1, 83dbbf6 or something close). I really liked the new tibble return, and I agree this is the right way to go. I did however find a few potential issues, ranked in order of concern (subjectively to my own typical use cases):
It looks like the "component" column has been dropped in tidy.glmmTMB
. This matters for zero-inflated models where it's not clear which (Intercept)
is which when both the zi and conditional components fit an intercept. Maybe also a problem for dispersion models?
> tidy(zinb_seedcons)
# A tibble: 12 x 7
effect group term estimate std.error statistic p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 fixed NA (Intercept) -1.41 0.693 -2.04 0.0412
2 fixed NA dens_appcom:species_appcomANAR -0.00210 0.0297 -0.0708 0.944
3 fixed NA (Intercept) -1.10 0.479 -2.31 0.0212
4 fixed NA distance 0.231 0.0896 2.58 0.00982
5 ran_pars colony sd__(Intercept) 0.639 NA NA NA
6 ran_pars colony sd__(Intercept) 0.847 NA NA NA
There appears to be some unresolved renaming/cleaning of column names in the tidy.glmmTMB
function. Consider the same simple mixed model fit to estimate seed germination rates and tidy'd with
v.0.0.1 83dbbf6:
> tidy(germ_rate)
effect component term estimate std.error statistic p.value group
1 fixed cond target_spANAR -2.3461390 0.08697922 -26.973557 3.019839e-160 fixed
2 fixed cond target_spCEME -0.5088926 0.08141895 -6.250296 4.096748e-10 fixed
3 fixed cond target_spCLPU -1.7376369 0.08375916 -20.745635 1.342404e-95 fixed
4 ran_pars <NA> sd_(Intercept) 0.5525971 NA NA NA tag
v.0.2.0 7d6b62c:
> tidy(germ_rate)
# A tibble: 7 x 7
effect group term estimate std.error statistic p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 fixed NA target_spANAR -2.35 0.0870 -27.0 3.02e-160
2 fixed NA target_spCEME -0.509 0.0814 -6.25 4.10e- 10
3 fixed NA target_spCLPU -1.74 0.0838 -20.7 1.34e- 95
4 ran_pars tag sd__(Intercept) 0.553 NA NA NA
The group
column doesn't fill correctly on rows 1-6 (or at least doesn't fill as it's documented and how I would intuitively expect it to).
It may be worthwhile to add language to the augment.glmmTMB
documentation that explains additional columns that may appear. For example, a model fit with glmmTMB(..., weights = someColumn, ...)
will add a column called X.weights.
Cleaning up these naming conventions would be better as an issue for broom
. broom.mixed
could simply offer blanket text about other common model components being included as columns.
Formatting pedantry: Looks like an extra "_" snuck in for the random effect standard error in the "term" column in tidy.glmmTMB
. Could break some folks' downstream code.
It looks like the "component" column has been dropped in tidy.glmmTMB. This matters for zero-inflated models where it's not clear which (Intercept) is which when both the zi and conditional components fit an intercept. Maybe also a problem for dispersion models?
oops, fixed.
There appears to be some unresolved renaming/cleaning of column names in the tidy.glmmTMB function. ... The
group
column doesn't fill correctly on rows 1-6 (or at least doesn't fill as it's documented and how I would intuitively expect it to).
Do you mean rows 1-3? I switched that to be NA
on purpose. I think it makes just as much (if not more) sense that "fixed" ("fixed" isn't really a group), and it makes the code slightly cleaner since we can just compose the sub-tibbles for each effect (fixed, ran_pars, ran_vals, etc.), and then rely on bind_rows()
to insert NA
values appropriately. Do you object/see potential problems? Can you remind me where it's documented so that I can fix the documentation?
It may be worthwhile to add language to the augment.glmmTMB documentation that explains additional columns that may appear. For example, a model fit with glmmTMB(..., weights = someColumn, ...) will add a column called X.weights. Cleaning up these naming conventions would be better as an issue for broom. broom.mixed could simply offer blanket text about other common model components being included as columns.
Could you add this as a separate issue?
Formatting pedantry: Looks like an extra "_" snuck in for the random effect standard error in the "term" column in tidy.glmmTMB. Could break some folks' downstream code.
This was actually on purpose. I saw this in some of the brms
term names, and thought it could make it easier, downstream, if people had single underscores in some of their variable names and wanted to split the prefixes from the variable name e.g. with strsplit()
...
I don't think there are any release blocking items, but the more consistent broom
and broom.mixed
, the better off we'll both be. Might be worth taking a quick look at the adding new tidiers vignette. At some point it'll be good to pass the modeltests
tests, but that isn't on CRAN just yet.
@alexpghayes :
broom
is more important in this case than internal (code style) compatibility ...I'm just about ready to send to CRAN ...
Sent! (I wanted to get it over with.) Let's hope it wasn't premature ...
@dmenne @dgrtwo @wpetry @JWiley @alexpghayes (I don't know if this will work for everyone)
I don't know of any release-blocking, or even release-wishlist, items, but would be happy to hear of any (and would be even happier to have people look over the package and see if there are any issues)