Closed ddsjoberg closed 3 years ago
Dear @ddsjoberg
I guess you are referring to the discussion in #90 regarding terms disambiguation. Your suggestion to change "variable" but not "term" seems strange for me.
The thing is that with broom.mixed
all terms are not unique, this is why the authors of broom.mixed
suggests in a vignette to disambiguate terms, cf. https://cran.r-project.org/web/packages/broom.mixed/vignettes/broom_mixed_intro.html
For me, disambiguation should be done directly in the term
column, to restore unicity of that col, and this is why I was suggesting to propose a new function tidy_disambiguate_terms()
transforming "sd(intercept)" into "grade.sd(intercept)" (with a option to choose the separator and with the creation of a new column "original_term" to identify when disambiguation was performed).
At broom.helpers
level, we do not know how the tibble will be used, if results will be grouped by "variable_label" or simply listed by "term".
In tidy_plus_plus()
the disambiguation would of course be optional.
Such disambiguation should not be a problem for gtsummaty::tbl_regression()
.
In broom.mixed vignette:
the categorical or continuous predictor variables that control the expected value (i.e., enter into the linear predictor for some part of the model) are called terms (term column in tidy() output); note that unlike in base broom, the term column may have duplicated values, because the same term may enter multiple model components (e.g. zero-inflated and conditional models; models for more than one response; fixed effects and random effects)
(...)
tidy(fitted_model) %>% tidyr::unite(term, group, term) will create a new term column that’s the combination of the group and term columns (which will disambiguate random-effect terms from different grouping variables); unite(term, component, term) will disambiguate conditional and zero-inflation parameters. The code below shows a slightly more complicated (but prettier) approach. (Some sort of disambiguate_terms() function could be added in a future version of the package …)
Gotcha! Yes, that is the original issue I was referring to!
I had read the word disambiguate differently than the authors of the vignette intended. I read it as separating a single column into multiple columns rather than the opposite! From there, I wasn't reading the rest of our convo correctly....sorry about that! 👀
I was suggesting to change variable rather than the term initially because I was thinking the term and group columns should remain exactly as they come from the broom.mixed tidier function. But I think disambiguating the term column makes a lot of sense, and I am for it!
OK will prepare a PR tomorrow or the day after with a new function tidy_disambiguate_terms()
.
Hi @larmarange !
Sorry to re-visit this topic so soon. I think I didn't communicate very well last time. If possible, I think the variable name should uniquely identify the terms. To do this, we would need to combine both the
"term"
and the"group"
columns. Below are two suggestions. What do you think?Created on 2021-02-16 by the reprex package (v1.0.0)