bbolker / broom.mixed

tidy methods for mixed models in R
227 stars 23 forks source link

broom.mixed doesn't seem to tidy brms model - bug? #143

Open pgseye opened 7 months ago

pgseye commented 7 months ago

I think I need to report this here after having difficulty exponentiating model coefficients from a brms model, as discussed:

https://github.com/ddsjoberg/gtsummary/issues/1568

bbolker commented 7 months ago

This looks like a fairly trivial bug. Can you install the devel version from here (remotes::install_github("bbolker/broom.mixed")) and see if that solves the problem? (I still need to write a test and a NEWS entry)

pgseye commented 7 months ago

Thanks Ben.

That seems to work with the small_dat dataset I put together at the first link.

Interestingly, however, when I try this on another brm (ordinal) model it throws an error:

mod8 <- brm(value_fac ~ group + (1|cohort), family = cumulative("logit"), data = dat_long)
> tbl_regression(mod8, exponentiate = T)
✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
bbolker commented 7 months ago

Can I please have a reproducible example? (I'm not really surprised that an ordinal model might cause problems, but it will save me time to have an example to work with ...). Also, can you double-check that the error is directly thrown by broom.mixed::tidy(mod8, exponentiate = TRUE) ?

pgseye commented 7 months ago

Thanks Ben,

Running the code below seems to suggest that broom.mixed isn't the problem?

library(brms)
library(gtsummary)
dat <- structure(list(cohort = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
                                 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 
                                 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 
                                 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 14L, 
                                 14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 17L, 17L, 
                                 17L, 17L), group = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                                1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                                1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                                1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                                1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), levels = c("1", 
                                                                                                                            "2", "3", "4"), class = "factor"), value = c(0.597, 0.232, 0.395, 
                                                                                                                                                                         0.466, 1.132, 1.359, 0.424, NA, 1.683, 5.056, 0.879, 1.526, 3.736, 
                                                                                                                                                                         6.265, 2.64, 4.104, 1.167, 0.797, 0.72, 1.926, 0.655, NA, 0.699, 
                                                                                                                                                                         0.658, 1.437, 1.034, 1.333, 1.036, 2.021, 1.563, 1.812, 1.432, 
                                                                                                                                                                         1.211, 1.259, 2.574, 2.461, 5.062, 3.623, 2.401, 3.053, 2.518, 
                                                                                                                                                                         4.082, 3.686, 3.473, 3.201, NA, 0.682, 0.528, 0.591, 0.245, 0.675, 
                                                                                                                                                                         0.463, 0.266, 0.335, 0.636, 0.73, 0.855, 0.434, 0.095, 0.685, 
                                                                                                                                                                         1.594, 1.359, 1.362, 0.854, NA, 2.756, 2.509, 1.487)), row.names = c(NA, 
                                                                                                                                                                                                                                              -68L), class = c("tbl_df", "tbl", "data.frame"))

dat$value_fac <- factor(dat$value, ordered = T)
mod8 <- brm(value_fac ~ group + (1|cohort), family = cumulative("logit"), data = dat)
broom.mixed::tidy(mod8, exponentiate = TRUE) |> print(n = Inf)
tbl_regression(mod8, exponentiate = T)
bbolker commented 7 months ago

Yes, I think so. Although oddly it does produce a table (in the browser) in addition to the error ...

iaingallagher commented 5 months ago

Hi

I'm also having issues with broom.mixed and brms.

# libs

require(brms)
require(broom.mixed)

# data

df <- structure(list(Author = c("Jones (Male) At VT 2019", "Kilpatrick Below VT 2007", 
"Kilpatrick Above VT 2007", "Hoekstra Below VT 2017", "Niven Below VT 2018", 
"Niven Above VT 2018", "Dierkes Below VT 2021", "Bogdanis Below VT  2021", 
"Bird At VT 2019", "Rose  At VT  2012", "Rose  Above VT  2012", 
"Hargreaves  At VT 2012", "Hargreaves  Above VT 2012", "Zenko  At VT  2019", 
"Decker Below VT 2016", "Prado  Below VT  2021", "Prado  Above VT  2021", 
"Kwan  Below VT 2017"), Affect_greatest_deviation = c(0.13, 1.59, 
0.05, 2.25, 2.3, 0.7, 3.1, 1.6, 2.5, 2.29, 2.73, 2.19, 0.2, 1.79, 
1.12, 2.36, -2, 2.41), SD_GD = c(2.02, 2.22, 2.81, 1.82, 1.1, 
2.5, 0.57, 2.8, 1.25, 1.4, 1.62, 1.97, 2.73, 2.34, 1.37, 1.5, 
1.88, 1.18), Affect_baseline = c(-0.0794444444444444, -0.139444444444444, 
-0.0794444444444444, 0.790555555555556, -0.359444444444444, -0.359444444444444, 
0.210555555555556, 2.24055555555556, 1.87055555555556, 0.0105555555555559, 
0.820555555555555, -1.45944444444444, -0.409444444444444, 0.470555555555556, 
0.730555555555556, -1.45944444444444, -1.88944444444444, -0.909444444444444
), VO2max_ml_kg_min = c(-11.3838888888889, -0.683888888888895, 
-0.683888888888895, -0.0938888888888911, 12.6061111111111, 12.6061111111111, 
-4.19388888888889, 8.90611111111111, 1.47611111111111, -4.49388888888889, 
4.80611111111111, -7.99388888888889, -6.59388888888889, -6.33388888888889, 
-16.5438888888889, 6.00611111111111, 6.00611111111111, 6.58611111111111
), BMI = c(3.21508698777778, -0.144913012222222, -0.144913012222222, 
-1.88812289222222, -0.926642402222221, -0.926642402222221, -1.74491301222222, 
-2.14491301222222, -2.36491301222222, 1.15508698777778, -0.544913012222221, 
1.15508698777778, 1.65508698777778, -0.644913012222222, 9.61508698777778, 
-1.68236157222222, -1.68236157222222, -1.95491301222222), Perc_female = c(11.6433333333333, 
-18.5966666666667, -18.5966666666667, -64.5466666666667, -64.5466666666667, 
-64.5466666666667, 7.95333333333333, -14.5466666666667, -14.5466666666667, 
35.4533333333333, 35.4533333333333, 35.4533333333333, 35.4533333333333, 
-3.49666666666667, 35.4533333333333, 35.4533333333333, 35.4533333333333, 
-4.34666666666666), Age_year = c(4.54333333333334, -6.22666666666667, 
-6.22666666666667, -7.62666666666667, -5.12666666666667, -5.12666666666667, 
-3.12666666666667, -8.42666666666667, -5.95666666666666, 13.7733333333333, 
16.2733333333333, 12.3733333333333, 12.7733333333333, -4.12666666666667, 
9.12333333333333, -5.82666666666666, -5.82666666666666, -5.23666666666666
), Intensity_perc_of_VT = c(0, -15, 5, -20, -15, 5, -10, -20, 
0, 0, 5, 0, 10, 0, -10, -20, 10, -4)), row.names = c(NA, -18L
), class = c("tbl_df", "tbl", "data.frame"))

# priors

priors <- c(prior(normal(2,2), class = Intercept), 
            prior(normal(0,1), class = b), 
            prior(cauchy(0,1), class = sd, lb = 0))

# model
SEED <- 7
md <- brm(Affect_greatest_deviation|se(SD_GD) ~ 1 +
                   Intensity_perc_of_VT +
                   Affect_baseline +
                   Perc_female +
                   BMI +
                   Age_year +
                   VO2max_ml_kg_min +
                   (1|Author),
             data = df, 
             family = gaussian, 
             prior = priors,
             sample_prior = TRUE,
             chains = 4,
             iter = 5000,
             warmup = 500, 
             cores = 4,
             control = list(adapt_delta = 0.9),
             seed = 5,
             save_pars = save_pars(all = TRUE))

# broom.mixed::tidy errors out
tidy(md)

tidy errors out with

Error in x[[2]] : subscript out of bounds

Using broom.mixed_0.2.9.4, brms_2.20.1, Rcpp_1.0.10.

I've tried changing the variable names (dropping the underscores as per warning) but that has made no difference. Oddly the same model but with a nonlinear term for Intensity_perc_of_VT (3 knot restricted cubic spline) tidies fine.

Thanks for any help.

Iain