ddsjoberg / gtsummary

Presentation-Ready Data Summary and Analytic Result Tables
http://www.danieldsjoberg.com/gtsummary
Other
1.04k stars 114 forks source link

Ordinal brms regression output #935

Closed moleps closed 3 years ago

moleps commented 3 years ago

I´m trying to output coefficients from an ordinal bayesian model fittes using brms to a table, but I suspect something is wrong. I can recreate the error with the generic mtcars dataset (let alone the poor fit etc).

data(mtcars)
mtcars$cyl <- factor(mtcars$cyl,ordered = T)
mtcars$gear <- factor(mtcars$gear,ordered=F)

test <- brm(cyl~gear+hp+wt,
    family = cumulative(link = "logit", threshold = "flexible"),
    data=mtcars)

gtsummary::tbl_regression(test)

I get the same error:

_x Unable to identify the list of variables.

This is usually due to an error calling stats::model.frame(x)or stats::model.matrix(x). It could be the case if that type of model does not implement these methods. Rarely, this error may occur if the model object was created within a functional programming framework (e.g. using lappy(), purrr::map(), etc.)._

I do get a table, but I cannot exponentiate etc.

ddsjoberg commented 3 years ago

hello @moleps ! Thank you for the post!

By default, gtsummary handles these types of models with broom.mixed::tidy(). The first reason you're not getting what you're looking for is because this function is failing. I don't use this type of model. Does the error message below help you identify the issue? Does the model need a group-level effects?

If the error message isn't helpful, perhaps post the example to the broom.mixed GitHub page? What do you think?

library(brms)
mtcars$cyl <- factor(mtcars$cyl, ordered = T)
mtcars$gear <- factor(mtcars$gear, ordered = F)

mod <- 
  brms::brm(
    cyl ~ gear + hp + wt,
    family = brms::cumulative(link = "logit", threshold = "flexible"),
    data = mtcars
  )
class(mod)
#> [1] "brmsfit"

summary(mod)
#>  Family: cumulative 
#>   Links: mu = logit; disc = identity 
#> Formula: cyl ~ gear + hp + wt 
#>    Data: mtcars (Number of observations: 32) 
#> Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup samples = 4000
#> 
#> Population-Level Effects: 
#>              Estimate Est.Error  l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept[1]  5035.98   4905.55     51.39 17982.80 1.96        6       17
#> Intercept[2]  6655.98   6435.57     66.07 23124.23 1.96        6       19
#> gear4         -355.53    339.22  -1163.42     0.12 1.71        6       20
#> gear5        -2835.46   2867.84 -10154.02   -19.10 1.91        6       16
#> hp              49.33     48.04      0.37   174.94 1.96        6       19
#> wt              -1.21    153.02   -464.81   290.74 1.29       77       67
#> 
#> Family Specific Parameters: 
#>      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> disc     1.00      0.00     1.00     1.00 1.00     4000     4000
#> 
#> Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
#> Warning message:
#> Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors. 

broom.mixed::tidy(mod)
#> Error: The model does not contain group-level effects.
moleps commented 3 years ago

Hmm-- strange.

I get:

> broom.mixed::tidy(test)
# A tibble: 6 x 8
  effect component group  term  estimate std.error  conf.low conf.high
  <chr>  <chr>     <list> <chr>    <dbl>     <dbl>     <dbl>     <dbl>
1 fixed  cond      <NULL> NULL    3822.     3350.     61.9   12021.   
2 fixed  cond      <NULL> NULL    5053.     4422.     79.3   15690.   
3 fixed  cond      <NULL> NULL    -268.      259.   -893.       -0.235
4 fixed  cond      <NULL> NULL   -2047.     1859.  -6129.      -23.4  
5 fixed  cond      <NULL> NULL      36.8      32.2     0.465   114.   
6 fixed  cond      <NULL> NULL      23.1     101.   -172.      233.   

All Bayesian models don't incorporate random effects and although the group column is empty, I would have thought the term column should have worked. Is that the column picked up (among many) into gtsummary?

ddsjoberg commented 3 years ago

hey hey @larmarange , can you take a look at this example too? what do you get from broom.mixed::tidy()?

moleps commented 3 years ago

Indeed baffling that you dont get the same output. Nonetheless, I think the error message hints at the culprit, but I cannot identify the exact problem.

> broom.mixed::tidy(test)
# A tibble: 6 x 8
  effect component group  term  estimate std.error  conf.low conf.high
  <chr>  <chr>     <list> <chr>    <dbl>     <dbl>     <dbl>     <dbl>
1 fixed  cond      <NULL> NULL    3822.     3350.     61.9   12021.   
2 fixed  cond      <NULL> NULL    5053.     4422.     79.3   15690.   
3 fixed  cond      <NULL> NULL    -268.      259.   -893.       -0.235
4 fixed  cond      <NULL> NULL   -2047.     1859.  -6129.      -23.4  
5 fixed  cond      <NULL> NULL      36.8      32.2     0.465   114.   
6 fixed  cond      <NULL> NULL      23.1     101.   -172.      233.   
Warning message:
In stri_replace_first_regex(string, pattern, fix_replacement(replacement),  :
  argument is not an atomic vector; coercing
moleps commented 3 years ago

I created another model with a random intercept and broom.mixed will now pick up the terms:

test2 <- brm(cyl~(1|gear)+hp+wt,
            family = cumulative(link = "logit", threshold = "flexible"),
            data=mtcars)
> broom.mixed::tidy(test2)
# A tibble: 5 x 8
  effect   component group term            estimate std.error conf.low conf.high
  <chr>    <chr>     <chr> <chr>              <dbl>     <dbl>    <dbl>     <dbl>
1 fixed    cond      NA    (Intercept)[1]    38.9     17.8     14.5       83.8  
2 fixed    cond      NA    (Intercept)[2]    48.7     21.9     19.3      103.   
3 fixed    cond      NA    hp                 0.184    0.0983   0.0583     0.443
4 fixed    cond      NA    wt                 6.22     3.87    -0.0518    15.2  
5 ran_pars cond      gear  sd__(Intercept)    2.76     2.45     0.101      9.41 

Thus it seems that broom.mixed prefers random effects to be incorporated into the model. But all bayesian models are not random effect models and it would be preferable to also include just fixed effect models. However, when I then run tbl_regression I´m still left with an error msg:

> tbl_regression(test2)
x Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Believing it would be strange to not cater for fixed effect only bayesian models, I created two more linear models with and without random intercepts. Both these work fine, both in broom.mixed::tidy and tbl_regression providing the intended output without error messages. Thus, I suspect there is something funky with the ordinal models?

test3 <- brm(mpg~gear+hp+disp+cyl,
             data=mtcars)

test4 <- brm(mpg~gear+disp+hp+(1|cyl),
             data=mtcars)

broom.mixed::tidy(test3)
broom.mixed::tidy(test4)

tbl_regression(test3)
tbl_regression(test4) 
larmarange commented 3 years ago

@ddsjoberg and @moleps

I have done quick tests on my side.

With the first model, I got this result:

> mod %>% broom.mixed::tidy()
# A tibble: 6 x 8
  effect component group  term  estimate std.error  conf.low conf.high
  <chr>  <chr>     <list> <chr>    <dbl>     <dbl>     <dbl>     <dbl>
1 fixed  cond      <NULL> NULL    6845.     7831.     135.    29652.  
2 fixed  cond      <NULL> NULL    9080.    10318.     177.    38843.  
3 fixed  cond      <NULL> NULL    -492.      577.   -2180.       -3.91
4 fixed  cond      <NULL> NULL   -3875.     4487.  -15440.      -55.8 
5 fixed  cond      <NULL> NULL      68.3      77.4      1.08    273.  
6 fixed  cond      <NULL> NULL     -46.6     375.   -1076.      949.  
Message d'avis :
Dans stri_replace_first_regex(string, pattern, fix_replacement(replacement),  :
  argument is not an atomic vector; coercing

The term column contains only NULL values. Clearly the problem relies here in broom.mixed. Alternatively, we could inform the brms package team as the best could be that they implement their own tieder.

With the second model, it seems to work:

> test2 %>% broom.mixed::tidy()
# A tibble: 5 x 8
  effect   component group term            estimate std.error conf.low conf.high
  <chr>    <chr>     <chr> <chr>              <dbl>     <dbl>    <dbl>     <dbl>
1 fixed    cond      NA    (Intercept)[1]    41.0      21.2    15.0       91.0  
2 fixed    cond      NA    (Intercept)[2]    51.4      26.3    19.7      115.   
3 fixed    cond      NA    hp                 0.195     0.120   0.0584     0.493
4 fixed    cond      NA    wt                 6.58      4.41   -0.115     16.2  
5 ran_pars cond      gear  sd__(Intercept)    2.72      2.43    0.0988     9.73 

Now regarding this error message:

> tbl_regression(test2)
x Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

broom.helpers::tody_plus_plus() (used internally by tbl_regression()) is unable to identify to which variable a term is attached due to the fact that model.matrix() method is not implemented in brms for brmsfit models. This method is crucial to identify how original data of the model (obtained through model.frame()) are transformed into terms. In particular, we use the assign attributes that indicates the relation between a term and a variable.

Due to that, a lot of enhancements of broom.helpers cannot be done with brmsfit models. However, it is not preventing tidy_plus_plus() or tbl_regression() to work and produce a result.

ddsjoberg commented 3 years ago

Thank you @larmarange !

@moleps I think the next steps for full support of the ordinal models, would be to request model.matrix() support in brms.

I am going to go ahead and close the issue for now. But if you do move forward with support request, please keep us updated.

Thanks!