bbolker / broom.mixed

tidy methods for mixed models in R
227 stars 23 forks source link

tidy() ignores monotonic effects in brms models #45

Open strengejacke opened 5 years ago

strengejacke commented 5 years ago

Paul has published a paper, which describes how to model monotonic effects in brms. These terms (indicated in the formula with mo()) are not shown in the output of tidy() (using broom.mixed 0.2.3). The data resp. the model does not make much sense with the income variable, this is just for demonstration purposes.

library(broom.mixed)
library(brms)

zinb <- read.csv("http://stats.idre.ucla.edu/stat/data/fish.csv")

income_options <- c("below_20", "20_to_40", "40_to_100", "greater_100")
income <- factor(sample(income_options, 250, TRUE), levels = income_options, ordered = TRUE)
zinb$income <- income

m <- brm(
  bf(count ~ child + camper + mo(income) + (1 | persons), 
     zi ~ child + camper + (1 | persons)),
  data = zinb,
  family = zero_inflated_poisson(),
  cores = 4,
  iter = 500,
  chains = 1
)

broom.mixed::tidy(m)
#> # A tibble: 8 x 8
#>   effect  component group  term      estimate std.error conf.low conf.high
#>   <chr>   <chr>     <chr>  <chr>        <dbl>     <dbl>    <dbl>     <dbl>
#> 1 fixed   cond      <NA>   (Interce~    0.998    1.15     -1.63     3.31  
#> 2 fixed   zi        <NA>   (Interce~   -0.787    0.832    -2.72     0.926 
#> 3 fixed   cond      <NA>   child       -1.13     0.0894   -1.31    -0.951 
#> 4 fixed   cond      <NA>   camper       0.637    0.0944    0.450    0.807 
#> 5 fixed   zi        <NA>   child        1.89     0.320     1.28     2.51  
#> 6 fixed   zi        <NA>   camper      -0.857    0.389    -1.58    -0.0490
#> 7 ran_pa~ cond      perso~ sd__(Int~    1.86     1.07      0.684    4.44  
#> 8 ran_pa~ zi        perso~ sd__(Int~    1.77     1.41      0.480    6.05

summary(m)
#>  Family: zero_inflated_poisson 
#>   Links: mu = log; zi = logit 
#> Formula: count ~ child + camper + mo(income) + (1 | persons) 
#>          zi ~ child + camper + (1 | persons)
#>    Data: zinb (Number of observations: 250) 
#> Samples: 1 chains, each with iter = 500; warmup = 250; thin = 1;
#>          total post-warmup samples = 250
#> 
#> Group-Level Effects: 
#> ~persons (Number of levels: 4) 
#>                  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
#> sd(Intercept)        1.86      1.07     0.68     4.44         83 1.02
#> sd(zi_Intercept)     1.77      1.41     0.48     6.05         97 1.00
#> 
#> Population-Level Effects: 
#>              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
#> Intercept        1.00      1.15    -1.63     3.31         79 1.00
#> zi_Intercept    -0.79      0.83    -2.72     0.93        163 1.00
#> child           -1.13      0.09    -1.31    -0.95        209 1.00
#> camper           0.64      0.09     0.45     0.81        318 1.00
#> zi_child         1.89      0.32     1.28     2.51        314 1.00
#> zi_camper       -0.86      0.39    -1.58    -0.05        234 1.00
#> moincome         0.83      0.10     0.63     1.01        151 1.00
#> 
#> Simplex Parameters: 
#>              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
#> moincome1[1]     0.17      0.09     0.03     0.35         74 1.00
#> moincome1[2]     0.06      0.05     0.00     0.18        307 1.00
#> moincome1[3]     0.77      0.09     0.60     0.92        103 1.00
#> 
#> Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
#> is a crude measure of effective sample size, and Rhat is the potential 
#> scale reduction factor on split chains (at convergence, Rhat = 1).

Created on 2018-11-04 by the reprex package (v0.2.1)

strengejacke commented 5 years ago

A less complex example, taken from ?brms::mo:

income_options <- c("below_20", "20_to_40", "40_to_100", "greater_100")
income <- factor(sample(income_options, 100, TRUE), 
                 levels = income_options, ordered = TRUE)
mean_ls <- c(30, 60, 70, 75)
ls <- mean_ls[income] + rnorm(100, sd = 7)
dat <- data.frame(income, ls)

# fit a simple monotonic model
fit1 <- brm(ls ~ mo(income), data = dat, iter = 500, chains = 1)

broom.mixed::tidy(fit1)
paul-buerkner commented 5 years ago

See also https://github.com/paul-buerkner/brms/issues/543 for a brms issue I just opened.

strengejacke commented 5 years ago

And simo_mo is the prefix for the simplex parameters, right?

paul-buerkner commented 5 years ago

simo_ is the prefix for simplexes of monotonic effects

Am So., 4. Nov. 2018, 14:41 hat Daniel notifications@github.com geschrieben:

And simo_mo is the prefix for the simplex parameters, right?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/bbolker/broom.mixed/issues/45#issuecomment-435670689, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtAD3KUQdhOSpBac6QQKwee9QTVdwSks5uru6igaJpZM4YNQTL .

strengejacke commented 5 years ago

This is a bit offtopic, but as far as I understand, for non-gaussian models, I would not transform (e.g. exponentiate) the simplex parameters, as these indicate proportions of the range, which is indicated by the coefficient of the simplex term. But what would you recommend regarding this coefficient?

I've implemented support for monotonic effects in sjPlot::tab_model(), see example for the two models I posted above:

tab_model(m, fit3)

monotonic

Left is the poisson-model, right the linear model. "income" is exponentiated in the left model.

paul-buerkner commented 5 years ago

I think we should discuss this somewhere else. Perhaps on https://discourse.mc-stan.org/?

strengejacke commented 5 years ago

Yes, will do. I thought it might be semi related because some tidy methods allow to exponentiate the estimates, but it'll be too much off topic.