rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/
355 stars 31 forks source link

Rope for the summary() table of brms models #370

Closed Dallak closed 2 years ago

Dallak commented 2 years ago

Dear all,

I'm using this great package for calculating the marginal means for a brms model. However, I'm wondering if it possible to calculate also the rope for emmean between (-0.1, 0.1) for all predictors.

Here is an example for what I mean:

# A tibble: 20 x 7
   Predictor             Estimate Est.Error     Q2.5    Q97.5 `CI width` Rope$ROPE_Percentage
   <chr>                    <dbl>     <dbl>    <dbl>    <dbl>      <dbl>                <dbl>
 1 "Intercept"            0.222      0.0350  0.153    0.292       0.139                 0    
 2 "ini"                  0.0153     0.0239 -0.0315   0.0627      0.0943                1    
 3 "med"                  0.00468    0.0273 -0.0492   0.0591      0.108                 1    
 4 "voiced"              -0.971      0.0279 -1.03    -0.916       0.111                 0    
 5 "aa"                  -0.0657     0.0263 -0.118   -0.0139      0.104                 0.929
 6 "ii"                   0.104      0.0283  0.0477   0.160       0.112                 0.435
 7 "bilabial"            -0.0215     0.0323 -0.0860   0.0426      0.129                 1    
 8 "alveolar"             0.00238    0.0269 -0.0501   0.0571      0.107                 1    
 9 "ini × voicd"         -0.0446     0.0235 -0.0914   0.00198     0.0934                1    
10 "med × voiced"         0.0263     0.0264 -0.0259   0.0784      0.104                 1    
11 "ini × aa"            -0.0526     0.0324 -0.117    0.0116      0.128                 0.950
12 "med × aa"             0.0873     0.0370  0.0149   0.160       0.146                 0.648
13 "ini × ii"             0.0386     0.0335 -0.0269   0.105       0.132                 0.989
14 "med × ii"            -0.0448     0.0375 -0.120    0.0287      0.148                 0.955
15 "voiced × aa"          0.0873     0.0264  0.0357   0.140       0.104                 0.700
16 "voiced × ii"         -0.0841     0.0271 -0.138   -0.0307      0.107                 0.742
17 "ini × voiced × aa "  -0.0107     0.0330 -0.0757   0.0547      0.130                 1    
18 "med × voiced × aa"    0.0368     0.0367 -0.0358   0.110       0.146                 0.982
19 "ini × voiced × ii  "  0.0647     0.0333 -0.00160  0.131       0.133                 0.884
20 "med × voiced × ii"   -0.0499     0.0368 -0.123    0.0234      0.147                 0.939

This is a model summary for a brms model where rope (last column) was calculated using rope(model) from the package bayestestR. I would like to use the same thing and report the rope for the following table. Is it currently supported?

summary(r3, point.est = mean)
 poa      emmean lower.HPD upper.HPD
 bilabial  0.201    0.0944     0.301
 alveolar  0.225    0.1374     0.312
 velar     0.241    0.1614     0.321

Thanks in advance!

rvlenth commented 2 years ago

I'm thinking about this. But meanwhile, please note the issue I just posted for bayestestR: https://github.com/easystats/bayestestR/issues/559. I think there are serious problems with using rope on a model, including the example you show.

Also, I question whether ROPE is useful for your example with summary(r3), because it is usually not that interesting to test estimated means against zero. Doing so with summary(pairs(r3)) would seem to make more sense.

Dallak commented 2 years ago

Thanks @rvlenth for this input. All variables are z-scored in my data. The aim of why I'm using it is that I want to test how robust each contrast is. I know the example I provided with summary(r3, point.est = mean) is not idea (my apologies), but in my real analysis I'm using your approach, and I was able to indirectly calculate the ROPE. For my analysis, I guess the ROPE is helpful in quantifying the robustness of each contrast. Many thanks again for your time and the helpful discussion on the other post.

   Contrast   Estimate      Q2.5      Q97.5      Rope
1 fin - ini -0.4130980 -0.685896 -0.1401137 0.0000000
2 fin - med -0.0202940 -0.342101  0.3002460 0.4921786
3 ini - med  0.3934325  0.146396  0.6463180 0.0000000
rvlenth commented 2 years ago

Well, maybe you closed the issue, but I still intend to think about providing equivalence assessment in the summary of a Bayesian model. It probably will not be ROPE exactly. But Ido provide equivalence testing for frequentist models, and it would be good to not leave the Bayesian out.

Russ

Sent from my iPad

On Aug 17, 2022, at 7:06 AM, Abdulrahman Dallak @.***> wrote:



Thanks @rvlenthhttps://github.com/rvlenth for this input. All variables are z-scored in my data. The aim of why I'm using it is that I want to test how robust each contrast is. I know the example I provided with summary(r3, point.est = mean) is not idea, but in my real analysis I'm using your approach, and I was able to indirectly calculate the ROPE. For my analysis, I guess the ROPE is helpful in quantifying the robustness of each contrast. Many thanks again for your time and the helpful discussion on the other post.

Contrast Estimate Q2.5 Q97.5 Rope 1 fin - ini -0.4130980 -0.685896 -0.1401137 0.0000000 2 fin - med -0.0202940 -0.342101 0.3002460 0.4921786 3 ini - med 0.3934325 0.146396 0.6463180 0.0000000

— Reply to this email directly, view it on GitHubhttps://github.com/rvlenth/emmeans/issues/370#issuecomment-1217922068, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AGMJPLYZAM7DYXRZ6BSLFO3VZTIUTANCNFSM56SOKWAQ. You are receiving this because you were mentioned.Message ID: @.***>

rvlenth commented 2 years ago

FWIW, it seems pretty straightforward to compute ROPE areas with emmGrid objects, so long as you specify the range manually. Just use coda::as.mcmc and (ifnecessary) pick one matrix or rbind them. Here's an example, starting with an equivalence-testing example from a vignette:

library(emmeans)

### Vignette example:
### https://cran.r-project.org/web/packages/emmeans/vignettes/confidence-intervals.html
### (Near the end)

# Frequentist model

pigs.lm = lm(log(conc) ~ source + factor(percent), data = pigs)
EMMF = emmeans(pigs.lm, "source")

test(pairs(EMMF), delta = log(1.25), adjust = "none")
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy    -0.273 0.0529 23   0.937  0.8209
##  fish - skim   -0.402 0.0542 23   3.308  0.9985
##  soy - skim    -0.130 0.0530 23  -1.765  0.0454
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Statistics are tests of equivalence with a threshold of 0.22314 
## P values are left-tailed

# Comparable Bayesian model

set.seed(123)
library(rstanarm)
## ... messages deleted

pigs.stan = stan_glm(log(conc) ~ source + factor(percent), data = pigs)
## ... messages deleted

EMMB = emmeans(pigs.stan, "source")

post = coda::as.mcmc(pairs(EMMB))
X = do.call(rbind, post)  # combine into one matrix
apply(X, 2, bayestestR::rope, range = log(1.25) * c(-1, 1))
## $`contrast fish - soy`
## # Proportion of samples inside the ROPE [-0.22, 0.22]:
## 
## inside ROPE
## -----------
## 16.45 %    
## 
## 
## $`contrast fish - skim`
## # Proportion of samples inside the ROPE [-0.22, 0.22]:
## 
## inside ROPE
## -----------
## 0.00 %     
## 
## 
## $`contrast soy - skim`
## # Proportion of samples inside the ROPE [-0.22, 0.22]:
## 
## inside ROPE
## -----------
## 97.42 %

# Straight calculation of posterior areas
apply(X, 2, \(x) mean(abs(x) < log(1.25)))
##  contrast fish - soy contrast fish - skim  contrast soy - skim 
##              0.18125              0.00150              0.95050

Created on 2022-08-17 by the reprex package (v2.0.1)

Those last results are close to what you get from rope(), except for some kind of scaling.

What I am inclined to do is allow the delta argument (that you see in the frequentist example) for Bayesian models, and if non-zero, you get the posterior probability of $Pr(|X| < \delta|$ that we see at the end of the output. That preserves a consistent interface between frequentist and Bayesian models, and gives you something darn close to what you want. Unlike bayestestR::rope(), you will need to specify the range explicitly -- but in my mind that is not a bad thing anyway. If you want the range that rope() provides, specify delta = SD/10 where SD is the SD of the response.

rvlenth commented 2 years ago

This was pretty easy to add. For the example above, I get

> test(pairs(EMMB), adjust = "none", delta=log(1.25))
 contrast    estimate lower.HPD upper.HPD p.equiv
 fish - soy    -0.273    -0.384   -0.1626  0.1812
 fish - skim   -0.403    -0.513   -0.2830  0.0015
 soy - skim    -0.129    -0.240   -0.0134  0.9505

Results are averaged over the levels of: percent 
Point estimate displayed: median 
'p.equiv' based on posterior P(|lin. pred.| < 0.2231) 
Results are given on the log (not the response) scale. 
HPD interval probability: 0.95 

> test(pairs(EMMB), adjust = "none", delta=log(1.25), type = "response")
 contrast    ratio lower.HPD upper.HPD p.equiv
 fish / soy  0.761     0.679     0.847  0.1812
 fish / skim 0.668     0.593     0.747  0.0015
 soy / skim  0.879     0.777     0.977  0.9505

Results are averaged over the levels of: percent 
Point estimate displayed: median 
'p.equiv' based on posterior P(|lin. pred.| < 0.2231) 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
Dallak commented 2 years ago

Many thanks Russell for this thorough answer! It means a lot to me. I tried your approach, and it seems working and producing what I'm looking for. Much obliged for your time and assistance.