easystats / modelbased

:chart_with_upwards_trend: Estimate effects, contrasts and means based on statistical models
https://easystats.github.io/modelbased/
GNU General Public License v3.0

estimate_contrasts, p-value and 95% CI do not agree #228

Open dstolz opened 1 year ago

dstolz commented 1 year ago

I fit a linear mixed-effects model using lmer from lme4 with the formula:

Response ~ Age + Gender + tOrder + Condition * Scale + (1|Participant)

(mbc <- modelbased::estimate_contrasts(mdl.LMM, contrast = "Condition", at = "Scale"))

Level1 | Level2 | Scale | Difference |        95% CI |   SE |     df |     t |     p
L1     |     L2 |     A |       0.31 | [-0.02, 0.65] | 0.14 | 221.15 |  2.25 | 0.076
L1     |     L2 |     B |       0.19 | [-0.14, 0.53] | 0.14 | 221.75 |  1.41 | 0.320
L1     |     L2 |     C |       0.37 | [ 0.05, 0.70] | 0.14 | 221.57 |  2.75 | 0.013 
L3     |     L2 |     A |       0.18 | [-0.17, 0.53] | 0.14 | 221.39 |  1.27 | 0.414 
L3     |     L2 |     B |      -0.07 | [-0.43, 0.28] | 0.15 | 221.74 | -0.49 | 0.621 
L3     |     L2 |     C |       0.67 | [ 0.32, 1.01] | 0.14 | 221.56 |  4.66 | < .001
L3     |     L1 |     A |      -0.13 | [-0.47, 0.20] | 0.14 | 221.10 | -0.94 | 0.414 
L3     |     L1 |     B |      -0.27 | [-0.62, 0.08] | 0.15 | 221.75 | -1.84 | 0.200 
L3     |     L1 |     C |       0.29 | [-0.04, 0.63] | 0.14 | 221.50 |  2.10 | 0.037

The last row is what has me confused. I don't see how the p-value can be less than .05 while the 95% confidence interval includes 0; I would expect the 95% CI and the p-value to agree.

The result in the first row has a larger difference and t-value, and a 95% CI closer to (but still including) 0, yet its p-value is > 0.05.

What might explain this apparent discrepancy between the reported p-value and 95% CI?

Or if I am misunderstanding something, could someone kindly explain why this result is correct?

Thank you in advance.

mattansb commented 1 year ago

Can you provide a reproducible example?

dstolz commented 1 year ago

Thank you for taking a look. I have added the dataset that the model was fit on and code that produces the result. Note that the level and scale labels are different in this dataset, but the result is the same.

library(lme4)

data <- read.csv("SSQ_Subscales.csv")

FORMULA <- Response ~ Age + Gender + tOrder + Scale * Condition + (1 | Participant)

# fit the model
mdl <- lmer(FORMULA, data = data, REML = TRUE)

# identify and remove influential data (rows with Cook's distance > 4x the mean)
cooksd <- cooks.distance(mdl)
(influential <- as.numeric(names(cooksd)[cooksd > 4 * mean(cooksd, na.rm = TRUE)]))
data <- subset(data, !row.names(data) %in% influential)

# fit the model again with the influential rows removed
mdl <- lmer(FORMULA, data = data, REML = TRUE)

summary(mdl)

# estimate pairwise contrasts of Condition at each level of Scale
(mbc <- modelbased::estimate_contrasts(mdl, contrast = "Condition", at = "Scale"))
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

other attached packages:
 [1] effects_4.2-2      performance_0.10.3
 [3] interactions_1.1.5 sjPlot_2.8.14     
 [5] car_3.1-2          carData_3.0-5     
 [7] lme4_1.1-31        Matrix_1.5-1      
 [9] lubridate_1.9.2    forcats_1.0.0     
[11] stringr_1.5.0      purrr_1.0.1       
[13] readr_2.1.4        tidyr_1.3.0       
[15] tibble_3.2.1       tidyverse_2.0.0   
[17] GGally_2.1.2       dplyr_1.1.2       
[19] ggbeeswarm_0.7.2   ggridges_0.5.4    
[21] ggplot2_3.4.2      patchwork_1.1.2   

modelbased_0.8.6 

SSQ_Subscales.csv

SchmidtPaul commented 1 year ago

I have not taken a close look at your specific case, and forgive me if I'm wrong, but the issue is most likely that different multiplicity adjustments are applied to the p-values/tests on one hand and to the confidence intervals on the other.

As far as I know, modelbased::estimate_contrasts() is based on {emmeans} and in its documentation it says:

Multiplicity adjustments

Both tests and confidence intervals may be adjusted for simultaneous inference. Such adjustments ensure that the confidence coefficient for a whole set of intervals is at least the specified level, or to control for multiplicity in a whole family of tests. This is done via the adjust argument. For ref_grid() and emmeans() results, the default is adjust = "none". For most contrast() results, adjust is often something else, depending on what type of contrasts are created. For example, pairwise comparisons default to adjust = "tukey", i.e., the Tukey HSD method. The summary() function sometimes changes adjust if it is inappropriate.

For example, with adjust = "Tukey" the adjustment is changed to the Sidak method because the Tukey adjustment is inappropriate unless you are doing pairwise comparisons.

confint(EMM.source, adjust = "tukey")
## Note: adjust = "tukey" was changed to "sidak"
## because "tukey" is only appropriate for one set of pairwise comparisons
##  source emmean       SE df lower.CL upper.CL
##  fish   0.0337 0.000926 23   0.0313   0.0361
##  soy    0.0257 0.000945 23   0.0232   0.0281
##  skim   0.0229 0.000994 23   0.0203   0.0254
## 
## Results are averaged over the levels of: percent 
## Results are given on the inverse (not the response) scale. 
## Confidence level used: 0.95 
## Conf-level adjustment: sidak method for 3 estimates

An adjustment method that is usually appropriate is Bonferroni; however, it can be quite conservative. Using adjust = "mvt" is the closest to being the "exact" all-around single-step method, as it uses the multivariate t distribution (via the mvtnorm package) with the same covariance structure as the estimates to determine the adjustment. However, this comes at high computational expense, as the computations are done using simulation techniques. For a large set of tests (and especially confidence intervals), the computational lag becomes noticeable, if not intolerable.

For tests, adjust increases the P values over those otherwise obtained with adjust = "none".

There is also a relevant answer from Russ Lenth on StackExchange.
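To see the two behaviors side by side, you can ask {emmeans} directly for intervals and tests with a single shared adjustment. A minimal sketch on a built-in dataset (warpbreaks, not the issue's data; the model and factor names here are illustrative only):

```r
library(emmeans)

# toy model with a factor interaction (warpbreaks ships with base R)
m <- lm(breaks ~ wool * tension, data = warpbreaks)

# pairwise contrasts of tension within each level of wool
prs <- pairs(emmeans(m, ~ tension | wool))

# infer = c(TRUE, TRUE) requests both CIs and tests; the single `adjust`
# argument is then applied to both, so interval coverage and p-values agree
summary(prs, infer = c(TRUE, TRUE), adjust = "tukey")
```

With mismatched adjustments (e.g. unadjusted intervals next to Tukey- or Holm-adjusted p-values), a CI that straddles 0 can legitimately sit next to a p-value below .05, which is exactly the pattern in the table above.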

dstolz commented 1 year ago

If I understand correctly, the 95% CIs and p-values are calculated with different multiplicity adjustments. The results from calling estimate_contrasts() with p_adjust = "tukey" are consistent.
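For reference, the comparison looks like this (a sketch on a built-in dataset, not the issue's data; argument names as used elsewhere in this thread):

```r
library(modelbased)

# toy model (warpbreaks ships with base R)
m <- lm(breaks ~ wool * tension, data = warpbreaks)

# default: the p-values are multiplicity-adjusted while the 95% CIs use a
# different (or no) adjustment, so the two can disagree near alpha = .05
estimate_contrasts(m, contrast = "tension", at = "wool")

# matching the adjustment makes the p-values consistent with the intervals
estimate_contrasts(m, contrast = "tension", at = "wool", p_adjust = "tukey")
```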

Thank you for the insight.