As an exercise, it would be great to code 1 or 2 examples from first principles and compare them to output from emmeans.
@wlandau This package may be helpful to see how margins are calculated in brms.
Looks like brmsmargins puts a lot of effort into the correct handling of random effects. Fortunately, we only have fixed effects. But the first section on "Integrating out Random Effects" does have clues about how to treat fixed effects by using the model matrix.
Using a simple MMRM with no covariates, it is straightforward to use the model matrix to derive draws of marginal means directly. Agreement with emmeans is spot on in this case.
suppressPackageStartupMessages({
library(brms)
library(coda)
library(dplyr)
library(posterior)
library(tibble)
library(tidyr)
})
custom_marginal_draws <- function(model, formula, data) {
# Get the draws of the fixed effects.
draws <- model %>%
as_draws_df() %>%
as_tibble() %>%
select(starts_with("b_"), -starts_with("b_sigma"))
# Derive the model matrix (with the correct row order).
stan_data <- make_standata(formula = formula, data = data)
brms_permutation <- match(x = stan_data$Y, table = data$response)
undo_brms_permutation <- match(x = data$response, table = stan_data$Y)
stopifnot(all(stan_data$Y[undo_brms_permutation] == data$response))
model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])
# Take the marginal means of the model matrix (by column).
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(group, time) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
reference <- select(grid, -starts_with("x_"))
transformation <- grid %>%
select(starts_with("x_")) %>%
as.matrix() %>%
t()
colnames(transformation) <- paste0(grid$group, "_", grid$time)
# Convert draws of fixed effects to draws of marginal means.
out <- as.matrix(draws) %*% transformation
colnames(out) <- paste0(grid$group, "_", grid$time)
out
}
data <- tibble(
group = rep(c("treatment", "placebo"), each = 100L),
patient = paste0("patient_", seq_len(200L))
) %>%
expand_grid(time = paste0("time_", seq_len(4L))) %>%
mutate(response = rnorm(nrow(.)))
formula <- brmsformula(
formula = response ~ 0 + group*time + unstr(time = time, gr = patient),
sigma ~ 0 + time
)
model <- brm(formula = formula, data = data, refresh = 0, chains = 1)
emmeans::emm_options(sep = "_")
# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
object = model,
specs = ~group:time,
wt.nuis = "proportional",
nuisance = "patient"
) %>%
as.mcmc(fixed = TRUE, names = FALSE) %>%
as_draws_df() %>%
as.data.frame() %>%
as_tibble() %>%
select(-starts_with("."))
# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
model = model,
formula = formula,
data = data
)[, colnames(draws_emmeans)]
# Discrepancy between the two approaches:
max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans)))
#> [1] 0
Above, I derived a transformation matrix to control the mapping between draws of fixed effects (rows) and draws of the marginal means we want (columns).
> transformation
placebo_time_1 placebo_time_2 placebo_time_3 placebo_time_4 treatment_time_1 treatment_time_2 treatment_time_3 treatment_time_4
x_groupplacebo 1 1 1 1 0 0 0 0
x_grouptreatment 0 0 0 0 1 1 1 1
x_timetime_2 0 1 0 0 0 1 0 0
x_timetime_3 0 0 1 0 0 0 1 0
x_timetime_4 0 0 0 1 0 0 0 1
x_grouptreatment:timetime_2 0 0 0 0 0 1 0 0
x_grouptreatment:timetime_3 0 0 0 0 0 0 1 0
x_grouptreatment:timetime_4 0 0 0 0 0 0 0 1
I will try and see if this same approach works with covariates.
Starting with the FEV dataset and filtering out missing observations, I get the same result when there are no extra covariates:
suppressPackageStartupMessages({
library(brms)
library(brms.mmrm)
library(coda)
library(dplyr)
library(posterior)
library(tibble)
library(tidyr)
})
custom_marginal_draws <- function(model, formula, data) {
# Get the draws of the fixed effects.
draws <- model %>%
as_draws_df() %>%
as_tibble() %>%
select(starts_with("b_"), -starts_with("b_sigma"))
# Derive the model matrix (with the correct row order).
stan_data <- make_standata(formula = formula, data = data)
brms_permutation <- match(x = stan_data$Y, table = data$change)
undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])
# Take the marginal means of the model matrix (by column).
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
reference <- select(grid, -starts_with("x_"))
transformation <- grid %>%
select(starts_with("x_")) %>%
as.matrix() %>%
t()
colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)
# Convert draws of fixed effects to draws of marginal means.
out <- as.matrix(draws) %*% transformation
colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
out
}
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
# formula <- brmsformula(
# formula = change ~ FEV1_BL + RACE + SEX + WEIGHT + ARMCD*AVISIT +
# unstr(time = AVISIT, gr = USUBJID),
# sigma ~ 0 + AVISIT
# )
formula <- brmsformula(
formula = change ~ ARMCD*AVISIT +
unstr(time = AVISIT, gr = USUBJID),
sigma ~ 0 + AVISIT
)
model <- brm(formula = formula, data = data, refresh = 0, chains = 1)
emmeans::emm_options(sep = "_")
# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
wt.nuis = "proportional",
nuisance = c("USUBJID")#, "FEV1_BL", "RACE", "SEX", "WEIGHT")
) %>%
as.mcmc(fixed = TRUE, names = FALSE) %>%
as_draws_df() %>%
as.data.frame() %>%
as_tibble() %>%
select(-starts_with("."))
# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
model = model,
formula = formula,
data = data
)[, colnames(draws_emmeans)]
# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 0
Disagreements start to happen when I add a simple continuous fixed effect like FEV1_BL. I am not sure why.
suppressPackageStartupMessages({
library(brms)
library(coda)
library(dplyr)
library(posterior)
library(tibble)
library(tidyr)
})
custom_marginal_draws <- function(model, formula, data) {
# Get the draws of the fixed effects.
draws <- model %>%
as_draws_df() %>%
as_tibble() %>%
select(starts_with("b_"), -starts_with("b_sigma"))
# Derive the model matrix (with the correct row order).
stan_data <- make_standata(formula = formula, data = data)
brms_permutation <- match(x = stan_data$Y, table = data$change)
undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])
# Take the marginal means of the model matrix (by column).
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
reference <- select(grid, -starts_with("x_"))
transformation <- grid %>%
select(starts_with("x_")) %>%
as.matrix() %>%
t()
colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)
# Convert draws of fixed effects to draws of marginal means.
out <- as.matrix(draws) %*% transformation
colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
out
}
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
# formula <- brmsformula(
# formula = change ~ FEV1_BL + RACE + SEX + WEIGHT + ARMCD*AVISIT +
# unstr(time = AVISIT, gr = USUBJID),
# sigma ~ 0 + AVISIT
# )
formula <- brmsformula(
formula = change ~ FEV1_BL + ARMCD*AVISIT +
unstr(time = AVISIT, gr = USUBJID),
sigma ~ 0 + AVISIT
)
model <- brm(formula = formula, data = data, refresh = 0, chains = 1)
emmeans::emm_options(sep = "_")
# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
wt.nuis = "proportional",
nuisance = c("USUBJID", "FEV1_BL")#, "RACE", "SEX", "WEIGHT")
) %>%
as.mcmc(fixed = TRUE, names = FALSE) %>%
as_draws_df() %>%
as.data.frame() %>%
as_tibble() %>%
select(-starts_with("."))
# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
model = model,
formula = formula,
data = data
)[, colnames(draws_emmeans)]
# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 1.262745
This could be a case of average marginal effects (AME) vs. marginal effects at the mean (MEM). With continuous right-hand-side variables this is more complicated (and even more so with interactions) than with discrete ones.
With the settings above, emmeans apparently sets each continuous nuisance covariate equal to its grand mean when computing marginal means. Indeed, if I have a model with non-interacting continuous covariates:
formula <- brmsformula(
formula = change ~ FEV1_BL + WEIGHT + ARMCD*AVISIT +
unstr(time = AVISIT, gr = USUBJID),
sigma ~ 0 + AVISIT
)
and custom_marginal_draws() puts the grand means in the model matrix:
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
then I see a discrepancy of exactly 0.
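For clarity, here is roughly where those substitutions sit inside custom_marginal_draws(), right after the model matrix is built (a sketch; the full pattern with a categorical covariate appears in the next reprex):
# Inside custom_marginal_draws(), immediately after deriving the model matrix:
model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])
# Replace each continuous nuisance covariate with its grand mean.
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
# ...then continue with the column-wise averaging by ARMCD and AVISIT as before.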
Your point about interactions, especially interactions between continuous and discrete variables, is a good one. This gets complicated very quickly.
In a much simpler case with a non-interacting factor with two levels, I am having trouble replicating what emmeans does. https://cran.r-project.org/web/packages/emmeans/vignettes/messy-data.html#nuisance claims that nuisance factors are averaged out, but I am not sure what kind of average is actually used. With a straightforward mean of each column of the model matrix, I get a strong discrepancy. Same for the median.
suppressPackageStartupMessages({
library(brms)
library(coda)
library(dplyr)
library(posterior)
library(tibble)
library(tidyr)
})
custom_marginal_draws <- function(model, formula, data) {
# Get the draws of the fixed effects.
draws <- model %>%
as_draws_df() %>%
as_tibble() %>%
select(starts_with("b_"), -starts_with("b_sigma"))
# Derive the model matrix (with the correct row order).
stan_data <- make_standata(formula = formula, data = data)
brms_permutation <- match(x = stan_data$Y, table = data$change)
undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])
# Average over the SEX covariate (nuisance variable).
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])
# Take the marginal means of the model matrix (by column).
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
reference <- select(grid, -starts_with("x_"))
transformation <- grid %>%
select(starts_with("x_")) %>%
as.matrix() %>%
t()
colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)
# Convert draws of fixed effects to draws of marginal means.
out <- as.matrix(draws) %*% transformation
colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
out
}
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
formula <- brmsformula(
formula = change ~ SEX + ARMCD*AVISIT +
unstr(time = AVISIT, gr = USUBJID),
sigma ~ 0 + AVISIT
)
model <- brm(formula = formula, data = data, refresh = 0, chains = 1)
# Automatic from the brms/emmeans integration:
emmeans::emm_options(sep = "_")
draws_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
wt.nuis = "proportional",
nuisance = c("USUBJID", "SEX")
) %>%
as.mcmc(fixed = TRUE, names = FALSE) %>%
as_draws_df() %>%
as.data.frame() %>%
as_tibble() %>%
select(-starts_with("."))
# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
model = model,
formula = formula,
data = data
)[, colnames(draws_emmeans)]
# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 12.48768
I wonder if this is a problem with the brms/emmeans integration. When I just use a simple frequentist model, the discrepancy goes away.
library(emmeans)
library(tidyverse)
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
formula <- change ~ SEX + ARMCD*AVISIT
model <- lm(formula, data = data)
# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
wt.nuis = "proportional",
nuisance = c("USUBJID", "SEX")
) %>%
as.data.frame() %>%
as_tibble() %>%
mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
select(marginal, emmean)
# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"]) # nuisance variable
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
select(starts_with("x_")) %>%
as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)
print(mapping)
#> (Intercept) SEXFemale ARMCDTRT AVISITVIS2 AVISITVIS3 AVISITVIS4
#> PBO_VIS1 1 0.5363128 0 0 0 0
#> PBO_VIS2 1 0.5363128 0 1 0 0
#> PBO_VIS3 1 0.5363128 0 0 1 0
#> PBO_VIS4 1 0.5363128 0 0 0 1
#> TRT_VIS1 1 0.5363128 1 0 0 0
#> TRT_VIS2 1 0.5363128 1 1 0 0
#> TRT_VIS3 1 0.5363128 1 0 1 0
#> TRT_VIS4 1 0.5363128 1 0 0 1
#> ARMCDTRT:AVISITVIS2 ARMCDTRT:AVISITVIS3 ARMCDTRT:AVISITVIS4
#> PBO_VIS1 0 0 0
#> PBO_VIS2 0 0 0
#> PBO_VIS3 0 0 0
#> PBO_VIS4 0 0 0
#> TRT_VIS1 0 0 0
#> TRT_VIS2 1 0 0
#> TRT_VIS3 0 1 0
#> TRT_VIS4 0 0 1
# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
marginal = rownames(custom),
custom = as.numeric(custom)
)
# Comparison
comparison <- dplyr::left_join(
marginals_emmeans,
marginals_custom,
by = "marginal"
) %>%
mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#> marginal emmean custom difference
#> <chr> <dbl> <dbl> <dbl>
#> 1 PBO_VIS1 -8.12 -8.12 0
#> 2 TRT_VIS1 -2.06 -2.06 0
#> 3 PBO_VIS2 -3.33 -3.33 0
#> 4 TRT_VIS2 1.73 1.73 0
#> 5 PBO_VIS3 2.45 2.45 0
#> 6 TRT_VIS3 5.32 5.32 0
#> 7 PBO_VIS4 8.20 8.20 0
#> 8 TRT_VIS4 13.0 13.0 0
Created on 2024-03-19 with reprex v2.1.0
The discrepancy seems like a brms issue, c.f. https://github.com/paul-buerkner/brms/issues/1630.
One path forward would be to replicate frequentist emmeans ourselves, once we understand how emmeans handles interactions in both the discrete and the continuous case.
We might also consider just not copying the idea of "nuisance" parameters at all. From https://cran.r-project.org/web/packages/emmeans/vignettes/messy-data.html#nuisance, nuisance parameters are a convenience/efficiency issue, not an accuracy issue. If I can do more experiments in the frequentist case with models using interactions, and show how to replicate everything using custom code, I think it will be clear how to adapt this approach for our posterior samples.
The section titled "Emphasis on experimental data" is useful:
To start off with, we should emphasize that the underpinnings of estimated marginal means – and much of what the emmeans package offers – relate more to experimental data than to observational data. In observational data, we sample from some population, and the goal of statistical analysis is to characterize that population in some way. In contrast, with experimental data, the experimenter controls the environment under which test runs are conducted, and in which responses are observed and recorded. Thus with experimentation, the population is an abstract entity consisting of potential outcomes of test runs made under conditions we enforce, rather than a physical entity that we observe without changing it.
We say this because the default behavior of the emmeans() function is to average groups together with equal weights; this is common in analysis of experiments, but not common in analysis of observational data;
So from what I gather:
- emmeans(weights = "equal") gives each level of each factor equal representation.
- emmeans(weights = "cells") takes into account how frequently each factor level occurs in the rows of the data.
Excluding interactions, I can reproduce both cases above. The minor discrepancies below are small enough to be rounding errors.
library(emmeans)
library(tidyverse)
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)
# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
weights = "equal"
) %>%
as.data.frame() %>%
as_tibble() %>%
mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
select(marginal, emmean)
# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
select(starts_with("x_")) %>%
as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)
# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
marginal = rownames(custom),
custom = as.numeric(custom)
)
# Comparison
comparison <- dplyr::left_join(
marginals_emmeans,
marginals_custom,
by = "marginal"
) %>%
mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#> marginal emmean custom difference
#> <chr> <dbl> <dbl> <dbl>
#> 1 PBO_VIS1 -6.98 -6.98 -5.33e-15
#> 2 TRT_VIS1 -3.11 -3.11 -5.33e-15
#> 3 PBO_VIS2 -2.22 -2.22 -5.33e-15
#> 4 TRT_VIS2 1.84 1.84 -5.33e-15
#> 5 PBO_VIS3 3.50 3.50 -5.33e-15
#> 6 TRT_VIS3 6.23 6.23 -5.33e-15
#> 7 PBO_VIS4 8.26 8.26 -5.33e-15
#> 8 TRT_VIS4 12.3 12.3 -5.33e-15
Created on 2024-03-19 with reprex v2.1.0
library(emmeans)
library(tidyverse)
# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)
# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
weights = "cells"
) %>%
as.data.frame() %>%
as_tibble() %>%
mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
select(marginal, emmean)
# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
# model_matrix[, "SEXFemale"] <- 0.5
# model_matrix[, "RACEBlack or African American"] <- 1 / 3
# model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
group_by(ARMCD, AVISIT) %>%
summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
select(starts_with("x_")) %>%
as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)
# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
marginal = rownames(custom),
custom = as.numeric(custom)
)
# Comparison
comparison <- dplyr::left_join(
marginals_emmeans,
marginals_custom,
by = "marginal"
) %>%
mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#> marginal emmean custom difference
#> <chr> <dbl> <dbl> <dbl>
#> 1 PBO_VIS1 -7.79 -7.79 1.78e-15
#> 2 TRT_VIS1 -3.20 -3.20 2.66e-15
#> 3 PBO_VIS2 -2.86 -2.86 1.78e-15
#> 4 TRT_VIS2 1.73 1.73 1.78e-15
#> 5 PBO_VIS3 2.89 2.89 2.66e-15
#> 6 TRT_VIS3 6.37 6.37 1.78e-15
#> 7 PBO_VIS4 7.67 7.67 8.88e-15
#> 8 TRT_VIS4 12.4 12.4 1.78e-15
Created on 2024-03-19 with reprex v2.1.0
Even in an experimental setting, I do not think it is a good idea to give equal weights to all the levels of a factor.
For example: in an Alzheimer's study, it is common to regress on baseline disease severity. Those levels might be "mild" and "severe". Suppose the experiment has mostly "mild" patients and only a small number of "severe" ones. In the equal weighting scheme, there will be a marginal mean for "mild" and another marginal mean for "severe", and these means will be straightforwardly averaged together. But if there are far more "mild" patients than "severe" ones, this scheme risks misrepresenting the composition of the overall patient population. In particular, if the drug is only effective in the mild subpopulation, then equal weighting will artificially dampen the treatment effect.
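To make the concern concrete, a toy calculation with made-up numbers (purely hypothetical):
# Hypothetical example: 90 "mild" patients with a true treatment effect of 5,
# 10 "severe" patients with a true treatment effect of 0.
n_mild <- 90; effect_mild <- 5
n_severe <- 10; effect_severe <- 0
# Equal weighting averages the two subgroup effects 50/50:
(effect_mild + effect_severe) / 2
#> [1] 2.5
# Proportional weighting uses the observed subgroup frequencies:
(n_mild * effect_mild + n_severe * effect_severe) / (n_mild + n_severe)
#> [1] 4.5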
Another example: we often regress on the choice of investigative site in multi-center studies. The site shouldn't matter for inference; we just want to explain variability in the model. If one or two sites have the most patients and most sites have very few, what would make the most sense to do? It seems more intuitive to me to let the number of patients in each site determine how the sites are weighted, as opposed to averaging all sites equally.
I think all this would be worth discussing in our group meeting tomorrow.
https://stats.stackexchange.com/questions/510862/is-least-squares-means-lsmeans-statistical-nonsense/510923#510923 is edifying, and it helps me understand more about why estimated marginal means with equal weights are used, as well as their limitations (e.g. interactions).
From that post:
Is this useful? Definitely yes, when the model is additive and/or the interaction effects are small. They mimic the types of analyses proposed for balanced experiments by the giants among our forebears. If, on the other hand, we do have sizeable interaction effects, then (as our forebears recommend) we simply should not even consider estimating marginal means. (In a multi-factor experiment, it may still be reasonable to construct some marginal means where we are averaging only over negligible interaction effects).
brms.mmrm has interaction terms which may be highly significant, both in comparisons of interest and in blocking variables.
For some Bayesian models, I have implemented "cell weighting" for all variables. The idea is to first generate patient-level predictions y*_ijk from the model (arm i, time j, patient k), then calculate an estimated mean m_ij = (1 / K_i) * sum_{k = 1}^{K_i} y*_ijk. I think this approach may be what we want for uninteresting covariates like WEIGHT and RACE which were not controlled by the study design. But we should give more thought to the weighting scheme for the levels of treatment group and time point. Equal weighting for treatment and time sounds nice until we remember that we may have interactions.
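A rough sketch of this fitted-value averaging, assuming a brms fit model and analysis data data with columns ARMCD and AVISIT (names borrowed from the FEV1 reprexes; not brms.mmrm code):
library(brms)
# Posterior draws of the expected response for every observed patient-visit:
# one row per MCMC draw, one column per row of `data`.
epred <- posterior_epred(model, newdata = data, re_formula = NA)
# Average the patient-level predictions within each arm-by-visit cell to get
# posterior draws of the cell means m_ij.
cells <- split(seq_len(nrow(data)), paste(data$ARMCD, data$AVISIT, sep = "_"))
draws_cell_means <- sapply(cells, function(rows) rowMeans(epred[, rows, drop = FALSE]))
head(draws_cell_means)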
Equal weighting for treatment and time sounds nice until we remember that we may have interactions.
In the simple cases of both formula <- change ~ ARMCD + AVISIT and formula <- change ~ ARMCD*AVISIT, I get the exact same perfect agreement whether I use weights = "equal" or weights = "cells". In fact, weights = "invalid" appears to work too in these simplest cases because weights is not actually used. That makes me think cell-weighting could work for all variables.
With just treatment and time, I am getting perfect agreement even when the data is severely unbalanced:
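The check is essentially the following, a condensed sketch rather than the full collapsed reprex (and comparing against the raw cell means instead of the full custom pipeline): with only treatment and time in a saturated model, the marginal means reduce to the fitted cell means, so there is nothing left for a weighting scheme to do, no matter how unbalanced the cells are.
library(emmeans)
library(tidyverse)
data(fev_data, package = "mmrm")
set.seed(1)
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
# Deliberately unbalance the arms by keeping only 20% of the placebo rows.
unbalanced <- bind_rows(
  filter(data, ARMCD == "TRT"),
  slice_sample(filter(data, ARMCD == "PBO"), prop = 0.2)
)
model <- lm(change ~ ARMCD * AVISIT, data = unbalanced)
marginals_emmeans <- emmeans(model, specs = ~ARMCD:AVISIT) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)
# Raw cell means of the unbalanced data: these match the emmeans output
# because the saturated two-factor model has nothing to average over.
cell_means <- unbalanced %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(cell_mean = mean(change), .groups = "drop") %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, cell_mean)
left_join(marginals_emmeans, cell_means, by = "marginal") %>%
  mutate(difference = cell_mean - emmean)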
Similar story for lots of uninteresting covariates without interactions (except there is rounding error). This one uses weights = "equal".
By commenting out the custom averaging for categorical levels, I can reproduce the same outcome as emmeans(weights = "cells"):
More follow-up: I can reproduce emmeans, even in a severely unbalanced design, when the interesting interactions do not overlap with the uninteresting ones. Below is a reprex with change ~ FEV1_BL + WEIGHT + SEX*RACE + ARMCD*AVISIT, which has both kinds of interactions:
On the other hand, when interactions are mixed, as with SEX*ARMCD*AVISIT, emmeans produces an ominous NOTE:
library(emmeans)
library(tidyverse)
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate(change = FEV1 - FEV1_BL) %>%
filter(!is.na(change))
formula <- change ~ SEX*ARMCD*AVISIT
model <- lm(formula, data = data)
out <- emmeans::emmeans(
object = model,
specs = ~ARMCD:AVISIT,
weights = "equal",
nuisance = "USUBJID"
)
#> NOTE: Results may be misleading due to involvement in interactions
Created on 2024-03-19 with reprex v2.1.0
We are in this exact situation with subgroup analysis. So moving to a custom fitted-value-based approach might allow us to do analyses that would be invalid for the current emmeans-integrated version of brms.mmrm.
Above, I managed to reproduce emmeans(weights = "cells") and emmeans(weights = "equal"). From this comment by the author of emmeans, it looks like these are two extremes:
The extremes are "equal" (the default), which is best for estimating the effects of one factor while holding the rest fixed in an experimental mode; and "cells", which in a linear model, basically reproduces the raw marginal means of the data but is least useful for quantifying effects and most vulnerable to confounding with effects of other factors.
On the one hand, this makes "cells" seem inappropriate for a clinical trial. On the other, not all factors are controlled in the experiment, so "equal" is not a great fit either.
brms.mmrm currently uses "proportional", which according to ?emmeans is:
Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.
as opposed to "cell"
:
Weight according to the frequencies of the cells being averaged.
I do not understand the explanation of "proportional". At this point, I think it's worth spending time to figure out what "proportional" and "outer" really mean.
...and as soon as I posted that comment, I think I figured out what "proportional" means. Here is a reprex:
The key here is that whereas weights = "equal" does this:
model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3
weights = "proportional" does this:
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])
model_matrix[, "RACEBlack or African American"] <- mean(model_matrix[, "RACEBlack or African American"])
model_matrix[, "RACEWhite"] <- mean(model_matrix[, "RACEWhite"])
Furthermore, I can start to see how "cells" weights would confound the interesting marginal means with uninteresting covariates, leading to the Simpson's paradox situations that Russ Lenth describes. I am starting to come around to "proportional" because I know what it does now, and I like how it corrects the balancing problem of weights = "equal". (Plus, brms.mmrm currently sets weights = "proportional" in emmeans().)
Same reprex with an interaction between uninteresting variables:
My only concern is when we have subgroup interactions with the treatment arm and visit ("NOTE: Results may be misleading due to involvement in interactions" as mentioned previously). We may need to require subgroup to be part of the reference grid when there is a subgroup/treatment and/or subgroup/treatment/time interaction in the model. I still haven't wrapped my head around that part yet.
I will need to carefully read http://cran.nexr.com/web/packages/emmeans/vignettes/interactions.html, but at a glance:
emmeans() valiantly tries to warn you that it may not be a good idea to average over factors that interact with the factor of interest. It isn’t always a bad idea to do this, but sometimes it definitely is.
If that's the issue, then we just need to identify the factors of interest and avoid averaging over other factors that interact with them. We'll have to work out what those are in the case of subgroup and non-subgroup.
Great points came out of today's meeting:
Thanks @wlandau, this investigation is really helpful!
To the 3rd bullet point about dropout... I realized this has other interesting implications about what emmeans is doing for main effects of "uninteresting" covariates as you've called them. For example, you found that emmeans is doing something like:
mean(model_matrix[, "FEV1_BL"])
This is not actually the average baseline FEV1; it is an average of that covariate over all the patient-visits appearing in the design matrix, making it a weighted average in which patients who drop out earlier carry less weight. I doubt this is standard practice.
IMO when computing the reference grid, we should first complete the grid of patient-visits to include the missing ones. Otherwise the "margin" at which we report the mean response is an odd weighted average of the study population.
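A quick way to see this in the FEV1 example (a sketch): compare the average of FEV1_BL over the observed patient-visit rows with the average over unique patients.
library(dplyr)
data(fev_data, package = "mmrm")
observed <- fev_data %>% filter(!is.na(FEV1))
# Mean over patient-visit rows (what averaging the model matrix column does):
mean(observed$FEV1_BL)
# Mean over unique patients (one baseline value per patient):
observed %>% distinct(USUBJID, FEV1_BL) %>% summarize(mean(FEV1_BL))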
@wlandau here is that ref paper I was referring to in our meeting. It is more focused on transformations of the marginals when you have non-linear link functions. Thankfully, in our case we don't go into those corners.
I was thinking about the crux of our conversation yesterday: that we need a more careful manual approach than depending on emmeans out of the box. I actually think that would be a mistake; we should be standing on the shoulders of the years of experience emmeans has instead of redoing their work. Understanding how their functions work and what the different levers (formal arguments) do is obviously needed, and you have done a great job digging into that.
One thing that may be a good middle ground is to use the emmGrid object inside emmeans to define specific grids that we want to template and use, and then allow emmeans to deploy those grids. We could also take advantage of the documentation on extending emmeans to our "model" type.
emtrends may also be a good solution for our interaction issue. It looks like the developers have already been through this mess :)
To recap #40: let's say our model is CHG ~ BASE + AVISIT * TRT01P, but we want Sebastian's prior on the differences in CHG between consecutive AVISITs. If we were to analytically transform the prior to align with the formula, then we would get a multivariate normal with correlated parameters. That's a good thing because the prior can express our initial belief about the time course of the disease. However, as I believe @andrew-bean observed earlier, for a prior with correlated fixed effects, we would need to inject custom Stan code through the family argument of brms::brm(). I am not a fan of that approach because it seems like a hack/backdoor, and I think it would decrease our ability to trust and properly maintain the implementation.
Instead, I was hoping we could:

1. … CHG differences.
2. … CHG differences.
3. …
4. …
5. Use emmeans or its underlying method to convert the parameter samples from (4) into samples of marginal means.

From my understanding of conditional means priors, the math collapses down to these convenient linear transformations if our link function is linear. Admittedly, we would be fitting the model on change differences instead of the parameterization that the user requested in the formula. But if the transformation is linear and full-rank, then those parameter spaces should be exactly the same in a linear algebraic sense, so the overall result of the model should be the same.
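A toy sketch of that linear-algebra point (hypothetical, not brms.mmrm code): reparameterizing a linear model with a full-rank matrix changes the coordinates of the coefficients but not the fitted model, so any marginal means computed from it are unchanged.
# One factor with four visits: reparameterize from cell means to consecutive differences.
set.seed(1)
y <- rnorm(12)
visit <- factor(rep(paste0("VIS", 1:4), times = 3))
X_cells <- model.matrix(~ 0 + visit)  # one column per visit mean
D <- rbind(
  c( 1,  0,  0, 0),  # VIS1 mean
  c(-1,  1,  0, 0),  # VIS2 - VIS1
  c( 0, -1,  1, 0),  # VIS3 - VIS2
  c( 0,  0, -1, 1)   # VIS4 - VIS3
)
X_diff <- X_cells %*% solve(D)  # design matrix in the "differences" coordinates
fit_cells <- lm(y ~ 0 + X_cells)
fit_diff <- lm(y ~ 0 + X_diff)
# Same fitted values (hence the same marginal means); coefficients related by D.
all.equal(fitted(fit_cells), fitted(fit_diff))
all.equal(as.numeric(D %*% coef(fit_cells)), as.numeric(coef(fit_diff)))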
I doubt emmeans alone would handle (5) because the model object from (3) does not have TRT01P or AVISIT as explicit variables. As far as I am aware, emmeans requires the model object itself to directly incorporate the factors that define the reference grid. One workaround could be to define a custom lm(), then, as you suggested @yonicd, use emmGrid() to get the reference grid. I think that would connect all the dots.
As far as completely custom vs. the middle ground goes, the emmeans "supersession plan" makes me nervous:
The developer of emmeans continues to maintain and occasionally add new features. However, none of us is immortal; and neither is software. I have thought of trying to find a co-maintainer who could carry the ball once I am gone or lose interest, but the flip side of that is that the codebase is not getting less messy as time goes on -- why impose that on someone else? So my thought now is that if at some point, enough active R developers want the capabilities of emmeans but I am no longer in the picture, they should feel free to supersede it with some other package that does it better. All of the code is publicly available on GitHub, so just take what is useful and replace what is not.
emmeans has so many reverse dependencies on CRAN, and internally, it is used in topline analysis models in real clinical trials. I hope this will motivate a new maintainer or an equivalent package when the time comes, but I am not so sure. Maybe for us, we go for the middle ground for now, then switch to something else or a custom method if emmeans is ever abandoned. In either case, I think we should understand what it does every step of the way.
Also: I will have to check, but emmeans might not be able to do the carefully fine-tuned kind of averaging we talked about:
As far as I know, weights = "proportional" applies to everything that is not in specs. If we have to supply custom numeric weights to emmeans anyway, I think we might as well go fully custom.
It may be the case that "proportional" is just shorthand for an emmGrid setup. We can probably construct complex grid templates.
Yes, weights = "proportional"
looks like shorthand. From the documentation of the weights
argument from ?emmeans::emmeans
:
Also, if object is not already a reference grid, weights (if it is character) is passed to ref_grid as wt.nuis in case nuisance factors are specified.
And from ?emmeans::ref_grid
:
If nuisance is a vector of predictor names, those predictors are omitted from the reference grid. Instead, the result will be as if we had averaged over the levels of those factors, with either equal or proportional weights as specified in wt.nuis.
I tried to dig further into the actual computation of marginal means from the fixed effect estimates. Some aspects I think I understand, but others are buried too deep inside code that was not meant to be readable. (Details in the collapsible block below.)
In any case, ?emm_basis says:
A user or package developer may add emmeans support for a model class by writing recover_data and emm_basis methods for that class.
So it seems we are encouraged to write our own "basis" algorithm, which includes the actual matrix that maps fixed effects to marginal means. This is the crux of the whole process, and in his vignettes and posts, Russ Lenth himself encourages users to think carefully about the details of EMM calculation. All this makes me think a custom approach would not be too reckless.
There would be value in writing an actual emm_basis() S3 method if we wanted users to call emmeans() on specially classed brms.mmrm output, but this is not part of the interface, so I don't think we need it.
There is also a .basis.nuis() function which actually computes the weights, but somehow it relies on mysterious rows appended to the end of the model matrix, and that apparently is the mechanism behind non-"equal" averaging methods such as "proportional".
Digging even more, these extra appended rows come from .setup.nuis(). They are only used to translate the averaging weights to the model matrix in .basis.nuis(), and then .basis.nuis() removes those rows right away.
The following details show just how weights = "proportional" is a simple calculation buried deep in layers upon layers of messy code. This stuff is also direct confirmation that emmeans cannot proportionally average within levels of a subgroup, and it cannot use proportional averaging for only a subset of nuisance variables. A custom approach will allow us to do both, as well as tailor the nuances of interactions to our use case.
Regarding:
emtrends may also be a good solution for our interaction issue. it looks like the developers have already been through this mess :)
Does this fit our use case? From the description of the function:
The emtrends function is useful when a fitted model involves a numerical predictor x interacting with another predictor a (typically a factor). Such models specify that x has a different trend depending on a; thus, it may be of interest to estimate and compare those trends.
and from the documentation of the var argument:
In order for this to be useful, var should be a numeric predictor that interacts with at least one factor in specs.
By contrast, BASELINE:AVISIT is a "nuisance" interaction for brms.mmrm. We only care about AVISIT:TREATMENT and AVISIT:TREATMENT:SUBGROUP.
Looking back at:
One thing that may be a good middle ground is to use the emmGrid object inside emmeans to define specific grids that we want to template and use and then allow emmeans to deploy those grids.
@yonicd, would you elaborate on what would go into the template and what would be left to emmeans to do?
and it cannot use proportional averaging for only a subset of nuisance variables.
I was hoping a custom averaging method might allow us to average over interaction terms, but now I think I was wrong about that. From the emmeans interactions vignette:
it may not be a good idea to average over factors that interact with the factor of interest. It isn’t always a bad idea to do this, but sometimes it definitely is.
and in the summary:
One must resist pressures and inclinations to try to produce simple bottom-line conclusions.
The author makes these statements in the general sense: regardless of the averaging method used ("proportional" or otherwise), and regardless of whether emmeans is used. The whole message of the "Interacting factors" section is that we should understand the modeled interactions (e.g. with interaction plots) before marginalizing them out post hoc.
In brms.mmrm, there are several potential sources of trouble:

1. … (the baseline_time argument of brm_formula()).
2. …
3. …
4. … (use_subgroup = FALSE in brm_marginal_draws()).
5. Applying brm_marginal_draws() to a brms model not fit with brms.mmrm.

I think we should completely disable (4) and (5). (For (4), the model comparisons in https://openpharma.github.io/brms.mmrm/articles/subgroup.html#visualization can simply condition on a single subgroup level rather than marginalize over all the levels.)
I think we should still allow (1)-(3) for modeling, but prevent brm_marginal_draws() from post-processing models with these interactions. This allows people to still model the extra interactions and compute their own summaries.
All this makes the marginal mean calculation in brm_marginal_draws() much simpler. I propose:

A. Complete the missingness grid as @andrew-bean recommended in https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475 (a quick sketch appears below).
B. Proportionally average over all uninteresting factors.
C. By default, predict at the mean of each numeric variable.
D. As @yonicd suggested in the last meeting, allow users to pick custom reference levels for (C).

I will start working on separate pull requests for the guardrails (1-5) and the marginal means (A-D).
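For (A), a minimal sketch of what completing the patient-by-visit grid could look like, using tidyr directly on the FEV1 example (brm_data() may already cover this):
library(dplyr)
library(tidyr)
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
# Add an explicit row (with an NA response) for every patient-visit combination
# that was dropped, carrying along the patient-level covariates.
completed <- data %>%
  complete(nesting(USUBJID, ARMCD, SEX, RACE, FEV1_BL), AVISIT)
nrow(data)
nrow(completed)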
@andrew-bean, would you help me understand the rationale for averaging covariates within subgroup levels? This approach sounded nice during the last meeting, but now I worry that it might cause the exact same confounding/Simpson's paradox as weights = "cells".
I was thinking about the crux of our conversation yesterday that we need a more careful manual approach than depending on emmeans out of the box. I actually think that would be a mistake, we should be standing on the shoulders of the years of experience emmeans has instead of redoing their work.
emmeans has indeed spent years refining subtle edge cases and unknown unknowns, but with the guardrails in https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2021126575, I think brms.mmrm will only encounter the simplest cases. Proportional averaging over non-interacting factors is easy, and we can code it more cleanly and transparently than emmeans.
Proportional averaging over non-interacting factors is easy
And from reading "Interacting Factors", emmeans is not going to help us marginalize out interactions anyway. It would be unwise to recommend marginalizing over them in the general case, with or without emmeans.
Hmm... interacting continuous covariates are different from interacting factors. From the vignette:
When a covariate and a factor interact, we typically don’t want EMMs themselves, but rather estimates of slopes of the covariate trend for each level of the factor.
For those interactions with baseline, it seems emtrends() may be more relevant than I originally thought.
I just submitted #83 to get us started. This one just has the most straightforward guardrails: avoid marginalizing over the subgroup, and make sure the model we're summarizing was actually fit with brms.mmrm.
@andrew-bean, I forgot to mention: I know my earlier reprexes didn't do this, but brm_data() already fills in rows for implicitly missing responses. So if our marginal mean calculations have at least some degree of customization, we can cover https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475 easily.
Good news: I figured out how to reproduce emmeans in the presence of baseline-time interactions! It's really easy and intuitive: in the model matrix, let the FEV1_BL interaction terms equal mean(FEV1_BL) only for their nonzero elements. model.matrix() takes care of this automatically. A reprex is in the "Details" block:
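The gist, as a condensed sketch (not the original collapsed reprex, and with an illustrative formula): fix FEV1_BL at its mean in the reference grid, and model.matrix() fills in the baseline-by-visit interaction columns on its own.
library(dplyr)
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
# Reference grid: one row per arm-by-visit cell, with FEV1_BL at its grand mean.
grid <- data %>%
  mutate(FEV1_BL = mean(FEV1_BL)) %>%
  distinct(ARMCD, AVISIT, FEV1_BL)
# Each FEV1_BL:AVISIT column equals mean(FEV1_BL) only where the corresponding
# visit indicator is 1, and 0 elsewhere.
mapping <- model.matrix(~ FEV1_BL * AVISIT + ARMCD * AVISIT, data = grid)
round(mapping, 2)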
Given how trivial this is, I don't think we need emtrends(). And we can safely keep all those interactions that involve baseline.
I did find something strange though: emmeans removes all the rows with missing values before it does its calculation. Even when I "fill" these rows beforehand, emmeans removes them. This leads to a different mean(FEV1_BL), which gives a different result. To align with emmeans, I had to manually remove rows with missing responses. I think a custom approach is a great opportunity to change this, c.f. https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475
I did find something strange though: emmeans removes all the rows with missing values before it does its calculation. Even when I "fill" these rows beforehand, emmeans removes them.
If we wanted to stick with emmeans, we could avoid this problem by temporarily setting the missing responses in the data to something non-missing.
Given how simple this all turned out to be, I would be open to separating the issue of replicating emmeans from the issue of taking manual control over the reference grid. The former is not urgent, and it will be easy to do the latter using emmeans.
Sorry was out of pocket for a bit.
What I meant regarding leveraging {emmeans} was three parts. If it isn't too much twisting ourselves into a pretzel to work within the {emmeans} framework, I think there are definite advantages to it.
Fortunately, brms.mmrm already integrates emmeans through brms. In fact, this is what brm_marginal_draws() currently uses to get posterior draws of marginal means (c.f. here and here in the code).
Let's say a user fits a model with brms.mmrm:
library(tidyverse)
library(brms.mmrm)
data(fev_data, package = "mmrm")
data <- fev_data %>%
mutate("FEV1_CHG" = FEV1 - FEV1_BL) %>%
brm_data(
outcome = "FEV1_CHG",
role = "change",
group = "ARMCD",
time = "AVISIT",
patient = "USUBJID",
baseline = "FEV1_BL",
reference_group = "PBO",
covariates = c("RACE", "SEX")
)
formula <- brm_formula(
data = data,
intercept = TRUE,
baseline = TRUE,
group = TRUE,
time = TRUE,
baseline_time = TRUE,
group_time = TRUE,
correlation = "unstructured"
)
print(formula)
#> FEV1_CHG ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT + RACE + SEX + unstr(time = AVISIT, gr = USUBJID)
#> sigma ~ 0 + AVISIT
model <- brm_model(data = data, formula = formula)
Because the output is a brms model fit, the user can already leverage emmeans without extra work from us. This is true regardless of whether we use emmeans or a custom solution to get our own marginal means in brm_marginal_draws().
emmeans(
model,
specs = ~ARMCD:AVISIT,
nuisance = c("RACE", "SEX")
)
#> ARMCD AVISIT emmean lower.HPD upper.HPD
#> PBO VIS1 -26.05 -30.5 -21.23
#> TRT VIS1 -22.01 -25.7 -18.30
#> PBO VIS2 -21.26 -24.9 -17.40
#> TRT VIS2 -17.25 -20.6 -13.54
#> PBO VIS3 -15.62 -19.3 -12.15
#> TRT VIS3 -12.60 -16.1 -9.08
#> PBO VIS4 -10.87 -14.8 -6.27
#> TRT VIS4 -6.42 -10.2 -1.88
#>
#> Results are averaged over the levels of: 2 nuisance factors
#> Point estimate displayed: median
#> HPD interval probability: 0.95
ok, great!
Then I think the only thing would be to have robust documentation on how brm_marginal_draws() differs from direct use of {emmeans}, which is basically collating the comments in this issue into a vignette. You have done an amazing job doing the deep dive into emmeans.
Thanks! Yes, I am planning to either add to https://openpharma.github.io/brms.mmrm/articles/methods.html or write a separate vignette to explain the whole calculation from first principles, both what emmeans does and what we do differently. The only differences I foresee are:

- … weights = "cells".
.The custom approach is as easy as I thought, and it will make for a nice readable vignette:
library(brms)
library(posterior)
library(tibble)
# Apply proportional averaging to nuisance factors.
proportional_factors <- brmsformula(FEV1_CHG ~ 0 + SEX + RACE) %>%
make_standata(data = data) %>%
.subset2("X") %>%
colMeans() %>%
t()
print(proportional_factors)
#> SEXMale SEXFemale RACEBlackorAfricanAmerican RACEWhite
#> [1,] 0.47 0.53 0.375 0.275
# Construct a reference grid with interesting factors and the
# means of numeric variables.
grid <- data %>%
mutate(FEV1_BL = mean(FEV1_BL), FEV1_CHG = 0) %>%
distinct(ARMCD, AVISIT, FEV1_BL, FEV1_CHG)
print(grid)
#> # A tibble: 8 × 4
#> ARMCD AVISIT FEV1_BL FEV1_CHG
#> <chr> <chr> <dbl> <dbl>
#> 1 PBO VIS1 40.2 0
#> 2 PBO VIS2 40.2 0
#> 3 PBO VIS3 40.2 0
#> 4 PBO VIS4 40.2 0
#> 5 TRT VIS1 40.2 0
#> 6 TRT VIS2 40.2 0
#> 7 TRT VIS3 40.2 0
#> 8 TRT VIS4 40.2 0
# Get the draws of the model parameters and align the column names
# with the mapping.
draws_parameters <- model %>%
as_draws_df() %>%
as_tibble() %>%
select(starts_with("b_"), -starts_with("b_sigma"))
print(head(draws_parameters))
#> b_Intercept b_FEV1_BL b_ARMCDTRT b_AVISITVIS2 b_AVISITVIS3 b_AVISITVIS4 b_RACEBlackorAfricanAmerican b_RACEWhite b_SEXFemale b_FEV1_BL:AVISITVIS2 b_FEV1_BL:AVISITVIS3 b_FEV1_BL:AVISITVIS4 b_ARMCDTRT:AVISITVIS2 b_ARMCDTRT:AVISITVIS3 b_ARMCDTRT:AVISITVIS4
#> [1,] 28.12049 -0.9328434 3.482917 -0.9652279 3.501594 16.134916 1.3795105 4.566933 0.59740456 0.150796399 0.171189552 -0.03216017 0.20417157 -0.4202866 1.29446708
#> [2,] 29.96148 -0.9703349 4.143874 -1.9744873 6.878381 11.069597 1.7366256 5.587970 -0.61139642 0.161982927 0.075364839 0.13104548 0.03335486 -1.1919374 -0.51673961
#> [3,] 28.04120 -0.9167395 5.028615 0.7848824 8.312772 7.970771 2.2455201 4.290807 -0.87864402 0.114844584 0.049232866 0.16065948 -0.61784967 -1.7812204 1.04428585
#> [4,] 27.94835 -0.9296524 2.613377 -0.6260921 5.259358 7.593719 0.2306152 5.926638 1.58860856 0.119618736 0.115816445 0.18186057 1.31263326 -0.3219806 0.54245828
#> [5,] 24.42265 -0.8568065 3.547295 6.0542027 10.727494 12.397730 1.7849814 5.238166 0.01330715 -0.005420136 0.008823189 0.06752929 -0.47946851 -0.3092742 -0.02700641
#> [6,] 23.31398 -0.8203084 3.128613 5.1667634 9.398452 8.762476 1.6365735 5.486014 0.91430527 -0.020211246 0.022233059 0.15643241 1.18707182 -0.2188128 -1.08372968
# Create the mapping from model parameters to marginal means.
# Make sure to align the column names with those of draws_parameters.
mapping <- brmsformula(FEV1_CHG ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT) %>%
make_standata(data = grid) %>%
.subset2("X") %>%
bind_cols(proportional_factors) %>%
setNames(paste0("b_", colnames(.)))
stopifnot(all(colnames(draws_parameters) %in% colnames(mapping)))
mapping <- as.matrix(mapping)[, colnames(draws_parameters)]
rownames(mapping) <- paste(grid$ARMCD, grid$AVISIT)
print(mapping)
#> b_Intercept b_FEV1_BL b_ARMCDTRT b_AVISITVIS2 b_AVISITVIS3 b_AVISITVIS4 b_RACEBlackorAfricanAmerican b_RACEWhite b_SEXFemale b_FEV1_BL:AVISITVIS2 b_FEV1_BL:AVISITVIS3 b_FEV1_BL:AVISITVIS4 b_ARMCDTRT:AVISITVIS2 b_ARMCDTRT:AVISITVIS3 b_ARMCDTRT:AVISITVIS4
#> PBO VIS1 1 40.19072 0 0 0 0 0.375 0.275 0.53 0.00000 0.00000 0.00000 0 0 0
#> PBO VIS2 1 40.19072 0 1 0 0 0.375 0.275 0.53 40.19072 0.00000 0.00000 0 0 0
#> PBO VIS3 1 40.19072 0 0 1 0 0.375 0.275 0.53 0.00000 40.19072 0.00000 0 0 0
#> PBO VIS4 1 40.19072 0 0 0 1 0.375 0.275 0.53 0.00000 0.00000 40.19072 0 0 0
#> TRT VIS1 1 40.19072 1 0 0 0 0.375 0.275 0.53 0.00000 0.00000 0.00000 0 0 0
#> TRT VIS2 1 40.19072 1 1 0 0 0.375 0.275 0.53 40.19072 0.00000 0.00000 1 0 0
#> TRT VIS3 1 40.19072 1 0 1 0 0.375 0.275 0.53 0.00000 40.19072 0.00000 0 1 0
#> TRT VIS4 1 40.19072 1 0 0 1 0.375 0.275 0.53 0.00000 0.00000 40.19072 0 0 1
# Transform model parameters to marginal means.
draws_marginal_means <- (as.matrix(draws_parameters) %*% t(mapping)) %>%
as.data.frame() %>%
as_tibble()
colnames(draws_marginal_means) <- paste(grid$ARMCD, grid$AVISIT)
print(draws_marginal_means)
#> # A tibble: 4,000 × 8
#> `PBO VIS1` `PBO VIS2` `PBO VIS3` `PBO VIS4` `TRT VIS1` `TRT VIS2` `TRT VIS3` `TRT VIS4`
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 -7.28 -2.19 3.10 7.56 -3.80 1.50 6.16 12.3
#> 2 -7.17 -2.64 2.73 9.16 -3.03 1.54 5.69 12.8
#> 3 -7.25 -1.85 3.04 7.18 -2.22 2.56 6.29 13.3
#> 4 -6.86 -2.68 3.06 8.05 -4.24 1.25 5.35 11.2
#> 5 -7.90 -2.06 3.19 7.22 -4.35 1.01 6.42 10.7
#> 6 -7.05 -2.69 3.24 8.00 -3.92 1.62 6.15 10.0
#> 7 -7.56 -2.35 3.19 8.41 -5.35 -0.0937 6.84 10.5
#> 8 -7.07 -1.66 2.70 9.97 -2.17 0.632 6.26 14.6
#> 9 -7.90 -4.11 3.55 8.21 -3.76 1.34 4.81 11.4
#> 10 -8.81 -4.60 3.03 7.59 -3.55 2.03 5.94 11.0
#> # ℹ 3,990 more rows
#> # ℹ Use `print(n = ...)` to see more rows
By contrast, if we were to use emmeans directly, then we would need to call emm_basis() manually. As far as I can tell, this is the only way to get the equivalent of the mapping matrix above. It should be emm_basis()$X, but when I try it, it gives me the wrong answer. I think it's because the grid argument needs work. ?emm_basis says it comes from ref_grid(), and ?brms::emmeans-brms-helpers has no useful information. All this feels more cryptic and error-prone than I would like.
custom_xlev <- lapply(select(data, AVISIT, ARMCD, SEX, RACE), function(x) unique(as.character(x)))
custom_trms <- terms(x = ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT + RACE + SEX, data = data)
custom_grid <- ref_grid(
model,
specs = ~ARMCD:AVISIT,
nuisance = c("RACE", "SEX")
)
emm_basis(
object = model,
trms = custom_trms,
xlev = custom_xlev,
grid = custom_grid, # I get the same result if I supply custom_grid@grid.
resp = "FEV1_CHG"
)$X
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1 0 0 0 0 0 0 0
#> [2,] 0 1 0 0 0 0 0 0
#> [3,] 0 0 1 0 0 0 0 0
#> [4,] 0 0 0 1 0 0 0 0
#> [5,] 0 0 0 0 1 0 0 0
#> [6,] 0 0 0 0 0 1 0 0
#> [7,] 0 0 0 0 0 0 1 0
#> [8,] 0 0 0 0 0 0 0 1
A few thoughts on older comments from last week:
Even in an experimental setting, I do not think it is a good idea to give equal weights to all the levels of a factor.
For example: in an Alzheimer's study, it is common to regress on baseline disease severity. Those levels might be "mild" and "severe". Suppose the experiment has mostly "mild" patients and only a small number of "severe" ones. In the equal weighting scheme, there will be a marginal mean for "mild" and another marginal mean for "severe", and these means will be straightforwardly averaged together. But if there are far more "mild" patients than "severe" ones, this scheme risks misrepresenting the composition of the overall patient population. In particular, if the drug is only effective in the mild subpopulation, then equal weighting will artificially dampen the treatment effect.
I agree. In practice I have seen the "observed margins" option (in SAS) applied most frequently, which usually seemed very sensible in clinical trial settings.
It may be helpful to derive a strategy for marginal means estimation also with the estimand in mind. Very generally speaking, one would commonly be interested in some "average treatment effect in the population of trial participants". In such cases it seems to make sense that the "margins" reflect the observed distributions in the sample. Sometimes one might be interested in generalizing effects to a certain target population. Then a weighting based on external criteria would be needed.
Subgroups: "observed margins" usually relate to the margins in the total sample. I am not sure these are always best for subgroup analyses too (e.g. for sex subgroups with different baseline values, where both sex and baseline are prognostically relevant), but even then I believe they commonly provide a reasonable reference.
I can also imagine that it makes a difference whether the interest lies (more) on estimates of the marginal means, e.g. for a certain treatment arm or subgroup, or on their contrasts. In the case of the latter it seems critical to keep an eye on interpretability of this contrast. One may also want a certain simplicity here and avoid creating (difficult to interpret) imbalances.
- For subgroup models, @andrew-bean suggested proportional averaging within each level of the subgroup, rather than over the whole dataset.
I am not sure that in such a case the resulting contrast of the marginal means is what one is looking for. Usually we may want to keep everything fixed at a certain level and vary only the treatment when interpreting the contrast.
Another example: we often regress on the choice of investigative site in multi-center studies. The site shouldn't matter for inference, we just want to explain variability in the model. If one or two sites have the most patients, and most sites have a very small number of patients, what would make the most sense to do? It seems more intuitive to me to let the number of patients in each site determine how the sites are weighted, as opposed to averaging all sites equally.
Center effects are also an interesting topic. If heterogeneity w.r.t. treatment effects is assumed, the center variable in an MMRM would also be involved in interactions (with treatment and/or visit)... but on the other hand, you wouldn't be interested in making inferences about center effects...(?) Hmm, if there were many sites with many patients, this sounds like a justification for a mixed model with G- and R-side random effects (I have often wondered what an application for such a model might be; out of scope here, I know).
https://stats.stackexchange.com/questions/510862/is-least-squares-means-lsmeans-statistical-nonsense/510923#510923 is edifying, and it helps me understand more about why estimated marginal means with equal weights are used, as well as their limitations (e.g. interactions).
+1 ...yes, and even though experimental, the longitudinal design with the need for interactions (with time/visit) makes the marginal means more challenging, both conceptually and technically.
c.f. https://github.com/RConsortium/brms.mmrm/issues/51#issuecomment-1671146686. I started a vignette at https://rconsortium.github.io/brms.mmrm/articles/methods.html with the goal of explaining every part of our method, and it is sparse on how emmeans transforms posterior samples of model coefficients into marginal means. If I understand the emmeans docs and https://www.jstatsoft.org/article/view/v069i01 correctly, the current behavior seems reasonable: https://github.com/RConsortium/brms.mmrm/blob/0fee37475e317af31d36791d336e18b402bd9ac0/R/brm_marginal_draws.R#L109-L114
I think this means each marginal mean is an average over the predictions at a given combination of study arm and time point, where continuous nuisance variables are set to the mean and categorical nuisance variables are averaged in a weighted way according to how levels are represented in the data. I think it would be good to dig into the code, e.g. https://github.com/rvlenth/emmeans/blob/master/R/emmeans.R#L332-L389, to figure out if this is true. And I have trouble wrapping my head around categorical nuisance variables, especially when there is more than 1.