openpharma / brms.mmrm

R package to run Bayesian MMRMs using {brms}
https://openpharma.github.io/brms.mmrm/

Explain and verify how marginal means are computed #53

Closed wlandau closed 6 months ago

wlandau commented 1 year ago

cf. https://github.com/RConsortium/brms.mmrm/issues/51#issuecomment-1671146686. I started a vignette at https://rconsortium.github.io/brms.mmrm/articles/methods.html with the goal of explaining every part of our method, and it is sparse on how emmeans transforms posterior samples of model coefficients into marginal means. If I understand the emmeans docs and https://www.jstatsoft.org/article/view/v069i01 correctly, the current behavior seems reasonable:

https://github.com/RConsortium/brms.mmrm/blob/0fee37475e317af31d36791d336e18b402bd9ac0/R/brm_marginal_draws.R#L109-L114

I think this means each marginal mean is an average over the predictions at a given combination of study arm and time point, where continuous nuisance variables are set to the mean and categorical nuisance variables are averaged in a weighted way according to how levels are represented in the data. I think it would be good to dig into the code, e.g. https://github.com/rvlenth/emmeans/blob/master/R/emmeans.R#L332-L389, to figure out if this is true. And I have trouble wrapping my head around categorical nuisance variables, especially when there is more than 1.

wlandau commented 1 year ago

As an exercise, it would be great to code 1 or 2 examples from first principles and compare them to output from emmeans.

yonicd commented 8 months ago

@wlandau The {brmsmargins} package may be helpful to see how margins are calculated in brms.

wlandau commented 7 months ago

Looks like brmsmargins puts a lot of effort into the correct handling of random effects. Fortunately, we only have fixed effects. But the first section on "Integrating out Random Effects" does have clues about how to treat fixed effects by using the model matrix.

wlandau commented 7 months ago

Using a simple MMRM with no covariates, it is straightforward to use the model matrix to derive draws of marginal means directly. Agreement with emmeans is spot on in this case.

suppressPackageStartupMessages({
  library(brms)
  library(coda)
  library(dplyr)
  library(posterior)
  library(tibble)
  library(tidyr)
})

custom_marginal_draws <- function(model, formula, data) {
  # Get the draws of the fixed effects.
  draws <- model %>%
    as_draws_df() %>%
    as_tibble() %>%
    select(starts_with("b_"), -starts_with("b_sigma"))

  # Derive the model matrix (with the correct row order).
  stan_data <- make_standata(formula = formula, data = data)
  brms_permutation <- match(x = stan_data$Y, table = data$response)
  undo_brms_permutation <- match(x = data$response, table = stan_data$Y)
  stopifnot(all(stan_data$Y[undo_brms_permutation] == data$response))
  model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])

  # Take the marginal means of the model matrix (by column).
  colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
  joined <- dplyr::bind_cols(data, model_matrix)
  grid <- joined %>%
    group_by(group, time) %>%
    summarize(across(starts_with("x_"), mean), .groups = "drop")
  reference <- select(grid, -starts_with("x_"))
  transformation <- grid %>%
    select(starts_with("x_")) %>%
    as.matrix() %>%
    t()
  colnames(transformation) <- paste0(grid$group, "_", grid$time)

  # Convert draws of fixed effects to draws of marginal means.
  out <- as.matrix(draws) %*% transformation
  colnames(out) <- paste0(grid$group, "_", grid$time)
  out
}

data <- tibble(
  group = rep(c("treatment", "placebo"), each = 100L),
  patient = paste0("patient_", seq_len(200L))
) %>%
  expand_grid(time = paste0("time_", seq_len(4L))) %>%
  mutate(response = rnorm(nrow(.)))

formula <- brmsformula(
  formula = response ~ 0 + group*time + unstr(time = time, gr = patient),
  sigma ~ 0 + time
)

model <- brm(formula = formula, data = data, refresh = 0, chains = 1)

emmeans::emm_options(sep = "_")

# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~group:time,
  wt.nuis = "proportional",
  nuisance = "patient"
) %>%
  as.mcmc(fixed = TRUE, names = FALSE) %>%
  as_draws_df() %>%
  as.data.frame() %>%
  as_tibble() %>%
  select(-starts_with("."))

# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
  model = model,
  formula = formula,
  data = data
)[, colnames(draws_emmeans)]

# Discrepancy between the two approaches:
max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans)))
#> [1] 0

Above, I derived a transformation matrix to control the mapping between draws of fixed effects (rows) and draws of the marginal means we want (columns).

> transformation
                            placebo_time_1 placebo_time_2 placebo_time_3 placebo_time_4 treatment_time_1 treatment_time_2 treatment_time_3 treatment_time_4
x_groupplacebo                           1              1              1              1                0                0                0                0
x_grouptreatment                         0              0              0              0                1                1                1                1
x_timetime_2                             0              1              0              0                0                1                0                0
x_timetime_3                             0              0              1              0                0                0                1                0
x_timetime_4                             0              0              0              1                0                0                0                1
x_grouptreatment:timetime_2              0              0              0              0                0                1                0                0
x_grouptreatment:timetime_3              0              0              0              0                0                0                1                0
x_grouptreatment:timetime_4              0              0              0              0                0                0                0                1

I will try and see if this same approach works with covariates.

wlandau commented 7 months ago

Starting with the FEV dataset and filtering out missing observations, I get the same result when there are no extra covariates:

suppressPackageStartupMessages({
  library(brms)
  library(brms.mmrm)
  library(coda)
  library(dplyr)
  library(posterior)
  library(tibble)
  library(tidyr)
})

custom_marginal_draws <- function(model, formula, data) {
  # Get the draws of the fixed effects.
  draws <- model %>%
    as_draws_df() %>%
    as_tibble() %>%
    select(starts_with("b_"), -starts_with("b_sigma"))

  # Derive the model matrix (with the correct row order).
  stan_data <- make_standata(formula = formula, data = data)
  brms_permutation <- match(x = stan_data$Y, table = data$change)
  undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
  stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
  model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])

  # Take the marginal means of the model matrix (by column).
  colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
  joined <- dplyr::bind_cols(data, model_matrix)
  grid <- joined %>%
    group_by(ARMCD, AVISIT) %>%
    summarize(across(starts_with("x_"), mean), .groups = "drop")
  reference <- select(grid, -starts_with("x_"))
  transformation <- grid %>%
    select(starts_with("x_")) %>%
    as.matrix() %>%
    t()
  colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)

  # Convert draws of fixed effects to draws of marginal means.
  out <- as.matrix(draws) %*% transformation
  colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
  out
}

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))

# formula <- brmsformula(
#   formula = change ~ FEV1_BL + RACE + SEX + WEIGHT + ARMCD*AVISIT +
#     unstr(time = AVISIT, gr = USUBJID),
#   sigma ~ 0 + AVISIT
# )

formula <- brmsformula(
  formula = change ~ ARMCD*AVISIT +
    unstr(time = AVISIT, gr = USUBJID),
  sigma ~ 0 + AVISIT
)

model <- brm(formula = formula, data = data, refresh = 0, chains = 1)

emmeans::emm_options(sep = "_")

# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  wt.nuis = "proportional",
  nuisance = c("USUBJID")#, "FEV1_BL", "RACE", "SEX", "WEIGHT")
) %>%
  as.mcmc(fixed = TRUE, names = FALSE) %>%
  as_draws_df() %>%
  as.data.frame() %>%
  as_tibble() %>%
  select(-starts_with("."))

# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
  model = model,
  formula = formula,
  data = data
)[, colnames(draws_emmeans)]

# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 0

wlandau commented 7 months ago

Disagreements start to happen when I add a simple continuous fixed effect like FEV1_BL. I am not sure why.

suppressPackageStartupMessages({
  library(brms)
  library(coda)
  library(dplyr)
  library(posterior)
  library(tibble)
  library(tidyr)
})

custom_marginal_draws <- function(model, formula, data) {
  # Get the draws of the fixed effects.
  draws <- model %>%
    as_draws_df() %>%
    as_tibble() %>%
    select(starts_with("b_"), -starts_with("b_sigma"))

  # Derive the model matrix (with the correct row order).
  stan_data <- make_standata(formula = formula, data = data)
  brms_permutation <- match(x = stan_data$Y, table = data$change)
  undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
  stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
  model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])

  # Take the marginal means of the model matrix (by column).
  colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
  joined <- dplyr::bind_cols(data, model_matrix)
  grid <- joined %>%
    group_by(ARMCD, AVISIT) %>%
    summarize(across(starts_with("x_"), mean), .groups = "drop")
  reference <- select(grid, -starts_with("x_"))
  transformation <- grid %>%
    select(starts_with("x_")) %>%
    as.matrix() %>%
    t()
  colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)

  # Convert draws of fixed effects to draws of marginal means.
  out <- as.matrix(draws) %*% transformation
  colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
  out
}

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))

# formula <- brmsformula(
#   formula = change ~ FEV1_BL + RACE + SEX + WEIGHT + ARMCD*AVISIT +
#     unstr(time = AVISIT, gr = USUBJID),
#   sigma ~ 0 + AVISIT
# )

formula <- brmsformula(
  formula = change ~ FEV1_BL + ARMCD*AVISIT +
    unstr(time = AVISIT, gr = USUBJID),
  sigma ~ 0 + AVISIT
)

model <- brm(formula = formula, data = data, refresh = 0, chains = 1)

emmeans::emm_options(sep = "_")

# Automatic from the brms/emmeans integration:
draws_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  wt.nuis = "proportional",
  nuisance = c("USUBJID", "FEV1_BL")#, "RACE", "SEX", "WEIGHT")
) %>%
  as.mcmc(fixed = TRUE, names = FALSE) %>%
  as_draws_df() %>%
  as.data.frame() %>%
  as_tibble() %>%
  select(-starts_with("."))

# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
  model = model,
  formula = formula,
  data = data
)[, colnames(draws_emmeans)]

# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 1.262745

yonicd commented 7 months ago

This could be a case of average marginal effects (AME) vs. marginal effects at the mean (MEM). For continuous RHS variables this is more complicated than for discrete ones, and even more so when interactions are involved.
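As a minimal sketch of that distinction (simulated data, not the FEV dataset, and not brms.mmrm code): in a linear model, the prediction at the covariate means (MEM-style) and the mean of the predictions (AME-style) coincide only when the continuous covariates enter additively; an interaction breaks the equality because mean(x1 * x2) != mean(x1) * mean(x2).

``` r
# Simulated data with two correlated continuous covariates.
set.seed(1)
sim <- data.frame(x1 = rnorm(200))
sim$x2 <- sim$x1 + rnorm(200)
sim$y  <- 1 + sim$x1 + sim$x2 + 0.5 * sim$x1 * sim$x2 + rnorm(200)

fit_additive    <- lm(y ~ x1 + x2, data = sim)
fit_interaction <- lm(y ~ x1 * x2, data = sim)

# MEM-style: predict at the means of the covariates.
at_mean <- function(fit) {
  predict(fit, newdata = data.frame(x1 = mean(sim$x1), x2 = mean(sim$x2)))
}
# AME-style: average the predictions over the observed covariate values.
averaged <- function(fit) mean(predict(fit, newdata = sim))

at_mean(fit_additive)    - averaged(fit_additive)     # essentially 0
at_mean(fit_interaction) - averaged(fit_interaction)  # clearly nonzero
```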

wlandau commented 7 months ago

With the settings above, emmeans apparently sets each continuous nuisance covariate equal to its grand mean when computing marginal means. Indeed, if I have a model with non-interacting continuous covariates:

formula <- brmsformula(
  formula = change ~ FEV1_BL + WEIGHT + ARMCD*AVISIT +
    unstr(time = AVISIT, gr = USUBJID),
  sigma ~ 0 + AVISIT
)

and custom_marginal_draws() puts the grand means in the model matrix:

model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])

then I see a discrepancy of exactly 0.

wlandau commented 7 months ago

Your point about interactions, especially interactions between continuous and discrete variables, is a good one. This gets complicated very quickly.

wlandau commented 7 months ago

In a much simpler case with a non-interacting factor with two levels, I am having trouble replicating what emmeans does. https://cran.r-project.org/web/packages/emmeans/vignettes/messy-data.html#nuisance claims that nuisance factors are averaged out, but I am not sure what kind of average is actually used. With a straightforward mean of each column of the model matrix, I get a strong discrepancy. Same for the median.

suppressPackageStartupMessages({
  library(brms)
  library(coda)
  library(dplyr)
  library(posterior)
  library(tibble)
  library(tidyr)
})

custom_marginal_draws <- function(model, formula, data) {
  # Get the draws of the fixed effects.
  draws <- model %>%
    as_draws_df() %>%
    as_tibble() %>%
    select(starts_with("b_"), -starts_with("b_sigma"))

  # Derive the model matrix (with the correct row order).
  stan_data <- make_standata(formula = formula, data = data)
  brms_permutation <- match(x = stan_data$Y, table = data$change)
  undo_brms_permutation <- match(x = data$change, table = stan_data$Y)
  stopifnot(all(stan_data$Y[undo_brms_permutation] == data$change))
  model_matrix <- as.data.frame(stan_data$X[undo_brms_permutation, ])

  # Average over the SEX covariate (nuisance variable).
  model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])

  # Take the marginal means of the model matrix (by column).
  colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
  joined <- dplyr::bind_cols(data, model_matrix)
  grid <- joined %>%
    group_by(ARMCD, AVISIT) %>%
    summarize(across(starts_with("x_"), mean), .groups = "drop")
  reference <- select(grid, -starts_with("x_"))
  transformation <- grid %>%
    select(starts_with("x_")) %>%
    as.matrix() %>%
    t()
  colnames(transformation) <- paste0(grid$ARMCD, "_", grid$AVISIT)

  # Convert draws of fixed effects to draws of marginal means.
  out <- as.matrix(draws) %*% transformation
  colnames(out) <- paste0(grid$ARMCD, "_", grid$AVISIT)
  out
}

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))

formula <- brmsformula(
  formula = change ~ SEX + ARMCD*AVISIT +
    unstr(time = AVISIT, gr = USUBJID),
  sigma ~ 0 + AVISIT
)

model <- brm(formula = formula, data = data, refresh = 0, chains = 1)

# Automatic from the brms/emmeans integration:
emmeans::emm_options(sep = "_")
draws_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  wt.nuis = "proportional",
  nuisance = c("USUBJID", "SEX")
) %>%
  as.mcmc(fixed = TRUE, names = FALSE) %>%
  as_draws_df() %>%
  as.data.frame() %>%
  as_tibble() %>%
  select(-starts_with("."))

# Custom draws using the model matrix:
draws_custom <- custom_marginal_draws(
  model = model,
  formula = formula,
  data = data
)[, colnames(draws_emmeans)]

# Discrepancy between the two approaches:
print(max(abs(as.matrix(draws_custom) - as.matrix(draws_emmeans))))
#> [1] 12.48768

wlandau commented 7 months ago

I wonder if this is a problem with the brms/emmeans integration. When I just use a simple frequentist model, the discrepancy goes away.

library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  wt.nuis = "proportional",
  nuisance = c("USUBJID", "SEX")
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"]) # nuisance variable
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)
print(mapping)
#>          (Intercept) SEXFemale ARMCDTRT AVISITVIS2 AVISITVIS3 AVISITVIS4
#> PBO_VIS1           1 0.5363128        0          0          0          0
#> PBO_VIS2           1 0.5363128        0          1          0          0
#> PBO_VIS3           1 0.5363128        0          0          1          0
#> PBO_VIS4           1 0.5363128        0          0          0          1
#> TRT_VIS1           1 0.5363128        1          0          0          0
#> TRT_VIS2           1 0.5363128        1          1          0          0
#> TRT_VIS3           1 0.5363128        1          0          1          0
#> TRT_VIS4           1 0.5363128        1          0          0          1
#>          ARMCDTRT:AVISITVIS2 ARMCDTRT:AVISITVIS3 ARMCDTRT:AVISITVIS4
#> PBO_VIS1                   0                   0                   0
#> PBO_VIS2                   0                   0                   0
#> PBO_VIS3                   0                   0                   0
#> PBO_VIS4                   0                   0                   0
#> TRT_VIS1                   0                   0                   0
#> TRT_VIS2                   1                   0                   0
#> TRT_VIS3                   0                   1                   0
#> TRT_VIS4                   0                   0                   1

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -8.12  -8.12          0
#> 2 TRT_VIS1  -2.06  -2.06          0
#> 3 PBO_VIS2  -3.33  -3.33          0
#> 4 TRT_VIS2   1.73   1.73          0
#> 5 PBO_VIS3   2.45   2.45          0
#> 6 TRT_VIS3   5.32   5.32          0
#> 7 PBO_VIS4   8.20   8.20          0
#> 8 TRT_VIS4  13.0   13.0           0

Created on 2024-03-19 with reprex v2.1.0

wlandau commented 7 months ago

The discrepancy seems to be a brms issue; cf. https://github.com/paul-buerkner/brms/issues/1630.

One path forward would be to replicate frequentist emmeans ourselves, once we understand how emmeans handles interactions in both the discrete and the continuous case.

We might also consider just not copying the idea of "nuisance" parameters at all. From https://cran.r-project.org/web/packages/emmeans/vignettes/messy-data.html#nuisance, nuisance parameters are a convenience/efficiency issue, not an accuracy issue. If I can do more experiments in the frequentist case with models using interactions, and show how to replicate everything using custom code, I think it will be clear how to adapt this approach for our posterior samples.

wlandau commented 7 months ago

The section titled "Emphasis on experimental data" is useful:

To start off with, we should emphasize that the underpinnings of estimated marginal means – and much of what the emmeans package offers – relate more to experimental data than to observational data. In observational data, we sample from some population, and the goal of statistical analysis is to characterize that population in some way. In contrast, with experimental data, the experimenter controls the environment under which test runs are conducted, and in which responses are observed and recorded. Thus with experimentation, the population is an abstract entity consisting of potential outcomes of test runs made under conditions we enforce, rather than a physical entity that we observe without changing it.

We say this because the default behavior of the emmeans() function is to average groups together with equal weights; this is common in analysis of experiments, but not common in analysis of observational data;

So from what I gather:

  1. For experiments, the default emmeans(weights = "equal") gives each level of each factor equal representation.
  2. For observational studies, emmeans(weights = "cells") takes into account how frequently each factor level occurs in the rows of the data.

Excluding interactions, I can reproduce both cases above. The minor discrepancies below are small enough to be rounding errors.

Equal weights (experiments)

library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "equal"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -6.98  -6.98  -5.33e-15
#> 2 TRT_VIS1  -3.11  -3.11  -5.33e-15
#> 3 PBO_VIS2  -2.22  -2.22  -5.33e-15
#> 4 TRT_VIS2   1.84   1.84  -5.33e-15
#> 5 PBO_VIS3   3.50   3.50  -5.33e-15
#> 6 TRT_VIS3   6.23   6.23  -5.33e-15
#> 7 PBO_VIS4   8.26   8.26  -5.33e-15
#> 8 TRT_VIS4  12.3   12.3   -5.33e-15

Created on 2024-03-19 with reprex v2.1.0

"Cell" weights (observational studies)

library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "cells"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
# model_matrix[, "SEXFemale"] <- 0.5
# model_matrix[, "RACEBlack or African American"] <- 1 / 3
# model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.79  -7.79   1.78e-15
#> 2 TRT_VIS1  -3.20  -3.20   2.66e-15
#> 3 PBO_VIS2  -2.86  -2.86   1.78e-15
#> 4 TRT_VIS2   1.73   1.73   1.78e-15
#> 5 PBO_VIS3   2.89   2.89   2.66e-15
#> 6 TRT_VIS3   6.37   6.37   1.78e-15
#> 7 PBO_VIS4   7.67   7.67   8.88e-15
#> 8 TRT_VIS4  12.4   12.4    1.78e-15

Created on 2024-03-19 with reprex v2.1.0

wlandau commented 7 months ago

Even in an experimental setting, I do not think it is a good idea to give equal weights to all the levels of a factor.

For example: in an Alzheimer's study, it is common to regress on baseline disease severity. Those levels might be "mild" and "severe". Suppose the experiment has mostly "mild" patients and only a small number of "severe" ones. In the equal weighting scheme, there will be a marginal mean for "mild" and another marginal mean for "severe", and these means will be straightforwardly averaged together. But if there are far more "mild" patients than "severe" ones, this scheme risks misrepresenting the composition of the overall patient population. In particular, if the drug is only effective in the mild subpopulation, then equal weighting will artificially dampen the treatment effect.
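To put hypothetical numbers on this (purely illustrative, not from any dataset):

``` r
# Suppose 90% of patients are "mild", 10% are "severe", and the treatment
# only works in the mild stratum.
effect_mild   <- 2    # treatment-minus-placebo difference among mild patients
effect_severe <- 0    # no effect among severe patients
p_mild        <- 0.9  # observed proportion of mild patients

# Proportional (data-weighted) averaging reflects the enrolled population:
p_mild * effect_mild + (1 - p_mild) * effect_severe  # 1.8

# Equal weighting gives the small severe stratum half the weight,
# which dampens the overall treatment effect:
0.5 * effect_mild + 0.5 * effect_severe              # 1.0
```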

Another example: we often regress on the choice of investigative site in multi-center studies. The site shouldn't matter for inference; we just want to explain variability in the model. If one or two sites have the most patients, and most sites have a very small number of patients, what would make the most sense to do? It seems more intuitive to me to let the number of patients in each site determine how the sites are weighted, as opposed to averaging all sites equally.

wlandau commented 7 months ago

I think all this would be worth discussing in our group meeting tomorrow.

wlandau commented 7 months ago

https://stats.stackexchange.com/questions/510862/is-least-squares-means-lsmeans-statistical-nonsense/510923#510923 is edifying, and it helps me understand more about why estimated marginal means with equal weights are used, as well as their limitations (e.g. interactions).

wlandau commented 7 months ago

From that post:

Is this useful? Definitely yes, when the model is additive and/or the interaction effects are small. They mimic the types of analyses proposed for balanced experiments by the giants among our forebears. If, on the other hand, we do have sizeable interaction effects, then (as our forebears recommend) we simply should not even consider estimating marginal means. (In a multi-factor experiment, it may still be reasonable to construct some marginal means where we are averaging only over negligible interaction effects).

brms.mmrm models have interaction terms which may be highly significant, both in the comparisons of interest and in the blocking variables.

wlandau commented 7 months ago

For some Bayesian models, I have implemented "cell weighting" for all variables. The idea is to first generate patient-level predictions y*_ijk from the model (arm i, time j, patient k), then calculate an estimated mean m_ij = (1 / K_i) * sum_{k = 1}^{K_i} y*_ijk. I think this approach may be what we want for uninteresting covariates like WEIGHT and RACE which were not controlled by the study design. But we should give more thought to the weighting scheme for the levels of treatment group and time point. Equal weighting for treatment and time sounds nice until we remember that we may have interactions.
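A minimal sketch of that calculation, assuming one of the fitted model/data pairs from the reprexes above (with ARMCD and AVISIT as the arm and time variables) and using brms::posterior_epred() for the patient-level expected values:

``` r
# Sketch only: average posterior expected values within each arm-by-visit cell,
# so each cell is weighted by the patients actually observed in it.
epred <- brms::posterior_epred(model, newdata = data)  # draws x patient-visits

cell <- paste0(data$ARMCD, "_", data$AVISIT)
draws_cell_means <- sapply(
  sort(unique(cell)),
  function(level) rowMeans(epred[, cell == level, drop = FALSE])
)
head(draws_cell_means)  # one column of marginal-mean draws per arm-by-visit cell
```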

wlandau commented 7 months ago

Equal weighting for treatment and time sounds nice until we remember that we may have interactions.

In the simple cases of both formula <- change ~ ARMCD + AVISIT and formula <- change ~ ARMCD*AVISIT, I get the exact same perfect agreement whether I use weights = "equal" or weights = "cells". In fact, weights = "invalid" appears to work too in these simplest cases because weights is not actually used. That makes me think cell-weighting could work for all variables.

With just treatment and time, I am getting perfect agreement even when the data is severely unbalanced:

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
x <- c("PT100", "PT102", "PT103", "PT104", "PT112", "PT113", "PT114",
       "PT115", "PT117", "PT118", "PT12", "PT120", "PT123", "PT125",
       "PT131", "PT135", "PT140", "PT144", "PT145", "PT147", "PT149",
       "PT17", "PT23", "PT31", "PT34", "PT37", "PT38", "PT39", "PT40",
       "PT42", "PT45", "PT46", "PT49", "PT5", "PT53", "PT56", "PT64",
       "PT69", "PT7", "PT70", "PT74", "PT8", "PT82", "PT87", "PT88",
       "PT89", "PT90", "PT94", "PT96", "PT98")
data <- data[!(data$USUBJID %in% x), ]
print(count(data, ARMCD, AVISIT))
#>   ARMCD AVISIT  n
#> 1   PBO   VIS1 18
#> 2   PBO   VIS2 36
#> 3   PBO   VIS3 33
#> 4   PBO   VIS4 39
#> 5   TRT   VIS1 66
#> 6   TRT   VIS2 71
#> 7   TRT   VIS3 58
#> 8   TRT   VIS4 67
formula <- change ~ ARMCD + AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "equal"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
# model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
# model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
# model_matrix[, "SEXFemale"] <- 0.5
# model_matrix[, "RACEBlack or African American"] <- 1 / 3
# model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.30  -7.30          0
#> 2 TRT_VIS1  -2.64  -2.64          0
#> 3 PBO_VIS2  -3.13  -3.13          0
#> 4 TRT_VIS2   1.53   1.53          0
#> 5 PBO_VIS3   1.71   1.71          0
#> 6 TRT_VIS3   6.37   6.37          0
#> 7 PBO_VIS4   8.28   8.28          0
#> 8 TRT_VIS4  12.9   12.9           0
```

Created on 2024-03-19 with [reprex v2.1.0](https://reprex.tidyverse.org)

A similar story holds for lots of uninteresting covariates without interactions (except there is rounding error). This one uses weights = "equal":

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
x <- c("PT100", "PT102", "PT103", "PT104", "PT112", "PT113", "PT114",
       "PT115", "PT117", "PT118", "PT12", "PT120", "PT123", "PT125",
       "PT131", "PT135", "PT140", "PT144", "PT145", "PT147", "PT149",
       "PT17", "PT23", "PT31", "PT34", "PT37", "PT38", "PT39", "PT40",
       "PT42", "PT45", "PT46", "PT49", "PT5", "PT53", "PT56", "PT64",
       "PT69", "PT7", "PT70", "PT74", "PT8", "PT82", "PT87", "PT88",
       "PT89", "PT90", "PT94", "PT96", "PT98")
data <- data[!(data$USUBJID %in% x), ]
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "equal",
  nuisance = "USUBJID"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -8.13  -8.13  -3.55e-15
#> 2 TRT_VIS1  -3.00  -3.00  -3.55e-15
#> 3 PBO_VIS2  -2.57  -2.57  -3.55e-15
#> 4 TRT_VIS2   1.98   1.98  -3.55e-15
#> 5 PBO_VIS3   3.19   3.19  -3.55e-15
#> 6 TRT_VIS3   6.37   6.37  -3.55e-15
#> 7 PBO_VIS4   8.86   8.86  -3.55e-15
#> 8 TRT_VIS4  12.4   12.4   -3.55e-15
```

Created on 2024-03-19 with [reprex v2.1.0](https://reprex.tidyverse.org)

By commenting out the custom averaging for categorical levels, I can reproduce the same outcome as emmeans(weights = "cells"):

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + SEX*RACE + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "cells",
  nuisance = "USUBJID"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
# model_matrix[, "SEXFemale"] <- 0.5
# model_matrix[, "RACEBlack or African American"] <- 1 / 3
# model_matrix[, "RACEWhite"] <- 1 / 3
# model_matrix[, "SEXFemale:RACEBlack or African American"] <- 1 / 6
# model_matrix[, "SEXFemale:RACEWhite"] <- 1 / 6
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.79  -7.79   0
#> 2 TRT_VIS1  -3.20  -3.20   0
#> 3 PBO_VIS2  -2.86  -2.86   0
#> 4 TRT_VIS2   1.73   1.73   0
#> 5 PBO_VIS3   2.89   2.89   1.33e-15
#> 6 TRT_VIS3   6.37   6.37   7.11e-15
#> 7 PBO_VIS4   7.67   7.67   8.88e-16
#> 8 TRT_VIS4  12.4   12.4    0
```

Created on 2024-03-19 with [reprex v2.1.0](https://reprex.tidyverse.org)

wlandau commented 7 months ago

More follow-up: I can reproduce emmeans, even in a severely unbalanced design, when the interesting interactions do not overlap with the uninteresting ones. Below is a reprex with change ~ FEV1_BL + WEIGHT + SEX*RACE + ARMCD*AVISIT, which has both kinds of interactions:

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
# x <- c("PT100", "PT102", "PT103", "PT104", "PT112", "PT113", "PT114",
#        "PT115", "PT117", "PT118", "PT12", "PT120", "PT123", "PT125",
#        "PT131", "PT135", "PT140", "PT144", "PT145", "PT147", "PT149",
#        "PT17", "PT23", "PT31", "PT34", "PT37", "PT38", "PT39", "PT40",
#        "PT42", "PT45", "PT46", "PT49", "PT5", "PT53", "PT56", "PT64",
#        "PT69", "PT7", "PT70", "PT74", "PT8", "PT82", "PT87", "PT88",
#        "PT89", "PT90", "PT94", "PT96", "PT98")
# data <- data[!(data$USUBJID %in% x), ]
formula <- change ~ FEV1_BL + WEIGHT + SEX*RACE + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "equal",
  nuisance = "USUBJID"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3
model_matrix[, "SEXFemale:RACEBlack or African American"] <- 1 / 6
model_matrix[, "SEXFemale:RACEWhite"] <- 1 / 6
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -6.98  -6.98  -7.11e-15
#> 2 TRT_VIS1  -3.11  -3.11  -7.11e-15
#> 3 PBO_VIS2  -2.20  -2.20  -7.11e-15
#> 4 TRT_VIS2   1.84   1.84  -7.11e-15
#> 5 PBO_VIS3   3.51   3.51  -7.11e-15
#> 6 TRT_VIS3   6.21   6.21  -7.11e-15
#> 7 PBO_VIS4   8.27   8.27  -7.11e-15
#> 8 TRT_VIS4  12.3   12.3   -7.11e-15
```

Created on 2024-03-19 with [reprex v2.1.0](https://reprex.tidyverse.org)

On the other hand, when interactions are mixed, as with SEX*ARMCD*AVISIT, emmeans produces an ominous NOTE:

library(emmeans)
library(tidyverse)
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ SEX*ARMCD*AVISIT
model <- lm(formula, data = data)
out <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "equal",
  nuisance = "USUBJID"
)
#> NOTE: Results may be misleading due to involvement in interactions

Created on 2024-03-19 with reprex v2.1.0

We are in this exact situation with subgroup analysis. So moving to a custom fitted-value-based approach might allow us to do analyses that would be invalid for the current emmeans-integrated version of brms.mmrm.

wlandau commented 7 months ago

Above, I managed to reproduce emmeans(weights = "cells") and emmeans(weights = "equal"). From this comment by the author of emmeans, it looks like these are the two extremes:

The extremes are "equal" (the default), which is best for estimating the effects of one factor while holding the rest fixed in an experimental mode; and "cells", which in a linear model, basically reproduces the raw marginal means of the data but is least useful for quantifying effects and most vulnerable to confounding with effects of other factors.

On the one hand, this makes "cells" seem inappropriate for a clinical trial. On the other, not all factors are controlled in the experiment, so "equal" is not a great fit either.

brms.mmrm currently uses "proportional", which according to ?emmeans is:

Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.

as opposed to "cells":

Weight according to the frequencies of the cells being averaged.

I do not understand the explanation of "proportional". At this point, I think it's worth spending time to figure out what "proportional" and "outer" really mean.

wlandau commented 7 months ago

...and as soon as I posted that comment, I think I figured out what "proportional" means. Here is a reprex:

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE + SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "proportional"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])
model_matrix[, "RACEBlack or African American"] <- mean(model_matrix[, "RACEBlack or African American"])
model_matrix[, "RACEWhite"] <- mean(model_matrix[, "RACEWhite"])
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.32  -7.32  -4.44e-15
#> 2 TRT_VIS1  -3.45  -3.45  -4.44e-15
#> 3 PBO_VIS2  -2.56  -2.56  -4.44e-15
#> 4 TRT_VIS2   1.50   1.50  -4.44e-15
#> 5 PBO_VIS3   3.16   3.16  -4.44e-15
#> 6 TRT_VIS3   5.89   5.89  -3.55e-15
#> 7 PBO_VIS4   7.92   7.92  -4.44e-15
#> 8 TRT_VIS4  12.0   12.0   -3.55e-15
```

Created on 2024-03-20 with [reprex v2.1.0](https://reprex.tidyverse.org)

The key here is that whereas weights = "equal" does this:

model_matrix[, "SEXFemale"] <- 0.5
model_matrix[, "RACEBlack or African American"] <- 1 / 3
model_matrix[, "RACEWhite"] <- 1 / 3

weights = "proportional" does this:

model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])
model_matrix[, "RACEBlack or African American"] <- mean(model_matrix[, "RACEBlack or African American"])
model_matrix[, "RACEWhite"] <- mean(model_matrix[, "RACEWhite"])

Furthermore, I can start to see how "cells" weights would confound the interesting marginal means with uninteresting covariates, leading to the Simpson's paradox situations that Russ Lenth describes. I am starting to come around to "proportional" because I know what it does now, and I like how it corrects the balancing problem of weights = "equal". (Plus, brms.mmrm currently sets weights = "proportional" in emmeans().)

wlandau commented 7 months ago

Same reprex with an interaction between uninteresting variables:

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + WEIGHT + RACE*SEX + ARMCD*AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "proportional"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
model_matrix <- model.matrix(formula, data = data)
model_matrix[, "FEV1_BL"] <- mean(model_matrix[, "FEV1_BL"])
model_matrix[, "WEIGHT"] <- mean(model_matrix[, "WEIGHT"])
model_matrix[, "SEXFemale"] <- mean(model_matrix[, "SEXFemale"])
model_matrix[, "RACEBlack or African American"] <- mean(model_matrix[, "RACEBlack or African American"])
model_matrix[, "RACEBlack or African American:SEXFemale"] <- mean(model_matrix[, "RACEBlack or African American:SEXFemale"])
model_matrix[, "RACEWhite"] <- mean(model_matrix[, "RACEWhite"])
model_matrix[, "RACEWhite:SEXFemale"] <- mean(model_matrix[, "RACEWhite:SEXFemale"])
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))
joined <- dplyr::bind_cols(data, model_matrix)
grid <- joined %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")
mapping <- grid %>%
  select(starts_with("x_")) %>%
  as.matrix()
colnames(mapping) <- gsub("^x_", "", colnames(mapping))
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.32  -7.32  -6.22e-15
#> 2 TRT_VIS1  -3.45  -3.45  -6.22e-15
#> 3 PBO_VIS2  -2.55  -2.55  -6.22e-15
#> 4 TRT_VIS2   1.50   1.50  -6.22e-15
#> 5 PBO_VIS3   3.17   3.17  -6.22e-15
#> 6 TRT_VIS3   5.86   5.86  -6.22e-15
#> 7 PBO_VIS4   7.93   7.93  -6.22e-15
#> 8 TRT_VIS4  12.0   12.0   -5.33e-15
```

Created on 2024-03-20 with [reprex v2.1.0](https://reprex.tidyverse.org)

My only concern is when we have subgroup interactions with the treatment arm and visit ("NOTE: Results may be misleading due to involvement in interactions", as mentioned previously). We may need to require the subgroup to be part of the reference grid when there is a subgroup/treatment and/or subgroup/treatment/time interaction in the model. I haven't wrapped my head around that part yet.

wlandau commented 7 months ago

I will need to carefully read http://cran.nexr.com/web/packages/emmeans/vignettes/interactions.html, but at a glance:

emmeans() valiantly tries to warn you that it may not be a good idea to average over factors that interact with the factor of interest. It isn’t always a bad idea to do this, but sometimes it definitely is.

If that's the issue, then we just need to identify the factors of interest and avoid averaging over other factors that interact with them. We'll have to work out what those are in the case of subgroup and non-subgroup.

wlandau commented 7 months ago

Great points came out of today's meeting:

andrew-bean commented 7 months ago

Thanks @wlandau, this investigation is really helpful!

Regarding the 3rd bullet point about dropout... I realized this has other interesting implications for what emmeans is doing for main effects of "uninteresting" covariates, as you've called them. For example, you found that emmeans is doing something like:

mean(model_matrix[, "FEV1_BL"])

This is not actually the average baseline FEV1; it is an average of that covariate over all the patient-visits appearing in the design matrix, making it a weighted average in which patients who drop out earlier carry less weight. I doubt this is standard practice.

IMO when computing the reference grid, we should first complete the grid of patient-visits to include the missing ones. Otherwise the "margin" at which we report the mean response is an odd weighted average of the study population.
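A rough sketch of that completion step on the FEV data with tidyr::complete() (just to illustrate the weighting issue, not a proposal for the package internals):

``` r
library(dplyr)
library(tidyr)

data(fev_data, package = "mmrm")
observed <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))

# Rebuild the full patient-by-visit grid so early dropouts keep full weight,
# carrying each patient's baseline covariates along.
completed <- observed %>%
  complete(nesting(USUBJID, ARMCD, SEX, RACE, FEV1_BL), AVISIT)

# The covariate margins differ because the observed rows down-weight dropouts:
mean(observed$FEV1_BL)
mean(completed$FEV1_BL)
```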

yonicd commented 7 months ago

@wlandau here is that reference paper I was referring to in our meeting. It is more focused on transformations of the marginal means when you have non-linear link functions. Thankfully, in our case we don't go into those corners.

I was thinking about the crux of our conversation yesterday: that we need a more careful manual approach rather than depending on emmeans out of the box. I actually think that would be a mistake; we should be standing on the shoulders of the years of experience behind emmeans instead of redoing that work. Understanding how its functions work and what the different levers (the formal arguments) do is obviously needed, and you have done a great job digging into that.

One thing that may be a good middle ground is to use the emmGrid object inside emmeans to define the specific grids that we want to template and use, and then allow emmeans to deploy those grids. We could also take advantage of the documentation on extending emmeans to our "model" type.

emtrends may also be a good solution for our interaction issue. It looks like the developers have already been through this mess :)

wlandau commented 7 months ago

#40 was the biggest reason I was thinking about a fully custom approach, and a middle ground could serve the same purpose.

To recap #40: let's say our model is CHG ~ BASE + AVISIT * TRT01P, but we want Sebastian's prior on the differences in CHG between consecutive AVISITs. If we were to analytically transform the prior to align with the formula, then we would get a multivariate normal with correlated parameters. That's a good thing because the prior can express our initial belief about the time course of the disease. However, as I believe @andrew-bean observed earlier, for a prior with correlated fixed effects, we would need to inject custom Stan code through the family argument of brms::brm(). I am not a fan of that approach because it seems like a hack/backdoor, and I think it would decrease our ability to trust and properly maintain the implementation.

Instead, I was hoping we could:

  1. Get the model matrix of the original formula.
  2. Linearly transform the model matrix to match Sebastian's prior on the CHG differences.
  3. Fit the model with the transformed model matrix to get posterior samples of CHG differences.
  4. Invert the linear transformation from (2) to convert these samples into samples of the original fixed effects.
  5. Use emmeans or its underlying method to convert the parameter samples from (4) into samples of marginal means.

From my understanding of conditional means priors, the math collapses down to these convenient linear transformations if our link function is linear. Admittedly, we would be fitting the model on change differences instead of the parameterization that the user requested in the formula. But if the transformation is linear and full-rank, then those parameter spaces should be exactly the same in a linear algebraic sense, so the overall result of the model should be the same.
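To make (2) and (4) concrete, here is a generic sketch with placeholder matrices (not actual brms.mmrm code): if theta = D %*% beta for an invertible D (e.g. successive differences of visit effects), then fitting with the transformed model matrix X %*% solve(D) yields draws of theta, and multiplying by t(solve(D)) maps those draws back to draws of the original beta.

``` r
set.seed(1)
p <- 4
X <- matrix(rnorm(100 * p), nrow = 100)       # stand-in for the model matrix
beta <- c(1, 2, 3, 4)                         # stand-in fixed effects
D <- diag(p)
D[cbind(2:p, 1:(p - 1))] <- -1                # theta_k = beta_k - beta_(k - 1)

X_new <- X %*% solve(D)                       # step (2): transformed model matrix
theta <- D %*% beta
max(abs(X %*% beta - X_new %*% theta))        # ~0: identical linear predictor

theta_draws <- matrix(rnorm(1000 * p), 1000)  # placeholder posterior draws of theta
beta_draws  <- theta_draws %*% t(solve(D))    # step (4): draws of the original beta
```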

I doubt emmeans alone would handle (5) because the model object from (3) does not have TRT01P or AVISIT as explicit variables. As far as I am aware, emmeans requires the model object itself to directly incorporate the factors that define the reference grid. One workaround could be to define a custom lm(), then as you suggested @yonicd, use emmGrid() to get the reference grid. I think that would connect all the dots.

As far as completely custom vs the middle ground, the emmeans "supersession plan" makes me nervous:

The developer of emmeans continues to maintain and occasionally add new features. However, none of us is immortal; and neither is software. I have thought of trying to find a co-maintainer who could carry the ball once I am gone or lose interest, but the flip side of that is that the codebase is not getting less messy as time goes on -- why impose that on someone else? So my thought now is that if at some point, enough active R developers want the capabilities of emmeans but I am no longer in the picture, they should feel free to supersede it with some other package that does it better. All of the code is publicly available on GitHub, so just take what is useful and replace what is not.

emmeans has so many reverse dependencies on CRAN, and internally, it is used in topline analysis models in real clinical trials. I hope this will motivate a new maintainer or an equivalent package when the time comes, but I am not so sure. Maybe for us, we go for the middle ground for now, then switch to something else or a custom method if emmeans is ever abandoned. In either case, I think we should understand what it does every step of the way.

wlandau commented 6 months ago

Also: I will have to check, but emmeans might not be able to do the carefully fine-tuned kind of averaging we talked about:

  1. Proportional averaging for some uninteresting factors and "cells" averaging for others (in the case of interactions with interesting factors).
  2. Proportional averaging within subgroup levels.

As far as I know, weights = "proportional" applies to everything that is not in specs. If we have to supply custom numeric weights to emmeans anyway, I think we might as well go fully custom.
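As a sketch of what item (2) could look like outside emmeans (frequentist, reusing the reprex pattern above, with SEX standing in for a subgroup and RACE as the nuisance factor averaged proportionally within each SEX level; this has not been checked against emmeans, which is the point):

``` r
library(dplyr)

data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ RACE + SEX * ARMCD * AVISIT  # SEX plays the role of a subgroup
model <- lm(formula, data = data)

model_matrix <- model.matrix(formula, data = data)
colnames(model_matrix) <- paste0("x_", colnames(model_matrix))

# Average the RACE columns proportionally *within* each SEX level, then average
# everything within SEX-by-ARMCD-by-AVISIT cells to form the reference grid.
grid <- bind_cols(data, as.data.frame(model_matrix)) %>%
  group_by(SEX) %>%
  mutate(across(starts_with("x_RACE"), mean)) %>%
  group_by(SEX, ARMCD, AVISIT) %>%
  summarize(across(starts_with("x_"), mean), .groups = "drop")

mapping <- as.matrix(select(grid, starts_with("x_")))
rownames(mapping) <- paste(grid$SEX, grid$ARMCD, grid$AVISIT, sep = "_")
marginals <- mapping %*% coefficients(model)  # subgroup-specific marginal means
```

The same pattern would extend to item (1) by setting some nuisance columns to cell-specific means instead of grand means.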

yonicd commented 6 months ago

It may be the case that "proportional" is just shorthand for an emmGrid setup. We can probably construct complex grid templates.

wlandau commented 6 months ago

Yes, weights = "proportional" looks like shorthand. From the documentation of the weights argument from ?emmeans::emmeans:

Also, if object is not already a reference grid, weights (if it is character) is passed to ref_grid as wt.nuis in case nuisance factors are specified.

And from ?emmeans::ref_grid:

If nuisance is a vector of predictor names, those predictors are omitted from the reference grid. Instead, the result will be as if we had averaged over the levels of those factors, with either equal or proportional weights as specified in wt.nuis.

I tried to dig further into the actual computation of marginal means from the fixed effect estimates. Some aspects I think I understand, but others are buried too deep inside code that was not meant to be readable. (Details in the collapsible block below.)

It seems `ref_grid()` calls `emm_basis()`, which then constructs a model matrix that maps fixed effects to marginal means. (Sort of — there are a couple of details that I don't understand just from browsing [`emm_basis.lm()`](https://github.com/rvlenth/emmeans/blob/ada2af94c2c6b03df38679473f11968979d99157/R/helpers.R#L39) with the FEV data.) Then `summary.emmGrid()` calls [`.est.se.df()`](https://github.com/rvlenth/emmeans/blob/ada2af94c2c6b03df38679473f11968979d99157/R/summary.R#L871), which actually computes `X * beta` to calculate the marginal means. There is also a [.basis.nuis](https://github.com/rvlenth/emmeans/blob/ada2af94c2c6b03df38679473f11968979d99157/R/ref-grid.R#L1093) which actually computes the weights, but somehow it relies on mysterious rows appended to the end of the model matrix, and that apparently is the mechanism behind non-"equal" averaging methods such as "proportional".

In any case, ?emm_basis says:

A user or package developer may add emmeans support for a model class by writing recover_data and emm_basis methods for that class.

So it seems we are encouraged to write our own "basis" algorithm, which includes the actual matrix that maps fixed effects to marginal means. This is the crux of the whole process, and in his vignettes and posts, Russ Lenth himself encourages users to think carefully about the details of EMM calculation. All this makes me think a custom approach would not be too reckless.

There would be value in writing an actual emm_basis() S3 method if we wanted users to call emmeans() on specially classed brms.mmrm output, but that is not part of our interface, so I don't think we need it.
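
For the record, if we ever changed our minds, the extension hooks described in ?emmeans::emm_basis would look roughly like the skeleton below. The class name brms_mmrm is hypothetical, and the return elements are the ones listed in the emmeans extension vignette:

```r
# Skeleton of the two S3 methods emmeans asks extension packages to provide.
recover_data.brms_mmrm <- function(object, ...) {
  # Return the dataset emmeans should use to build the reference grid.
}
emm_basis.brms_mmrm <- function(object, trms, xlev, grid, ...) {
  # Return list(
  #   X = <linear functions mapping coefficients to the grid>,
  #   bhat = <estimated coefficients>,
  #   nbasis = <basis for non-estimable functions>,
  #   V = <covariance matrix of bhat>,
  #   dffun = <function(k, dfargs) returning degrees of freedom>,
  #   dfargs = <list of arguments for dffun>
  # )
}
```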

wlandau commented 6 months ago

There is also a .basis.nuis which actually computes the weights, but somehow it relies on the mysterious rows appended to the end of the model matrix, and that apparently is the mechanism of non-"equal" averaging methods such as "proportional".

Digging even more, these extra appended rows come from .setup.nuis(). They are only used to translate the averaging weights to the model matrix in .basis.nuis(), and then .basis.nuis() removes those rows right away.

The following details show just how weights = "proportional" reduces to a simple calculation buried under layers upon layers of messy code. They also directly confirm that emmeans cannot proportionally average within levels of a subgroup, and it cannot use proportional averaging for only a subset of nuisance variables. A custom approach would allow us to do both, as well as tailor the handling of interactions to our use case.

`.setup.nuis()` generates an object called `grid`. It contains the actual reference grid we want, plus one extra row at the bottom for each level of each nuisance factor. For FEV:

```r
> grid
    FEV1_BL    WEIGHT                      RACE    SEX ARMCD AVISIT
1  40.23596 0.5175602                     Asian   Male   PBO   VIS1
2  40.23596 0.5175602                     Asian   Male   TRT   VIS1
3  40.23596 0.5175602                     Asian   Male   PBO   VIS2
4  40.23596 0.5175602                     Asian   Male   TRT   VIS2
5  40.23596 0.5175602                     Asian   Male   PBO   VIS3
6  40.23596 0.5175602                     Asian   Male   TRT   VIS3
7  40.23596 0.5175602                     Asian   Male   PBO   VIS4
8  40.23596 0.5175602                     Asian   Male   TRT   VIS4
9  40.23596 0.5175602                     Asian   Male   PBO   VIS1
10 40.23596 0.5175602 Black or African American  Male   PBO   VIS1
11 40.23596 0.5175602                     White   Male   PBO   VIS1
12 40.23596 0.5175602                     Asian   Male   PBO   VIS1
13 40.23596 0.5175602                     Asian Female   PBO   VIS1
```

Then, `emm_basis.lm()` constructs a model matrix using the `grid` object above as the data:

```r
> X
   (Intercept)  FEV1_BL    WEIGHT RACEBlack or African American RACEWhite SEXFemale ARMCDTRT AVISITVIS2
1            1 40.23596 0.5175602                             0         0         0        0          0
2            1 40.23596 0.5175602                             0         0         0        1          0
3            1 40.23596 0.5175602                             0         0         0        0          1
4            1 40.23596 0.5175602                             0         0         0        1          1
5            1 40.23596 0.5175602                             0         0         0        0          0
6            1 40.23596 0.5175602                             0         0         0        1          0
7            1 40.23596 0.5175602                             0         0         0        0          0
8            1 40.23596 0.5175602                             0         0         0        1          0
9            1 40.23596 0.5175602                             0         0         0        0          0
10           1 40.23596 0.5175602                             1         0         0        0          0
11           1 40.23596 0.5175602                             0         1         0        0          0
12           1 40.23596 0.5175602                             0         0         0        0          0
13           1 40.23596 0.5175602                             0         0         1        0          0
   AVISITVIS3 AVISITVIS4 ARMCDTRT:AVISITVIS2 ARMCDTRT:AVISITVIS3 ARMCDTRT:AVISITVIS4
1           0          0                   0                   0                   0
2           0          0                   0                   0                   0
3           0          0                   0                   0                   0
4           0          0                   1                   0                   0
5           1          0                   0                   0                   0
6           1          0                   0                   1                   0
7           0          1                   0                   0                   0
8           0          1                   0                   0                   1
9           0          0                   0                   0                   0
10          0          0                   0                   0                   0
11          0          0                   0                   0                   0
12          0          0                   0                   0                   0
13          0          0                   0                   0                   0
```

Then, `.basis.nuis()` calculates the actual weights for nuisance variables. For `weights = "equal"`, this is `1/length(levels(x))` for each nuisance factor `x`. Otherwise, the numeric weights are the frequency that each level of each nuisance variable appears in the data. For example:

```r
> str(wts)
List of 2
 $ RACE: Named num [1:3] 0.372 0.365 0.263
  ..- attr(*, "names")= chr [1:3] "Asian" "Black or African American" "White"
 $ SEX : Named num [1:2] 0.464 0.536
  ..- attr(*, "names")= chr [1:2] "Male" "Female"
```

Then, `.basis.nuis()` uses `grid` to assign the weights to the respective columns of X, then just takes the rows of `X` that actually belong to the reference grid we want (rows 1-8).

```r
> basis$X
  (Intercept)  FEV1_BL    WEIGHT RACEBlack or African American RACEWhite SEXFemale ARMCDTRT AVISITVIS2
1           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        0          0
2           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        1          0
3           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        0          1
4           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        1          1
5           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        0          0
6           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        1          0
7           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        0          0
8           1 40.23596 0.5175602                     0.3649907 0.2625698 0.5363128        1          0
  AVISITVIS3 AVISITVIS4 ARMCDTRT:AVISITVIS2 ARMCDTRT:AVISITVIS3 ARMCDTRT:AVISITVIS4
1          0          0                   0                   0                   0
2          0          0                   0                   0                   0
3          0          0                   0                   0                   0
4          0          0                   1                   0                   0
5          1          0                   0                   0                   0
6          1          0                   0                   1                   0
7          0          1                   0                   0                   0
8          0          1                   0                   0                   1
```

This is a really awkward approach. It is difficult to follow, `grid` is trying to do too many things at once, and it fills in values in the data frame that should be ignored. The main grid and each of the variable-specific grids should all be different objects.
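
As a sanity check on the wts object above, the "proportional" weights should just be the observed level frequencies, e.g. (assuming `data` is the FEV analysis dataset from the earlier reprex):

```r
# Proportional weights are the relative frequencies of each factor level.
proportions(table(data$RACE))
proportions(table(data$SEX))
```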
wlandau commented 6 months ago

Regarding:

emtrends may also be a good solution for our interaction issue. it looks like the developers have already been through this mess :)

Does this fit our use case? From the description of the function:

The emtrends function is useful when a fitted model involves a numerical predictor x interacting with another predictor a (typically a factor). Such models specify that x has a different trend depending on a; thus, it may be of interest to estimate and compare those trends.

and from the documentation of the var argument:

In order for this to be useful, var should be a numeric predictor that interacts with at least one factor in specs.

By contrast, BASELINE:AVISIT is a "nuisance" interaction for brms.mmrm. We only care about AVISIT:TREATMENT and AVISIT:TREATMENT:SUBGROUP.

wlandau commented 6 months ago

Looking back at:

One thing that may be a good middle ground is to use the emmGrid object inside emmeans to define specific grids that we want to template and use and then allow emmeans to deploy those grids.

@yonicd, would you elaborate on what would go into the template and what would be left to emmeans to do?

wlandau commented 6 months ago

and it cannot use proportional averaging for only a subset of nuisance variables.

I was hoping a custom averaging method might allow us to average over interaction terms, but now I think I was wrong about that. From the emmeans interactions vignette:

it may not be a good idea to average over factors that interact with the factor of interest. It isn’t always a bad idea to do this, but sometimes it definitely is.

and in the summary:

One must resist pressures and inclinations to try to produce simple bottom-line conclusions.

The author makes these statements in the general sense: regardless of the averaging method used ("proportional" or otherwise) and regardless of whether emmeans is used. The whole message of the "Interacting factors" section is that we should understand the modeled interactions (e.g. with interaction plots) before marginalizing them out post-hoc.
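
For what it's worth, emmeans has a built-in helper for exactly that kind of inspection; a quick sketch with the FEV variable names, untested here:

```r
# Interaction plot of the modeled group-by-time means before any averaging.
emmeans::emmip(model, ARMCD ~ AVISIT)
```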

In brms.mmrm, there are several potential sources of trouble:

  1. The baseline-time interaction term in the formula (baseline_time argument of brm_formula()).
  2. The baseline-subgroup interaction.
  3. The baseline-subgroup-time interaction.
  4. Marginalizing out the subgroup even when subgroup was modeled (use_subgroup = FALSE in brm_marginal_draws()).
  5. Applying brm_marginal_draws() to a brms model not fit with brms.mmrm.

I think we should completely disable (4) and (5). (For (4), the model comparisons in https://openpharma.github.io/brms.mmrm/articles/subgroup.html#visualization can simply condition on a single subgroup level rather than marginalize over all the levels.)

I think we should still allow (1)-(3) for modeling, but prevent brm_marginal_draws() from post-processing models with these interactions. This allows people to still model the extra interactions and compute their own summaries.

All this makes the marginal mean calculation in brm_marginal_draws() much simpler. I propose:

A. Complete the missingness grid as @andrew-bean recommended in https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475.
B. Proportionally average over all uninteresting factors.
C. By default, predict at the mean of each numeric variable.
D. As @yonicd suggested in the last meeting, allow users to pick custom reference levels for (C).

I will start working on separate pull requests for the guardrails (1-5) and the marginal means (A-D).

wlandau commented 6 months ago

@andrew-bean, would you help me understand the rationale for averaging covariates within subgroup levels? This approach sounded nice during the last meeting, but now I worry that it might cause the exact same confounding/Simpson's paradox as weights = "cells".
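
To make the worry concrete, here is a toy example with made-up numbers: within every severity level the treatment effect is +2, but because the arms have different severity compositions, arm-specific ("cells"-style) weighting flips the sign of the comparison.

```r
library(dplyr)
library(tibble)

toy <- tibble(
  arm      = rep(c("PBO", "TRT"), each = 4),
  severity = c("mild", "mild", "mild", "severe",      # PBO is mostly mild
               "mild", "severe", "severe", "severe"), # TRT is mostly severe
  response = c(10, 10, 10, 2,
               12, 4, 4, 4)
)

# Within each severity level, TRT is 2 units better than PBO...
toy %>%
  group_by(severity, arm) %>%
  summarize(mean = mean(response), .groups = "drop")

# ...but weighting each arm by its own observed cells reverses the ranking:
toy %>%
  group_by(arm) %>%
  summarize(mean = mean(response))
```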

wlandau commented 6 months ago

I was thinking about the crux of our conversation yesterday that we need a more careful manual approach than depending on emmeans out of the box. I actually think that would be a mistake, we should be standing on the shoulders of the years of experience emmeans has instead of redoing their work.

emmeans has indeed spent years refining subtle edge cases and unknown unknowns, but with the guardrails in https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2021126575, I think brms.mmrm will only encounter the simplest cases. Proportional averaging over non-interacting factors is easy, and we can code it more cleanly and transparently than emmeans.

wlandau commented 6 months ago

Proportional averaging over non-interacting factors is easy

And from reading "Interacting Factors", emmeans is not going to help us marginalize out interactions anyway. It would be unwise to recommend it in the general case, with or without emmeans.

wlandau commented 6 months ago

Hmm... interacting continuous covariates are different from interacting factors. From the vignette:

When a covariate and a factor interact, we typically don’t want EMMs themselves, but rather estimates of slopes of the covariate trend for each level of the factor.

For those interactions with baseline, it seems emtrends() may be more relevant than I originally thought.

wlandau commented 6 months ago

I just submitted #83 to get us started. This one just has the most straightforward guardrails: avoid marginalizing over the subgroup, and make sure the model we're summarizing was actually fit with brms.mmrm.

wlandau commented 6 months ago

@andrew-bean, I forgot to mention: I know my earlier reprexes didn't do this, but brm_data() already fills in rows for implicitly missing responses. So if our marginal mean calculations have at least some degree of customization, we can cover https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475 easily.

wlandau commented 6 months ago

Good news: I figured out how to reproduce emmeans in the presence of baseline-time interactions! It's really easy and intuitive: in the model matrix, let the FEV1_BL interaction terms equal mean(FEV1_BL) only for their nonzero elements. model.matrix() takes care of this automatically. A reprex is in the "Details" block:

``` r
library(emmeans)
library(tidyverse)

# Use the FEV1 dataset from {mmrm}
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate(change = FEV1 - FEV1_BL) %>%
  filter(!is.na(change))
formula <- change ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD * AVISIT
model <- lm(formula, data = data)

# {emmeans} marginals
marginals_emmeans <- emmeans::emmeans(
  object = model,
  specs = ~ARMCD:AVISIT,
  weights = "proportional"
) %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(marginal = paste0(ARMCD, "_", AVISIT)) %>%
  select(marginal, emmean)

# Create a mapping from model parameters to marginal means,
# attempting to pre-average over nuisance variables:
grid <- data %>%
  dplyr::distinct(AVISIT, ARMCD)
mapping <- grid %>%
  mutate(change = 0, FEV1_BL = mean(data$FEV1_BL)) %>%
  model.matrix(object = formula) %>%
  as.matrix()
rownames(mapping) <- paste0(grid$ARMCD, "_", grid$AVISIT)
print(mapping)
#>          (Intercept)  FEV1_BL ARMCDTRT AVISITVIS2 AVISITVIS3 AVISITVIS4
#> TRT_VIS2           1 40.23596        1          1          0          0
#> TRT_VIS4           1 40.23596        1          0          0          1
#> PBO_VIS2           1 40.23596        0          1          0          0
#> PBO_VIS3           1 40.23596        0          0          1          0
#> PBO_VIS4           1 40.23596        0          0          0          1
#> TRT_VIS1           1 40.23596        1          0          0          0
#> PBO_VIS1           1 40.23596        0          0          0          0
#> TRT_VIS3           1 40.23596        1          0          1          0
#>          FEV1_BL:AVISITVIS2 FEV1_BL:AVISITVIS3 FEV1_BL:AVISITVIS4
#> TRT_VIS2           40.23596            0.00000            0.00000
#> TRT_VIS4            0.00000            0.00000           40.23596
#> PBO_VIS2           40.23596            0.00000            0.00000
#> PBO_VIS3            0.00000           40.23596            0.00000
#> PBO_VIS4            0.00000            0.00000           40.23596
#> TRT_VIS1            0.00000            0.00000            0.00000
#> PBO_VIS1            0.00000            0.00000            0.00000
#> TRT_VIS3            0.00000           40.23596            0.00000
#>          AVISITVIS2:ARMCDTRT AVISITVIS3:ARMCDTRT AVISITVIS4:ARMCDTRT
#> TRT_VIS2                   1                   0                   0
#> TRT_VIS4                   0                   0                   1
#> PBO_VIS2                   0                   0                   0
#> PBO_VIS3                   0                   0                   0
#> PBO_VIS4                   0                   0                   0
#> TRT_VIS1                   0                   0                   0
#> PBO_VIS1                   0                   0                   0
#> TRT_VIS3                   0                   1                   0
#> attr(,"assign")
#>  [1] 0 1 2 3 3 3 4 4 4 5 5 5
#> attr(,"contrasts")
#> attr(,"contrasts")$AVISIT
#> [1] "contr.treatment"
#> 
#> attr(,"contrasts")$ARMCD
#> [1] "contr.treatment"

# Use the mapping to transform model coefficients to marginal means:
custom <- mapping %*% coefficients(model)
marginals_custom <- tibble::tibble(
  marginal = rownames(custom),
  custom = as.numeric(custom)
)

# Comparison
comparison <- dplyr::left_join(
  marginals_emmeans,
  marginals_custom,
  by = "marginal"
) %>%
  mutate(difference = custom - emmean)
print(comparison)
#> # A tibble: 8 × 4
#>   marginal emmean custom difference
#>   <chr>     <dbl>  <dbl>      <dbl>
#> 1 PBO_VIS1  -7.80  -7.80   0       
#> 2 TRT_VIS1  -3.22  -3.22   0       
#> 3 PBO_VIS2  -2.84  -2.84   0       
#> 4 TRT_VIS2   1.71   1.71   0       
#> 5 PBO_VIS3   2.88   2.88   0       
#> 6 TRT_VIS3   6.45   6.45  -8.88e-16
#> 7 PBO_VIS4   7.64   7.64   0       
#> 8 TRT_VIS4  12.5   12.5    0       
```

Created on 2024-03-27 with [reprex v2.1.0](https://reprex.tidyverse.org)

Given how trivial this is, I don't think we need emtrends(). And we can safely keep all those interactions that involve baseline.

I did find something strange though: emmeans removes all the rows with missing values before it does its calculation. Even when I "fill" these rows beforehand, emmeans removes them. This leads to a different mean(FEV1_BL), which gives a different result. To align with emmeans, I had to manually remove rows with missing responses. I think a custom approach is a great opportunity to change this, c.f. https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475
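
The discrepancy is just about which rows enter the covariate average. With the FEV data from the reprex above, something like this shows the two versions of mean(FEV1_BL) that can disagree:

```r
# Covariate mean over all scheduled visit rows vs. only the rows with an
# observed response (the latter is what emmeans effectively uses).
mean(fev_data$FEV1_BL)
mean(fev_data$FEV1_BL[!is.na(fev_data$FEV1)])
```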

wlandau commented 6 months ago

I did find something strange though: emmeans removes all the rows with missing values before it does its calculation. Even when I "fill" these rows beforehand, emmeans removes them.

If we wanted to stick with emmeans, we could avoid this problem by temporarily setting the missing responses in the data to something non-missing.

Given how simple this all turned out to be, I would be open to separating the issue of replicating emmeans from the issue of taking manual control over the reference grid. The former is not urgent, and it will be easy to do the latter using emmeans.
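
A rough sketch of that workaround, assuming `fev_data` and the lm-based `model` from the reprex above (the placeholder response only influences the reference grid, never the fit):

```r
# Start from the full fev_data (no filtering) and give implicitly missing
# responses a placeholder so emmeans keeps those rows when averaging covariates.
data_filled <- dplyr::mutate(
  fev_data,
  change = dplyr::coalesce(FEV1 - FEV1_BL, 0) # placeholder, never modeled
)
emmeans::emmeans(
  model,
  specs = ~ ARMCD:AVISIT,
  weights = "proportional",
  data = data_filled
)
```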

yonicd commented 6 months ago

Sorry was out of pocket for a bit.

What I meant re leveraging {emmeans} was three parts.

  1. They have a much bigger infrastructure for post-processing and integration with downstream packages that take advantage of {emmeans} class objects; we may be opening ourselves up to much more work in the future if we want to start integrating {brms.mmrm} with other ecosystems.
  2. {emmeans} already has a community around it, and adding {brms.mmrm} to its list of supported models would give the work a much broader audience than we can reach at the moment. It would make our lives easier to take advantage of this. I have definitely "discovered" new modeling packages through that list and go to it when I am researching solutions to new problems.
  3. {mmrm} has {emmeans} integration already; it may be a plus to have a common API for the marginal estimates across the org's packages.

If it isn't too much of a pretzel twist to work within the {emmeans} framework, I think there are definite advantages to it.

wlandau commented 6 months ago

Fortunately, brms.mmrm already integrates emmeans through brms. In fact, this is what brm_marginal_draws() currently uses to get posterior draws of marginal means (c.f. here and here in the code.)

Let's say a user fits a model with brms.mmrm:

library(tidyverse)
library(brms.mmrm)
data(fev_data, package = "mmrm")
data <- fev_data %>%
  mutate("FEV1_CHG" = FEV1 - FEV1_BL) %>%
  brm_data(
    outcome = "FEV1_CHG",
    role = "change",
    group = "ARMCD",
    time = "AVISIT",
    patient = "USUBJID",
    baseline = "FEV1_BL",
    reference_group = "PBO",
    covariates = c("RACE", "SEX")
  )
formula <- brm_formula(
  data = data,
  intercept = TRUE,
  baseline = TRUE,
  group = TRUE,
  time = TRUE,
  baseline_time = TRUE,
  group_time = TRUE,
  correlation = "unstructured"
)
print(formula)
#> FEV1_CHG ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT + RACE + SEX + unstr(time = AVISIT, gr = USUBJID) 
#> sigma ~ 0 + AVISIT
model <- brm_model(data = data, formula = formula)

Because the output is a brms model fit, the user can already leverage emmeans without extra work from us. This is true regardless of whether we use emmeans or a custom solution to get our own marginal means in brm_marginal_draws().

emmeans(
  model,
  specs = ~ARMCD:AVISIT,
  nuisance = c("RACE", "SEX")
)
#>  ARMCD AVISIT emmean lower.HPD upper.HPD
#>  PBO   VIS1   -26.05     -30.5    -21.23
#>  TRT   VIS1   -22.01     -25.7    -18.30
#>  PBO   VIS2   -21.26     -24.9    -17.40
#>  TRT   VIS2   -17.25     -20.6    -13.54
#>  PBO   VIS3   -15.62     -19.3    -12.15
#>  TRT   VIS3   -12.60     -16.1     -9.08
#>  PBO   VIS4   -10.87     -14.8     -6.27
#>  TRT   VIS4    -6.42     -10.2     -1.88
#> 
#> Results are averaged over the levels of: 2 nuisance factors 
#> Point estimate displayed: median 
#> HPD interval probability: 0.95 
yonicd commented 6 months ago

ok, great!

Then I think the only remaining task is robust documentation on how brm_marginal_draws() differs from direct use of {emmeans}, which is basically collating the comments in this issue into a vignette. You have done an amazing job doing the deep dive into emmeans.

wlandau commented 6 months ago

Thanks! Yes, I am planning to either add to https://openpharma.github.io/brms.mmrm/articles/methods.html or write a separate vignette to explain the whole calculation from first principles, both what emmeans does and what we do differently. The only differences I foresee are:

  1. Pre-filling in data rows where there are implicitly missing responses, c.f. https://github.com/openpharma/brms.mmrm/issues/53#issuecomment-2010543475.
  2. Maybe averaging within subgroup levels, although in hindsight I think this might get us into the same kind of Simpson's paradox trouble as weights = "cells".
wlandau commented 6 months ago

The custom approach is as easy as I thought, and it will make for a nice readable vignette:

library(brms)
library(posterior)
library(tibble)

# Apply proportional averaging to nuisance factors.
proportional_factors <- brmsformula(FEV1_CHG ~ 0 + SEX + RACE) %>%
  make_standata(data = data) %>%
  .subset2("X") %>%
  colMeans() %>%
  t()
print(proportional_factors)
#>      SEXMale SEXFemale RACEBlackorAfricanAmerican RACEWhite
#> [1,]    0.47      0.53                      0.375     0.275

# Construct a reference grid with interesting factors and the
# means of numeric variables.
grid <- data %>%
  mutate(FEV1_BL = mean(FEV1_BL), FEV1_CHG = 0) %>%
  distinct(ARMCD, AVISIT, FEV1_BL, FEV1_CHG)
print(grid)
#> # A tibble: 8 × 4
#>   ARMCD AVISIT FEV1_BL FEV1_CHG
#>   <chr> <chr>    <dbl>    <dbl>
#> 1 PBO   VIS1      40.2        0
#> 2 PBO   VIS2      40.2        0
#> 3 PBO   VIS3      40.2        0
#> 4 PBO   VIS4      40.2        0
#> 5 TRT   VIS1      40.2        0
#> 6 TRT   VIS2      40.2        0
#> 7 TRT   VIS3      40.2        0
#> 8 TRT   VIS4      40.2        0

# Get the draws of the model parameters and align the column names
# with the mapping.
draws_parameters <- model %>%
  as_draws_df() %>%
  as_tibble() %>%
  select(starts_with("b_"), -starts_with("b_sigma"))
print(head(draws_parameters))
#>      b_Intercept  b_FEV1_BL b_ARMCDTRT b_AVISITVIS2 b_AVISITVIS3 b_AVISITVIS4 b_RACEBlackorAfricanAmerican b_RACEWhite b_SEXFemale b_FEV1_BL:AVISITVIS2 b_FEV1_BL:AVISITVIS3 b_FEV1_BL:AVISITVIS4 b_ARMCDTRT:AVISITVIS2 b_ARMCDTRT:AVISITVIS3 b_ARMCDTRT:AVISITVIS4
#> [1,]    28.12049 -0.9328434   3.482917   -0.9652279     3.501594    16.134916                    1.3795105    4.566933  0.59740456          0.150796399          0.171189552          -0.03216017            0.20417157            -0.4202866            1.29446708
#> [2,]    29.96148 -0.9703349   4.143874   -1.9744873     6.878381    11.069597                    1.7366256    5.587970 -0.61139642          0.161982927          0.075364839           0.13104548            0.03335486            -1.1919374           -0.51673961
#> [3,]    28.04120 -0.9167395   5.028615    0.7848824     8.312772     7.970771                    2.2455201    4.290807 -0.87864402          0.114844584          0.049232866           0.16065948           -0.61784967            -1.7812204            1.04428585
#> [4,]    27.94835 -0.9296524   2.613377   -0.6260921     5.259358     7.593719                    0.2306152    5.926638  1.58860856          0.119618736          0.115816445           0.18186057            1.31263326            -0.3219806            0.54245828
#> [5,]    24.42265 -0.8568065   3.547295    6.0542027    10.727494    12.397730                    1.7849814    5.238166  0.01330715         -0.005420136          0.008823189           0.06752929           -0.47946851            -0.3092742           -0.02700641
#> [6,]    23.31398 -0.8203084   3.128613    5.1667634     9.398452     8.762476                    1.6365735    5.486014  0.91430527         -0.020211246          0.022233059           0.15643241            1.18707182            -0.2188128           -1.08372968

# Create the mapping from model parameters to marginal means.
# Make sure to align the column names with those of draws_parameters.
mapping <- brmsformula(FEV1_CHG ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT) %>%
  make_standata(data = grid) %>%
  .subset2("X") %>%
  bind_cols(proportional_factors) %>%
  setNames(paste0("b_", colnames(.)))
stopifnot(all(colnames(draws_parameters) %in% colnames(mapping)))
mapping <- as.matrix(mapping)[, colnames(draws_parameters)]
rownames(mapping) <- paste(grid$ARMCD, grid$AVISIT)
print(mapping)
#>          b_Intercept b_FEV1_BL b_ARMCDTRT b_AVISITVIS2 b_AVISITVIS3 b_AVISITVIS4 b_RACEBlackorAfricanAmerican b_RACEWhite b_SEXFemale b_FEV1_BL:AVISITVIS2 b_FEV1_BL:AVISITVIS3 b_FEV1_BL:AVISITVIS4 b_ARMCDTRT:AVISITVIS2 b_ARMCDTRT:AVISITVIS3 b_ARMCDTRT:AVISITVIS4
#> PBO VIS1           1  40.19072          0            0            0            0                        0.375       0.275        0.53              0.00000              0.00000              0.00000                     0                     0                     0
#> PBO VIS2           1  40.19072          0            1            0            0                        0.375       0.275        0.53             40.19072              0.00000              0.00000                     0                     0                     0
#> PBO VIS3           1  40.19072          0            0            1            0                        0.375       0.275        0.53              0.00000             40.19072              0.00000                     0                     0                     0
#> PBO VIS4           1  40.19072          0            0            0            1                        0.375       0.275        0.53              0.00000              0.00000             40.19072                     0                     0                     0
#> TRT VIS1           1  40.19072          1            0            0            0                        0.375       0.275        0.53              0.00000              0.00000              0.00000                     0                     0                     0
#> TRT VIS2           1  40.19072          1            1            0            0                        0.375       0.275        0.53             40.19072              0.00000              0.00000                     1                     0                     0
#> TRT VIS3           1  40.19072          1            0            1            0                        0.375       0.275        0.53              0.00000             40.19072              0.00000                     0                     1                     0
#> TRT VIS4           1  40.19072          1            0            0            1                        0.375       0.275        0.53              0.00000              0.00000             40.19072                     0                     0                     1

# Transform model parameters to marginal means.
draws_marginal_means <- draws_parameters %*% t(mapping) %>%
  as.data.frame() %>%
  as_tibble()
colnames(draws_marginal_means) <- paste(grid$ARMCD, grid$AVISIT)
print(draws_marginal_means)
#> # A tibble: 4,000 × 8
#>    `PBO VIS1` `PBO VIS2` `PBO VIS3` `PBO VIS4` `TRT VIS1` `TRT VIS2` `TRT VIS3` `TRT VIS4`
#>         <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
#>  1      -7.28      -2.19       3.10       7.56      -3.80     1.50         6.16       12.3
#>  2      -7.17      -2.64       2.73       9.16      -3.03     1.54         5.69       12.8
#>  3      -7.25      -1.85       3.04       7.18      -2.22     2.56         6.29       13.3
#>  4      -6.86      -2.68       3.06       8.05      -4.24     1.25         5.35       11.2
#>  5      -7.90      -2.06       3.19       7.22      -4.35     1.01         6.42       10.7
#>  6      -7.05      -2.69       3.24       8.00      -3.92     1.62         6.15       10.0
#>  7      -7.56      -2.35       3.19       8.41      -5.35    -0.0937       6.84       10.5
#>  8      -7.07      -1.66       2.70       9.97      -2.17     0.632        6.26       14.6
#>  9      -7.90      -4.11       3.55       8.21      -3.76     1.34         4.81       11.4
#> 10      -8.81      -4.60       3.03       7.59      -3.55     2.03         5.94       11.0
#> # ℹ 3,990 more rows
#> # ℹ Use `print(n = ...)` to see more rows
wlandau commented 6 months ago

By contrast, if we were to use emmeans directly, then we would need to call emm_basis() manually. As far as I can tell, this is the only way to get the equivalent of the mapping matrix above. It should be emm_basis()$X, but when I try it, it is giving me the wrong answer. I think it's because the grid argument needs work. ?emm_basis says it comes from ref_grid(), and ?brms::emmeans-brms-helpers has no useful information. All this feels more cryptic and error-prone than I would like it to be.

custom_xlev <- lapply(select(data, AVISIT, ARMCD, SEX, RACE), function(x) unique(as.character(x)))
custom_trms <- terms(x = ~ FEV1_BL + FEV1_BL:AVISIT + ARMCD + ARMCD:AVISIT + AVISIT + RACE + SEX, data = data)
custom_grid <- ref_grid(
  model,
  specs = ~ARMCD:AVISIT,
  nuisance = c("RACE", "SEX")
)
emm_basis(
  object = model,
  trms = custom_trms,
  xlev = custom_xlev,
  grid = custom_grid, # I get the same result if I supply custom_grid@grid.
  resp = "FEV1_CHG"
)$X
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,]    1    0    0    0    0    0    0    0
#> [2,]    0    1    0    0    0    0    0    0
#> [3,]    0    0    1    0    0    0    0    0
#> [4,]    0    0    0    1    0    0    0    0
#> [5,]    0    0    0    0    1    0    0    0
#> [6,]    0    0    0    0    0    1    0    0
#> [7,]    0    0    0    0    0    0    1    0
#> [8,]    0    0    0    0    0    0    0    1
chstock commented 6 months ago

A few thoughts on some older comments from last week:

Even in an experimental setting, I do not think it is a good idea to give equal weights to all the levels of a factor.

For example: in an Alzheimer's study, it is common to regress on baseline disease severity. Those levels might be "mild" and "severe". Suppose the experiment has mostly "mild" patients and only a small number of "severe" ones. In the equal weighting scheme, there will be a marginal mean for "mild" and another marginal mean for "severe", and these means will be straightforwardly averaged together. But if there are far more "mild" patients than "severe" ones, this scheme risks misrepresenting the composition of the overall patient population. In particular, if the drug is only effective in the mild subpopulation, then equal weighting will artificially dampen the treatment effect.

I agree. In practice I have seen the "observed margins" option (in SAS) applied most frequently, which usually seemed very sensible in clinical trial settings.

It may be helpful to derive a strategy for marginal means estimation with the estimand in mind as well. Very generally speaking, one would commonly be interested in some "average treatment effect in the population of trial participants". In such cases it seems to make sense that the "margins" reflect the observed distributions in the sample. Sometimes one might be interested in generalizing effects to a certain target population. Then a weighting based on external criteria would be needed.

Subgroups: "observed margins" usually relate to the margins in the total sample. I am not sure these are always best for subgroup analyses too (e.g. for sex subgroups with different baseline values, where both sex and baseline are prognostically relevant), but even then I believe they commonly provide a reasonable reference.

I can also imagine that it makes a difference whether the interest lies more in the estimates of the marginal means themselves, e.g. for a certain treatment arm or subgroup, or in their contrasts. In the latter case it seems critical to keep an eye on the interpretability of the contrast. One may also want a certain simplicity here and avoid creating (difficult-to-interpret) imbalances.

  • For subgroup models, @andrew-bean suggested proportional averaging within each level of the subgroup, rather than over the whole dataset.

I am not sure that in such a case the resulting contrast of the marginal means is what one is looking for. Usually we want to keep everything else fixed at a certain level and vary only the treatment when interpreting the contrast.

Another example: we often regress on the choice of investigative site in multi-center studies. The site shouldn't matter for inference, we just want to explain variability in the model. If one or two sites have the most patients, and most sites have a very small number of patients, what would make the most sense to do? It seems more intuitive to me to let the number of patients in each site determine how the sites are weighted, as opposed to averaging all sites equally.

Center effects - also an interesting topic. If heterogeneity w.r.t. treatment effects is assumed, the center variable in an MMRM would also be involved in interactions (with treatment and/or visit), but on the other hand, you wouldn't be interested in making inferences about center effects...(?) Hmm, if there were many sites with many patients, this sounds like a justification for a mixed model with G- and R-side random effects (I have often wondered what an application for such a model might be; out of scope here, I know).

https://stats.stackexchange.com/questions/510862/is-least-squares-means-lsmeans-statistical-nonsense/510923#510923 is edifying, and it helps me understand more about why estimated marginal means with equal weights are used, as well as their limitations (e.g. interactions).

+1 ...yes, and even though the design is experimental, the longitudinal setting with its need for interactions (with time/visit) makes the marginal means more challenging, both conceptually and technically.