mcdonohue / longpower

longpower: Sample Size Calculations for Longitudinal Data
2 stars 4 forks source link

simulation / validation vignette #2

Open ncullen93 opened 2 years ago

ncullen93 commented 2 years ago

Hey Mike,

Just wanted to throw in a little example code that could maybe turn into a vignette. This example uses ADNI data to grab pilot estimates of covariance, etc from an lme model, then simulates data based on those estimates and calculates power to detect a treatment effect over 1000 trials. This is compared to the closed-form estimate from longpower. I think it's a great way to validate the formulas and show how power and such can also be calculated using simulation which is more flexible.

#install.packages('~/Downloads/ADNIMERGE_0.0.1.tar.gz', repos=NULL, type='source')
library(ADNIMERGE)
library(nlme)
library(tidyverse)
library(mvtnorm)
library(longpower)

# 1: PROCESS DATA
df <- ADNIMERGE::adnimerge %>%
  filter(
    VISCODE %in% c('bl','m06','m12','m18', 'm24'),
    DX.bl %in% c('EMCI','LMCI'),
    as.numeric(ABETA.bl) < 880,
    AGE >= 55, AGE <= 85
  ) %>%
  tibble() %>%
  dplyr::select(
    RID, VISCODE, Years.bl, DX.bl, AGE, CDRSB
  ) %>%
  filter(complete.cases(.)) %>%
  mutate(
    month = factor(VISCODE, levels=c('bl','m06','m12','m18', 'm24'),
                   labels=c('0','6','12','18','24')),
    visit = as.numeric(month)
  ) %>%
  rename(
    YEARS_bl = Years.bl,
    DX_bl = DX.bl
  )

# 2: GET MODEL ESTIMATES FROM PILOT DATA
pilot_lme <- lme(CDRSB ~ YEARS_bl,
                 data = df,
                 random = ~ YEARS_bl | RID,
                 control = nlme::lmeControl(
                   maxIter = 1e10,
                   msMaxIter = 1000,
                   opt = "optim"
                 ),
                 method='ML')

# 3. SIMULATE DATA USING MODEL ESTIMATES
simulate_data <- function(fit_lme, n_per_arm, treatment_effect) {
  # get intercept-slope covariance matrix and residual error
  covmat <- unname(as.matrix(getVarCov(fit_lme)[1:2,1:2]))
  sigma_residual <- fit_lme$sigma
  Beta <- fixef(fit_lme)
  # treatment effect
  Beta['YEARS_bl:group'] <- -Beta['YEARS_bl'] * treatment_effect

  n_arm <- 2
  n <- n_arm * n_per_arm
  visits <- c(0, 0.5, 1, 1.5, 2)

  data <-
    data.frame(
      id = 1:n,
      group = rep(c(0,1), each=n/n_arm)
    ) %>%
    tibble() %>%
    bind_cols(
      rmvnorm(n, mean=c(0,0), sigma=covmat) %>%
        data.frame() %>%
        setNames(c('r_i', 'r_s'))
    ) %>%
    right_join(
      expand.grid(id = 1:n, YEARS_bl=visits),
      by = 'id'
    ) %>%
    select(c(id, YEARS_bl), everything()) %>%
    mutate(
      r_e = rnorm(n*length(visits), sd = sigma_residual)
    ) %>%
    arrange(id, YEARS_bl) %>%
    mutate(
      # fixed effects
      CDRSB = (model.matrix(~ YEARS_bl + YEARS_bl:group)[,names(Beta)] %*% Beta)[,1],
      # ranodm effects
      CDRSB = CDRSB + r_i + r_s*YEARS_bl + r_e
    )
  data
}

# longpower - calculate n for a given random power
res <- lmmpower(pilot_lme, power=0.47, pct.change=0.3, t=seq(0,2,0.5))
n_exp <- round(res$n[1], 0) #143
print(n_exp)

# simulation - calculate power for the previously calculated n (it should match)
sig_vec <- c()
ntrials <- 1000
for (i in 1:ntrials) {
  print(i)
  data_sim <- simulate_data(pilot_lme, n_per_arm=143, treatment_effect=0.3)
  x <- summary(lme(CDRSB~YEARS_bl+group:YEARS_bl,
                   data=data_sim,
                   random=~YEARS_bl|id,
                   control = nlme::lmeControl(
                     maxIter = 1e10,
                     msMaxIter = 1000,
                     opt = "optim"
                   ),
                   method='ML'))
  pval <- coef(x)['YEARS_bl:group','p-value']
  sig_vec <- c(sig_vec, pval)
}
pwr <- mean(sig_vec < 0.05) # 0.47 or something very close
print(pwr)
mcdonohue commented 2 years ago

Thanks, Nick.

Something like this would be very useful, but I don't think CRAN would allow a reference to ADNIMERGE (since it's not on CRAN). Maybe demo with some other dataset instead?

It's also problematic to have vignettes that take a long time to run, but it looks like others have thought about solutions to that problem: https://www.kloppenborg.ca/2021/06/long-running-vignettes/

Mike