JWiley / brmsmargins

Average marginal estimates for Bayesian models fit with the brms package
https://joshuawiley.com/brmsmargins
Other
20 stars 2 forks source link

Error when setting newdata argument in brmsmargins #1

Closed ajnafa closed 2 years ago

ajnafa commented 2 years ago

I'm trying to estimate average marginal effects for a societal growth curve model (see Fairbrother 2014) and I've run into an error when specifying the newdata argument. The model is as follows

# Specify the formula for the unconditional linear varying slopes model
sgc_uncond_linvar <- bf(
  churchill ~ time + (1 + time | country_jj) + (1 | survey_tt),
  family = bernoulli(link = "logit"),
  decomp = "QR"
)

# Specify Priors for the model
sgc_uncond_linvar_priors <-
  prior(student_t(3, 0, 2.5), class = "Intercept") +
  prior(normal(0, 2), class = "b") +
  prior(exponential(0.8), class = "sd", group = "survey_tt") +
  prior(exponential(0.8), class = "sd", group = "country_jj") +
  prior(lkj(4), class = "cor")

# Fit the model (6 chains, 3k iterations), takes ~5 hours
sgc_uncond_linvar_fit <- brm(
  formula = sgc_uncond_linvar,
  prior = sgc_uncond_linvar_priors,
  data = model_df,
  cores = 12,
  chains = 6,
  iter = 3000,
  warmup = 1500,
  refresh = 10,
  sample_prior = "yes",
  threads = threading(
    threads = 2,
    grainsize = 500
  ),
  save_pars = save_pars(all = TRUE),
  seed = 666,
  backend = "cmdstanr",
  save_model = str_c(stan_dir, "SGC_HLogit_LV_Unconditional.stan"),
  file = str_c(models_dir, "SGC_HLogit_LV_Unconditional")
)

I then try to estimate an AME for time based on the syntax in the vignette

h <- .001
ames <- brmsmargins(
  object = sgc_uncond_linvar_fit,
  newdata = model.frame(sgc_uncond_linvar_fit) %>% 
    distinct(churchill, time, country_jj, survey_tt),
  add = data.frame(time = c(0, h)),
  contrasts = cbind("AME time" = c(-1 / h, 1 / h)),
  effects = "integrateoutRE", 
  k = 100L, 
  seed = 1234
  )

which fails with the following error

Error in `[.data.table`(data, , ..n) : column(s) not found: L_2[1,1]

Since the time slope only varies across countries there is no L_2 parameter in the model but I'm not sure I understand why its looking for it in the first place?

JWiley commented 2 years ago

@ajnafa It is probably not the most efficient approach, but the code was setup to use the Cholesky decomposition of the random effect correlation matrix (Cholesky decomp matrices classically called L) in all cases. This (understandably) does not exist when there is a single random effect in a block --- I forgot to fill in 1s in that case. I think a fix has just been pushed. Would you test now and let me know how it goes?

ajnafa commented 2 years ago

From what I can tell that seems to have fixed the issue and I'm no getting the error. Thanks!

JWiley commented 2 years ago

Great, I'll close the issue for now. Feel free to reopen if it turns out not to be resolved. Separately, I'd also be interested for any other feedback or missing functionality. I cannot promise to implement, but I'd like a sense of what sorts of uses people have for this and how it can be made better. I'm currently working on it to fill a gap for a few of my own projects, but that only gives a narrow perspective.