Closed GabriellaS-K closed 3 years ago
I think brm_multiple() just expects a list of datasets, so you can basically go along the lines of the missRanger multiple imputation vignette at https://cran.r-project.org/web/packages/missRanger/vignettes/multiple_imputation.html
Let me know if the results look (un-)reasonable.
# Via mice
library(mice)
library(brms)
imp <- mice(nhanes, m = 5, print = FALSE)
fit_imp1 <- brm_multiple(bmi ~ age*chl, data = imp, chains = 2)
# With missRanger
library(missRanger)
# Generate 5 complete data sets
imp <- replicate(5, missRanger(nhanes, verbose = 0, num.trees = 50, pmm.k = 5),
simplify = FALSE)
# Fit model
fit_imp2 <- brm_multiple(bmi ~ age*chl, data = imp, chains = 2)
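To make the difference between the two inputs above explicit, here is a small sketch (assuming the mice and missRanger packages are installed; the object names imp_mice and imp_mr are only illustrative). mice() returns a single mids object, while replicate() around missRanger() yields a plain list of completed data frames.

```r
library(mice)        # provides the nhanes example data and mice()
library(missRanger)

# mice: one "mids" object that holds all m imputations
imp_mice <- mice(nhanes, m = 5, print = FALSE)
class(imp_mice)

# missRanger: one completed data frame per call, collected in a plain list
imp_mr <- replicate(
  5,
  missRanger(nhanes, verbose = 0, num.trees = 50, pmm.k = 5),
  simplify = FALSE
)
class(imp_mr)
class(imp_mr[[1]])

# Each list element should be a data frame without missing values
sapply(imp_mr, anyNA)

# For comparison, mice can also hand back its imputations as a list
imp_mice_list <- complete(imp_mice, action = "all")
length(imp_mice_list)
```

Both forms are passed to brm_multiple() through the data argument in the examples above.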
Hi,
Thank you so much for the answer. That's actually what I tried to do: my imputed dataset (called imputed) was fed straight into brm_multiple(), just like you did in your example with fit_imp2. The model runs; the problem comes afterwards. I'd like to compare different models using loo(), but because the result isn't pooled, it only uses the first imputed dataset.
Hmm. If you could adapt my examples (both mice and missRanger) accordingly, that would be fantastic.
I'm not sure what you mean by adapt your examples, sorry!!
I would need a fully reproducible example to see what works and what not.
Ah ok, great!
Please find below:
Here is a subset of my data:
structure(list(agequartiles = structure(c(1L, 3L, 2L, 1L, 2L,
4L, 3L, 1L, 3L, 4L, 1L, 2L, 2L, 2L, 4L, 1L, 3L, 3L, 4L, 4L, 4L,
3L, 4L, 1L, 4L, 3L, 1L, 4L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 3L, 2L,
2L, 3L, 4L, 4L, 3L, 2L, 3L, NA, 1L, 1L, 1L, 2L, 2L), .Label = c("[18,23]",
"(23,27]", "(27,32]", "(32,54]"), class = "factor"), sentiment = c(1,
1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1,
1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 3, 2, 1,
1, 2, 1, 1, 3, 1, 3), group = structure(c(2L, 3L, 3L, 2L, 2L,
1L, 2L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 3L, 1L, 3L, 2L, 2L, 1L, 3L,
1L, 3L, 2L, 1L, 2L, 2L, 2L, 3L, 1L, 1L, 2L, 1L, 3L, 1L, 2L, 3L,
3L, 3L, 3L, 2L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("prime1",
"prime2", "prime3"), class = "factor"), continent = c("UK", "Australia and New Zealand",
"Northern America", "UK", "Northern America", "Australia and New Zealand",
"Asia and the Pacific", "UK", "Southern and Central America",
"Australia and New Zealand", "UK", "Northern America", "Northern America",
"UK", "Northern America", "UK", "UK", "Northern America", "UK",
"Northern America", "Northern America", "Southern and Central America",
"Northern America", "UK", "Europe", "Northern America", "UK",
"Northern America", NA, "UK", "UK", "Australia and New Zealand",
"Australia and New Zealand", "UK", "UK", "UK", "Australia and New Zealand",
"Northern America", "UK", "Northern America", "UK", "Asia and the Pacific",
"Northern America", "Northern America", NA, NA, "UK", "Europe",
"UK", "Northern America"), ID = 1:50, medication = c("FALSE",
"FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE",
"FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE",
"FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "FALSE", "FALSE",
"FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE",
"FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE", "FALSE",
"FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "TRUE",
"FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "TRUE")), row.names = c(NA,
50L), class = "data.frame")
Then I imputed:
library(missRanger)
data <- lapply(3456:3460, function(x)
  missRanger(
    data,
    . ~ . - ID,          # impute all columns, using all columns except ID as predictors
    maxiter = 10,        # maximum number of iterations
    pmm.k = 3,           # predictive mean matching, for more natural imputations
    verbose = 1,         # how much info is printed to screen
    seed = x,            # integer seed to initialize the random generator
    num.trees = 200,
    returnOOB = TRUE,
    case.weights = NULL
  )
)
Then I ran 5 models
models_group <- brm_multiple(formula = sentiment ~ 1 + cs(group), data = data, family = acat("cloglog"), combine=TRUE, chains=4)
models_meds <- brm_multiple(formula = sentiment ~ 1 + cs(group)+ medication, data = data, family = acat("cloglog"), combine=TRUE, chains=4)
models_age <- brm_multiple(formula = sentiment ~ 1 + cs(group)+age, data = data, family = acat("cloglog"), combine=TRUE, chains=4)
models_continent <- brm_multiple(formula = sentiment ~ 1 + cs(group)+continent, data = data, family = acat("cloglog"), combine=TRUE, chains=4)
models_all <- brm_multiple(formula = sentiment ~ 1 + cs(group) + age + medication + continent, data = data, family = acat("cloglog"), combine=TRUE, chains=4)
And finally the LOO
modelcomparison <- loo(models_all, models_group, models_meds, models_continent, models_age)
Okay, thanks a lot for that example. I visited
My first thought: combine = FALSE in brm_multiple(), then brm_multiple() doing some Bayesian magic, then loo.
I would actually suggest asking the brms team how they would approach the problem. I think it would be quite cool if loo would work on the output of brm_multiple(), independent of using missRanger or another algo.
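A rough sketch of the combine = FALSE idea, assuming brms, missRanger and mice (only for the nhanes data) are installed. With combine = FALSE, brm_multiple() returns a list of brmsfit objects, one per imputed data set, so loo() can be run on each fit and two models can be compared within each imputation. The names fits_small, fits_full and comparisons are illustrative, and how to pool the per-imputation results in a principled way is left open here.

```r
library(mice)        # only used for the nhanes example data
library(missRanger)
library(brms)

set.seed(1)
imp <- replicate(
  5,
  missRanger(nhanes, verbose = 0, num.trees = 50, pmm.k = 5),
  simplify = FALSE
)

# combine = FALSE: keep one brmsfit per imputed data set instead of one pooled fit
fits_small <- brm_multiple(bmi ~ age, data = imp, chains = 2,
                           combine = FALSE, refresh = 0)
fits_full  <- brm_multiple(bmi ~ age * chl, data = imp, chains = 2,
                           combine = FALSE, refresh = 0)

# Compare the two models separately within each imputed data set
comparisons <- Map(
  function(small, full) loo_compare(loo(small), loo(full)),
  fits_small, fits_full
)

# One comparison table per imputation; inspect them side by side
comparisons
```

If the ranking of the models is the same across all imputations, that is at least some reassurance; if it flips between imputations, the comparison is sensitive to the imputed values.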
OK great thank you for that, I will do!
Hi,
Thank you for a brilliant package. I'm using missRanger to impute, and then apply brms to the imputed dataset. The brms documentation describes how to use the mice package, but missRanger's imputed data comes out quite differently. Ideally I would have imputed the data, pooled the data, run my models, and run model comparisons. But I cannot then pool using mice; it doesn't work. So instead I run multiple models on the imputed data like this:
models_imputed <- brm_multiple(formula = score ~ 1 + cs(group), data = imputed, family = acat("cloglog"), combine=TRUE, chains=1)
But this is pretty clunky, and if I try to run LOO on my models (I have 5) I get the warning: Using only the first imputed data set. Please interpret the results with caution until a more principled approach has been implemented.
This isn't an issue with missRanger as such, more that I'm caught in the space between missRanger and brms and am not sure how to get them to work together. Hoping someone might have advice!
Thanks