Open rduesing opened 1 year ago
Added this to the FAQ, didn't try to solve it yet.
Hi, I faced a similar issue with L2 level sampling. Sometimes the CIs are returned as NAs or impossible CIs (both bounds greater than the estimate) when bootstrapping. I wrote a custom wrapper for L2 bootstrapping which resamples at L2 level and calls the inner modmed.mlm function. I am not sure if it is because the estimation returns NA on one of the repetitions / clusters (because this is what caused my wrapper to return NAs during testing).
I am on a Windows machine using RStudio if this information is useful. Also below is the code I used for manually bootstrapping if anyone needs it in the meantime.
parallel_bootstrap<-function(seed,mod.med.data,bootstraps, X,Y,M, moderator,
mod.a =TRUE, mod.b =FALSE, mod.cprime = TRUE,
random.a = FALSE,random.b = FALSE,random.cprime = FALSE,case = "L2"){
ngroup <- length(unique(mod.med.data$id))
unique_ids <- unique(mod.med.data$id)
# pick ngroup Level 2 units from the sample with replacement
set.seed(seed)
sampled_ids <- sample(unique_ids,ngroup, replace = TRUE)
bootstrapped_df <- data.frame()
for (i in 1:length(sampled_ids)){
temp <-mod.med.data[mod.med.data$id==sampled_ids[i],]
# SAMPLE WITHIN EACH ID WITH REPLACEMENT
if(case == "both"){
print(case)
set.seed(seed)
temp <- temp[sample(nrow(temp), nrow(temp), replace = TRUE),]
}
#
bootstrapped_df <-rbind(bootstrapped_df, temp)
}
# SmX * SyM --> indirect value when modval =0
# SmX * SyM +SmX:W*SyM --> indirect when modval =1
res1<-modmed.mlm(data = bootstrapped_df,
L2ID = "id",
X = X,
Y = Y,
M = M,
moderator = moderator,
mod.a = mod.a, mod.b = mod.b, mod.cprime = mod.cprime,
random.a = random.a, random.b = random.b, random.cprime = random.cprime)
a1 = extract.modmed.mlm(res1,type = "a",modval1 = 1)
a0 = extract.modmed.mlm(res1,type = "a",modval1 = 0)
b= extract.modmed.mlm(res1,type = "b")
indirect_1 = a1*b
indirect_0 = a0*b
indirect_diff = indirect_1 - indirect_0
data.frame(indirect_0,indirect_1,indirect_diff)
}
custom_boostrap_parallel <-function(mod.med.data,bootstraps, X,Y,M, moderator,
mod.a =TRUE, mod.b =FALSE, mod.cprime = TRUE,
random.a = FALSE,random.b = FALSE,random.cprime = FALSE){
finaldf <- foreach(j=1:bootstraps, .combine=rbind,
.export = c("parallel_bootstrap"),
.packages = c("multilevelmediation")) %dopar% {
tempdf = parallel_bootstrap(seed = j,
mod.med.data,bootstraps, X,Y,M, moderator,
mod.a =TRUE, mod.b =FALSE, mod.cprime = TRUE,
random.a = FALSE,random.b = FALSE,random.cprime = FALSE,case = "L2")
tempdf #Equivalent to finaldf = rbind(finaldf, tempdf)
}
}
Hi,
I came across a similar issue asthe one described by voidyman - though with residual bootstrapping. The corresponding functions
_model_h1h2 <- boot.modmed.mlm.custom(data_tibble, ..., L2ID="unit", ...., boot.type="resid" extract.boot.modmed.mlm(modelh1h2, type = "a")
yield CI values which do not include the indicated estimate nor are anywhere close to the estimate. Also happening when trying to convert tibble to a data.frame and when playing around with other function parameters or extracting other path estimates.
Sorry I have not yet had time to look into this issue. But, I have 2 (edit: now 3) comments:
Hello,
sorry for not having provided a MWE, I am rather new to using Github actively. Happy to provide some code and data - not simulated in the code since I could not properly simulate the hierarchical data structure - and the corresponding code yielding unexpected estimates given the 95%CI.
model_resid_50_jt <- boot.modmed.mlm.custom(df, nrep=50, L2ID="unit_id", X = "X_c", Y = "Y", M = "M_c",
random.a = TRUE, random.b = TRUE, random.cprime = TRUE,
boot.type="resid",
parallel.type="parallel",ncores=8,seed=2299,
control=list(opt="nlm")) # runs 192.24 seconds, with 50 reps just for testing
extract.boot.modmed.mlm(model_resid_50_jt, type = "a")
extract.boot.modmed.mlm(model_resid_50_jt, type = "cprime")
I am wondering if some data propensities have to do with it, but since I am new to MLM mediation, this really is a guess and I would be glad for a hint why the 95%CI may be so off.
Of course, no problem. I think my comment applies to the tibble issue as well.
Thanks for posting one now - it looks like that's a bug that should now be fixed if you are able to use the latest version from GibHub. (for the record: 485dd9e )
Perfect, it works! Thanks for solving this so quickly.
Hello!
Many thanks for the package and the simulation article. I tried to estimate simulated data with your package and encountered a strange beahvior, when I stored the data as tibble, as compared to a data frame. The main analyses went fine, but when entered into the L2 bootstrap function, upper and lower CI values are identical (for all estimates), and are quite unreasonable (and the bootstraping ends finishes after a very short time, as if no real bootstrapping has been conducted). When I store the same data as data frame, the bootstrap behaves normally and gives reasonable values.