Error in stats::model.frame when specifying an lm formula in a function argument, and boot = T

pinusm commented 3 years ago

This works fine with boot=F, but throws an error when boot=T. Seems related to #29

library(mediation)
#> mediation: Causal Mediation Analysis
#> Version: 4.5.0
test <- function(f1, f2, boot){
    mtcars$mpg <- mtcars$mpg/max(mtcars$mpg)
    m_med <- lm(f1, mtcars)
    m_out <- lm(f2, mtcars)
    mediate(m_med, m_out, treat = "wt", mediator = 'am', boot = boot)
}

x <- test('am ~ wt', 'mpg ~ wt + am', F)
#success!
x <- test('am ~ wt', 'mpg ~ wt + am', T)
#> Running nonparametric bootstrap
#> Error in stats::model.frame(formula = f1, data = structure(list(am = c(1, : object 'f1' not found

ianaianawong commented 1 year ago

Hi @pinusm, I had a similar issue. I wonder whether you have found any solutions? If yes, would you mind sharing with us?

ianaianawong commented 1 year ago

Maybe I'll explain my problem in detail below.

Looks like the two posts related to this problem have not been answered before, so I'm tagging the contributors to the mediation package to attract more attention. My apologies if this tagging is a spam to you. @mdtrinh @weihuangwong @kosukeimai @k-hirose @teppeiyamamoto @ck37 @mspeekenbrink @kevinbrosnan @davraam @christopherkenny @GregFa

I'm using the mediation package in r to analyze the mediating role of 5 mediators between 1 predictor and 1 outcome variable. All variables are continuous. When boot=FALSE, the coding below worked. However, when boot=TRUE, I got the error message below.

olsols <- mediations(datasets, treatment, mediators, outcome, families=c("gaussian","gaussian"), 
                         interaction=FALSE, conf.level=.90, sims=1000, boot=TRUE, boot.ci.type="bca")

Error in stats::model.frame(formula = f1, data = list(IRQ_original_total_mi = c(79,  : 
  object 'f1' not found

Can anyone please advise how can I solve the error?

Before using the mediations function, I did multiple imputations using mice package to handle the missing data.

I've tried to look for similar problems reported by previous researchers, but I'm afraid I can only find one relevant post (this post that I'm replying to). However, there weren't any responses to that post so I decided to create a new post here.

Here is part of the dataset: https://drive.google.com/file/d/11rv1iniNjw6RpgqnYHdhD3FLy_hZbKNi/view?usp=sharing

Here are the codes I used:

## create a dataframe for the multiple imputation ====
    alldata4 <- alldata3[ ,c('ECR_1', 'ECR_2', 'ECR_3', 'ECR_4', 'ECR_5', 'ECR_6',
                             'ECR_7', 'ECR_8', 'ECR_9', 'ECR_10', 'ECR_11', 'ECR_12',  
                             'ECR_13', 'ECR_14', 'ECR_15', 'ECR_16', 'ECR_17', 'ECR_18', 
                             'ECR_19', 'ECR_20', 'ECR_21', 'ECR_22', 'ECR_23', 'ECR_24', 
                             'ECR_25', 'ECR_26', 'ECR_27', 'ECR_28', 'ECR_29', 'ECR_30', 
                             'ECR_31', 'ECR_32', 'ECR_33', 'ECR_34', 'ECR_35', 'ECR_36', 
                             'IRQ_1', 'IRQ_2', 'IRQ_3', 'IRQ_4', 'IRQ_5', 'IRQ_6', 
                             'IRQ_7', 'IRQ_8', 'IRQ_9', 'IRQ_10', 'IRQ_11', 'IRQ_12', 
                             'IRQ_13', 'IRQ_14', 'IRQ_15', 'IRQ_16', 'IRQ_17', 'IRQ_18', 
                             'IRQ_19', 'IRQ_20', 'IRQ_21', 'IRQ_22', 'IRQ_23', 'IRQ_24', 
                             'IRQ_25', 'IRQ_26', 'IRQ_27', 'IRQ_28', 'IRQ_29', 'IRQ_30', 'IRQ_31',"IRQ_32",
                             'CEER_1', 'CEER_2', 'CEER_3', 'CEER_4','CEER_5','CEER_6',
                             'CEER_7','CEER_8','CEER_9','CEER_10','CEER_11','CEER_12',
                             'IRIS_1', 'IRIS_2', 'IRIS_3', 'IRIS_4','IRIS_5','IRIS_6',
                             'IRIS_7','IRIS_8','IRIS_9','IRIS_10','IRIS_11','IRIS_12',
                             'IRIS_13', 'IRIS_14', 'IRIS_15', 'IRIS_16', 'IRIS_17', 'IRIS_18', 
                             'IRIS_19', "IRIS_20", "IRIS_21", "IRIS_22", "IRIS_23", "IRIS_24", 
                             'IRIS_25', 'IRIS_26', 'IRIS_27', 'IRIS_28',
                             'DERS_1', 'DERS_2', 'DERS_3', 'DERS_4','DERS_5','DERS_6',
                             'DERS_7','DERS_8','DERS_9','DERS_10','DERS_11','DERS_12',
                             'DERS_13', 'DERS_14', 'DERS_15', 'DERS_16',
                             'CTS2_1_R', 'CTS2_2_R', 'CTS2_3_R', 'CTS2_4_R', 'CTS2_5_R', 'CTS2_6_R', 
                             'CTS2_7_R', 'CTS2_8_R', 'CTS2_9_R', 'CTS2_10_R', 'CTS2_11_R', 'CTS2_12_R', 
                             'CTS2_13_R', 'CTS2_14_R', 'CTS2_15_R', 'CTS2_16_R', 'CTS2_17_R', 'CTS2_18_R', 
                             'CTS2_19_R', 'CTS2_20_R', 'CTS2_21_R', 'CTS2_22_R', 'CTS2_23_R', 'CTS2_24_R', 
                             'CTS2_25_R', 'CTS2_26_R', 'CTS2_27_R', 'CTS2_28_R', 'CTS2_29_R', 'CTS2_30_R', 
                             'CTS2_31_R', 'CTS2_32_R', 'CTS2_33_R', 'CTS2_34_R', 'CTS2_35_R', 'CTS2_36_R', 
                             'CTS2_37_R', 'CTS2_38_R', 'CTS2_39_R', 'CTS2_40_R', 'CTS2_41_R', 'CTS2_42_R', 
                             'CTS2_43_R', 'CTS2_44_R', 'CTS2_45_R', 'CTS2_46_R', 'CTS2_47_R', 'CTS2_48_R', 
                             'CTS2_49_R', 'CTS2_50_R', 'CTS2_51_R', 'CTS2_52_R', 'CTS2_53_R', 'CTS2_54_R',
                             'CARS_1_R','CARS_2_R', 'CARS_3_R', 'CARS_4_R', 'CARS_5_R', 'CARS_6_R', 
                             'CARS_7_R','CARS_8_R', 'CARS_9_R', 'CARS_10_R', 'CARS_11_R', 'CARS_12_R', 
                             'CARS_13_R','CARS_14_R', 'CARS_15_R', 'CARS_16_R','CARS_17_R','CARS_18_R', 
                             'CARS_19_R','CARS_20_R', 'CARS_21_R', 'CARS_22_R', 'CARS_23_R','CARS_24_R', 
                             'CARS_25_R','CARS_26_R', 'CARS_27_R', 'CARS_28_R', 'CARS_29_R','CARS_30_R', 
                             'CARS_31_R','CARS_32_R', 'CARS_33_R', 'CARS_34_R',  
                             'Relation_length_months', 'Relation_status', 'Income',
                             'Current_age', 'Gender', 'Aboriginal', 
                             'Commit_1', 'Commit_2', 'Commit_3','Commit_4', 'Commit_5','Commit_6','Commit_7',                           
                             'AUDIT_1','AUDIT_2', 'AUDIT_3', 'AUDIT_4', 'AUDIT_5', 'AUDIT_6',
                             'AUDIT_7', 'AUDIT_8', 'AUDIT_9',  'AUDIT_10')]

    ## multiple imputation 
       alldata4.mi <- mice::mice(alldata4, m = 5, method = 'pmm')

    ## obtain each dataset from the stacked dataset (the first one first)
        ECR.alldata4.mi.1 <- mice::complete(alldata4.mi, c(1))
        ECR.alldata4.mi.2 <- mice::complete(alldata4.mi, c(2))
        ECR.alldata4.mi.3 <- mice::complete(alldata4.mi, c(3))
        ECR.alldata4.mi.4 <- mice::complete(alldata4.mi, c(4))
        ECR.alldata4.mi.5 <- mice::complete(alldata4.mi, c(5))

## total score of ECR - anxiety after multiple imputation for each of the 5 imputed dataset 
ECR.alldata4.mi.1$ECR_anxiety_average_mi <- rowMeans(ECR.alldata4.mi.1[c("ECR_2", "ECR_4", "ECR_6", "ECR_8","ECR_10","ECR_12","ECR_14","ECR_16","ECR_18",
                                                                 "ECR_20","ECR_22","ECR_24","ECR_26","ECR_28","ECR_30","ECR_32","ECR_34","ECR_36")])

## total score of IRQ - original after multiple imputation for each of the 5 imputed dataset 
ECR.alldata4.mi.1$IRQ_original_total_mi <- rowSums(ECR.alldata4.mi.1[c("IRQ_1", "IRQ_3", "IRQ_5", "IRQ_7","IRQ_9","IRQ_11","IRQ_13","IRQ_15","IRQ_17",
                                                               "IRQ_19","IRQ_21","IRQ_23","IRQ_25","IRQ_27","IRQ_29","IRQ_31")])

#### I created the total scores of the variables included in the mediations code. I did not present all of them here to save space.

    # assign variables to their roles for the "mediations" function
    datasets <- list(ECR.alldata4.mi.1=ECR.alldata4.mi.1, ECR.alldata4.mi.2=ECR.alldata4.mi.2, 
                     ECR.alldata4.mi.3=ECR.alldata4.mi.3, 
                     ECR.alldata4.mi.4=ECR.alldata4.mi.4, ECR.alldata4.mi.5=ECR.alldata4.mi.5) # list of multiply imputed data sets

    mediators <- c("IRQ_original_total_mi", "IRQ_partner_total_mi", "CEER_total_mi", "IRIS_total_mi", "DERS_total_mi")
    outcome <- c("CTS2_psy_perp_total_mi")
    treatment <- c("ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi")  # note how the treatment indicator is repeated to match the number of datasets

    # mediation analysis
    olsols <- mediations(datasets, treatment, mediators, outcome, families=c("gaussian","gaussian"), 
                         interaction=FALSE, conf.level=.90, sims=1000, boot=TRUE, boot.ci.type="bca")

pinusm commented 1 year ago

I'm sorry.. It's been a long time and I don't remember the context in which I encountered the problem, so I don't know where to look for to see if I've found a solution/workaround..

ianaianawong commented 1 year ago

Hi @pinusm, no problem. Thank you for taking the time to let me know :)

grasshoppermouse commented 1 year ago

Same as pinusm. I searched my computer to see if I could find anything, but came up empty.

ianaianawong commented 1 year ago

Hi @grasshoppermouse ,no problem. Thank you for letting me know.

christopherkenny commented 1 year ago

These looks like different problems. @pinusm's issue is about environment scoping, as lm stores the call as-is, which will have the formula as "f1", but f1 will not be defined in mediation's environment. That's a function of how stats::lm works. You can get around that with the usual evaluation trick with do.call.

library(mediation)
#> Loading required package: MASS
#> Loading required package: Matrix
#> Loading required package: mvtnorm
#> Loading required package: sandwich
#> mediation: Causal Mediation Analysis
#> Version: 4.5.0
test <- function(f1, f2, boot){
  mtcars$mpg <- mtcars$mpg/max(mtcars$mpg)
  m_med <- do.call(what = 'lm', list(formula = f1, data = mtcars))
  m_out <- do.call(what = 'lm', list(formula = f2, data = mtcars))
  mediate(m_med, m_out, treat = "wt", mediator = 'am', boot = boot)
}

x <- test('am ~ wt', 'mpg ~ wt + am', FALSE)
summary(x)
#> 
#> Causal Mediation Analysis 
#> 
#> Quasi-Bayesian Confidence Intervals
#> 
#>                Estimate 95% CI Lower 95% CI Upper p-value    
#> ACME            0.00111     -0.03294         0.04    0.94    
#> ADE            -0.15939     -0.20537        -0.12  <2e-16 ***
#> Total Effect   -0.15828     -0.18978        -0.13  <2e-16 ***
#> Prop. Mediated -0.00738     -0.24726         0.20    0.94    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Sample Size Used: 32 
#> 
#> 
#> Simulations: 1000
x <- test('am ~ wt', 'mpg ~ wt + am', TRUE)
#> Running nonparametric bootstrap
summary(x)
#> 
#> Causal Mediation Analysis 
#> 
#> Nonparametric Bootstrap Confidence Intervals with the Percentile Method
#> 
#>                 Estimate 95% CI Lower 95% CI Upper p-value    
#> ACME            0.000246    -0.033198         0.04    0.92    
#> ADE            -0.157900    -0.231674        -0.11  <2e-16 ***
#> Total Effect   -0.157654    -0.203822        -0.12  <2e-16 ***
#> Prop. Mediated -0.001560    -0.212690         0.20    0.92    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Sample Size Used: 32 
#> 
#> 
#> Simulations: 1000

^{Created on 2023-08-03 with reprex v2.0.2}

ianaianawong commented 1 year ago

Hi @christopherkenny, thanks a lot for your help. I'll try out the codes later.

wJDKnight commented 5 months ago

I have the same issue. When boot=TRUE, mediate() will fail with `Running nonparametric bootstrap for ordered outcome

Error in eval(mf, parent.frame()) : object 'f1' not found`

kosukeimai / mediation

Error in stats::model.frame when specifying an lm formula in a function argument, and boot = T #32