amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Can't access additional packages in futuremice #544

Closed tarensanders closed 1 year ago

tarensanders commented 1 year ago

If you try to use mice with a method provided by an add-on package (e.g., "2l.pmm" from the {miceadds} package), the method doesn't seem to be available when running in parallel with futuremice. That is, the add-on package isn't passed into future_map, so its methods aren't available on the workers.

Here's an example (modified from the miceadds docs):

library(mice)
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following objects are masked from 'package:base':
#> 
#>     cbind, rbind
library(miceadds)
#> * miceadds 3.16-18 (2023-01-06 10:54:00)
set.seed(976)
G <- 30        # number of groups
n <- 8        # number of persons per group
iccx <- .2    # intra-class correlation X
iccy <- .3    # latent intra-class correlation binary outcome
bx <- .4    # regression coefficient
threshy <- stats::qnorm(.70)  # threshold for y
x <- rep( rnorm( G, sd=sqrt( iccx) ), each=n )  +
            rnorm(G*n, sd=sqrt( 1 - iccx) )
y <- bx * x + rep( rnorm( G, sd=sqrt( iccy) ), each=n )  +
                rnorm(G*n, sd=sqrt( 1 - iccy) )
y <- 1 * ( y > threshy )
dat <- data.frame( group=100+rep(1:G, each=n), x=x, y=y )

#* create some missings
dat1 <- dat
dat1[ seq( 1, G*n, 3 ),"y" ]  <- NA
dat1[ dat1$group==2, "y" ] <- NA
vars <- colnames(dat1)
V <- length(vars)
#* predictor matrix
predmat <- matrix( 0, nrow=V, ncol=V)
rownames(predmat) <- colnames(predmat) <- vars
predmat["y", ] <- c(-2,2,0)
#* imputation methods
impmeth <- rep("",V)
names(impmeth) <- vars
#** imputation with predictive mean matching ('2l.pmm')
impmeth["y"] <- "2l.pmm"

imp1 <- mice::mice(
  data = as.matrix(dat1), method = impmeth,
  predictorMatrix = predmat, maxit = 1, m = 5
)
#> 
#>  iter imp variable
#>   1   1  y
#> Loading required namespace: lme4
#> boundary (singular) fit: see help('isSingular')
#> 
#>   1   2  y
#> boundary (singular) fit: see help('isSingular')
#> 
#>   1   3  y
#> boundary (singular) fit: see help('isSingular')
#> 
#>   1   4  y
#> boundary (singular) fit: see help('isSingular')
#> 
#>   1   5  y
#> boundary (singular) fit: see help('isSingular')

imp2 <- mice::futuremice(
  data = as.matrix(dat1), method = impmeth,
  predictorMatrix = predmat, maxit = 1, m = 5, n.core = 2
)
#> Error:
#> ℹ In index: 1.
#> Caused by error in `get()`:
#> ! object 'mice.impute.2l.pmm' not found

#> Backtrace:
#>      ▆
#>   1. ├─parallel (local) workRSOCK()
#>   2. │ └─parallel:::workLoop(...)
#>   3. │   └─parallel:::workCommand(master)
#>   4. │     ├─base::tryCatch(...)
#>   5. │     │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   6. │     │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   7. │     │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   8. │     ├─base::tryCatch(...)
#>   9. │     │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  10. │     │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  11. │     │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  12. │     ├─base::do.call(msg$data$fun, msg$data$args, quote = TRUE)
#>  13. │     └─future (local) `<fn>`(...)
#>  14. │       └─base::eval(expr, envir = envir, enclos = enclos)
#>  15. │         └─base::eval(expr, envir = envir, enclos = enclos)
#>  16. ├─base::tryCatch(...)
#>  17. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  18. │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  19. │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  20. ├─base::withCallingHandlers(...)
#>  21. ├─base::withVisible(...)
#>  22. ├─base::local(...)
#>  23. │ └─base::eval.parent(substitute(eval(quote(expr), envir)))
#>  24. │   └─base::eval(expr, p)
#>  25. │     └─base::eval(expr, p)
#>  26. ├─base::eval(...)
#>  27. │ └─base::eval(...)
#>  28. │   ├─base::withCallingHandlers(...)
#>  29. │   ├─base::do.call(...furrr_map_fn, args)
#>  30. │   └─purrr (local) `<fn>`(.x = 3L, .f = `<fn>`)
#>  31. │     └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>  32. │       ├─purrr:::with_indexed_errors(...)
#>  33. │       │ └─base::withCallingHandlers(...)
#>  34. │       ├─purrr:::call_with_cleanup(...)
#>  35. │       └─.f(.x[[i]], ...)
#>  36. │         └─mice (local) ...furrr_fn(...)
#>  37. │           └─mice::mice(...)
#>  38. │             └─mice:::sampler(...)
#>  39. │               └─mice:::handles.format(paste0("mice.impute.", theMethod))
#>  40. │                 └─base::get(fn)
#>  41. └─base::.handleSimpleError(...)
#>  42.   └─purrr (local) h(simpleError(msg, call))
#>  43.     └─cli::cli_abort(...)
#>  44.       └─rlang::abort(...)

Created on 2023-03-30 with reprex v2.0.2

As you can see, imp1 is fine, but imp2 using futuremice can't find the method.

The solution would seem to be to just pass in additional packages here: https://github.com/amices/mice/blob/3e3e3ca0fa53f1b90fb7142bedf36375d5282e90/R/futuremice.R#L162

Happy to make a PR with this change.
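
A minimal sketch of the idea, outside of mice itself: furrr lets you name packages to attach on each worker through the packages argument of furrr_options(). The function run_on_worker() below is a hypothetical stand-in for the function that futuremice maps over (the real code in R/futuremice.R is not reproduced here). It only checks whether the add-on method can be found by name, which is the lookup that fails in the backtrace above. It assumes {future}, {furrr}, and {miceadds} are installed.

```r
library(future)
library(furrr)

# Hypothetical stand-in for the per-worker function in futuremice();
# it only tests whether the add-on method is visible by name.
run_on_worker <- function(i) {
  exists("mice.impute.2l.pmm", mode = "function")
}

plan(multisession, workers = 2)

# Default options: nothing tells the workers to attach miceadds.
without_pkgs <- future_map_lgl(1:2, run_on_worker)

# With `packages`, each worker attaches miceadds before running the function.
with_pkgs <- future_map_lgl(
  1:2, run_on_worker,
  .options = furrr_options(packages = "miceadds")
)

plan(sequential)  # shut the workers down again

print(without_pkgs)
print(with_pkgs)
```

How this would be exposed in futuremice (for example, as an extra argument that is forwarded to furrr_options()) is a design choice for the maintainers; the sketch only shows the furrr mechanism.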

thomvolker commented 1 year ago

Thanks for posting the issue and providing a (probably the) solution!

This is a known issue (related to #529). Because it also happens for user-defined imputation functions, we are working on a slightly more generic solution (probably this in combination with a globals argument in future_map()), but some verification and checks still need to be done.
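
To illustrate why packages alone would not cover user-defined functions, here is a rough sketch using furrr directly rather than mice. A function defined in the calling session lives in no package, so it has to be exported to the workers as a global. In furrr this goes through the globals argument of furrr_options(). The method mice.impute.mymethod() and the helper lookup() are hypothetical placeholders, and whether a name-based lookup finds the exported object on the worker may depend on the installed {future} version, so treat this as an assumption to verify rather than a guaranteed result.

```r
library(future)
library(furrr)

# Hypothetical user-defined method, named the way mice looks methods up
# ("mice.impute." + method name). The body is a placeholder, not a real imputer.
mice.impute.mymethod <- function(y, ry, x, ...) {
  rep(mean(y[ry]), sum(!ry))
}

# Hypothetical helper: looks the method up by name, as a string, so automatic
# globals detection has no symbol to pick up.
lookup <- function(i) exists("mice.impute.mymethod", mode = "function")

plan(multisession, workers = 2)

# Explicitly export the user-defined function to every worker by name.
found <- future_map_lgl(
  1:2, lookup,
  .options = furrr_options(globals = "mice.impute.mymethod")
)

plan(sequential)  # shut the workers down again

print(found)
```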

tarensanders commented 1 year ago

Ah, you're right that just passing in packages wouldn't help for user-defined functions. Let me know if there's a way you'd like me to contribute.

stefvanbuuren commented 1 year ago

Closing because of #550