Open stefvanbuuren opened 1 year ago
Cannot reproduce with mice 3.16.8
nh3 <- mice::nhanes2
nh3$chl <- as.character(nh3$chl)
imp <- mice::mice(nh3)
#>
#> iter imp variable
#> 1 1 bmi hyp
#> 1 2 bmi hyp
#> 1 3 bmi hyp
#> 1 4 bmi hyp
#> 1 5 bmi hyp
#> 2 1 bmi hyp
#> 2 2 bmi hyp
#> 2 3 bmi hyp
#> 2 4 bmi hyp
#> 2 5 bmi hyp
#> 3 1 bmi hyp
#> 3 2 bmi hyp
#> 3 3 bmi hyp
#> 3 4 bmi hyp
#> 3 5 bmi hyp
#> 4 1 bmi hyp
#> 4 2 bmi hyp
#> 4 3 bmi hyp
#> 4 4 bmi hyp
#> 4 5 bmi hyp
#> 5 1 bmi hyp
#> 5 2 bmi hyp
#> 5 3 bmi hyp
#> 5 4 bmi hyp
#> 5 5 bmi hyp
#> Warning: Number of logged events: 1
imp$loggedEvents
#>   it im dep     meth out
#> 1  0  0     constant chl
Created on 2023-11-20 with reprex v2.0.2
Ah, thanks. I forgot to mention that my test was run on the support_blocks branch.
I will add a test to that branch to keep this bug from reaching master.
Test added to mice4 branch
I got a report that the error may also appear in the CRAN version, mice 3.16.0. Here's an example and a workaround.
library(mice)
library(dplyr)
packageVersion('mice') # 3.16.0
nh3 <- mice::nhanes2
# add column with a character variable
rin <- c("123456789", "123456788", "123456778", "123456678", "123455678",
"123456799", "123445689", "123445679", "123345689", "122345678",
"223456789", "223456788", "223456778", "223456678", "223455678",
"223456799", "223445689", "223445679", "223345689", "222345678",
"323456799", "323445689", "323445679", "323345689", "322345678")
nh3_data <- nh3 %>% cbind(rin)
# impute train data
imp <- mice(nh3_data, m = 3, seed = 22112)
# use mice.mids and the mids object imp on test data (I used the same data set, but suppose it is new test data)
imp_test <- mice.mids(imp, newdata = nh3_data, maxit = 1)
# If you're unlucky (BUT WHY??) you'll get: Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric
# ad-hoc solution
nh3_data <- nh3_data %>% mutate(rin = as.numeric(rin),
chl = as.numeric(chl))
# the error seems to be caused by character variables, even complete ones that are not imputed
imp_test <- mice.mids(imp, newdata = nh3_data, maxit = 1)
When I run this on my system, everything is fine. However, some users report a crash with Error in colMeans(as.matrix(imp[[j]]), na.rm = TRUE) : 'x' must be numeric. It is not yet clear why the behaviour is inconsistent across systems.
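To compare systems, it may help to see which variables carry non-numeric stored imputations, since `colMeans()` fails when `as.matrix()` turns a data frame with a character column into a character matrix. The sketch below is base R only. `imp_column_classes()` is a hypothetical helper, not part of mice, and `toy_imp` is a hand-built stand-in for the `imp` component of a mids object.

```r
# Hypothetical helper (not part of mice): report the class of the stored
# imputations for each variable in a list of data frames such as imp$imp.
imp_column_classes <- function(imp_list) {
  vapply(imp_list, function(d) {
    # Variables without stored imputations are reported as NA
    if (is.null(d) || ncol(d) == 0) return(NA_character_)
    class(d[[1]])[1]
  }, character(1))
}

# Toy stand-in for imp$imp, built by hand for illustration only
toy_imp <- list(
  bmi = data.frame(`1` = c(22.5, 27.2), `2` = c(30.1, 26.3),
                   check.names = FALSE),
  chl = data.frame(`1` = c("187", "204"), `2` = c("199", "218"),
                   check.names = FALSE, stringsAsFactors = FALSE)
)

classes <- imp_column_classes(toy_imp)
print(classes)
```

On a real object the call would presumably be `imp_column_classes(imp$imp)`. Posting that output together with `sessionInfo()` from an affected and an unaffected machine could narrow down where the two diverge.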
Describe the bug
MICE crashes on an incomplete character variable
To Reproduce
Created on 2023-11-20 with reprex v2.0.2
Expected behavior
mice() should not touch or impute character variables.
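Until that behaviour is guaranteed, one possible user-side approach is to keep character columns away from mice() altogether and re-attach them afterwards. Unlike the `as.numeric()` workaround, this also works for identifiers that are not made of digits, since `as.numeric()` would turn those into `NA`. This is a minimal base R sketch. `split_character_cols()` is a hypothetical helper, not part of mice, and `toy` is made-up data.

```r
# Hypothetical helper (not part of mice): separate character columns
# from the columns that should be passed on for imputation.
split_character_cols <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  list(
    to_impute = df[, !is_chr, drop = FALSE],
    set_aside = df[, is_chr, drop = FALSE]
  )
}

# Made-up data: one incomplete numeric column, one character identifier
toy <- data.frame(
  bmi = c(22.5, NA, 27.2),
  rin = c("123456789", "123456788", "123456778"),
  stringsAsFactors = FALSE
)

parts <- split_character_cols(toy)
print(names(parts$to_impute))
print(names(parts$set_aside))

# After imputing parts$to_impute, the identifier can be re-attached by row.
# Here the unimputed columns are used just to show the round trip.
restored <- cbind(parts$to_impute, parts$set_aside)
```

In practice `parts$to_impute` would be passed to `mice()`, and the set-aside columns bound back onto each completed data set. This assumes row order is preserved between the two steps.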