amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Defaults when imputing single variable #309

Closed edbonneville closed 3 years ago

edbonneville commented 3 years ago

Dear mice team,

I have noticed that when using mice() on data where only a single variable has missing values, the default maxit = 5 is still used, even though no cycling is needed because the imputation is univariate. For example:

library(mice, warn.conflicts = FALSE)

# Keep bmi as single variable with missings
dat <- subset(x = mice::nhanes2, select = c("age", "bmi"))
imps <- mice(data = dat, printFlag = FALSE)
summary(imps)
#> Class: mids
#> Number of multiple imputations:  5 
#> Imputation methods:
#>   age   bmi 
#>    "" "pmm" 
#> PredictorMatrix:
#>     age bmi
#> age   0   1
#> bmi   1   0
imps[["iteration"]]
#> [1] 5

I was wondering whether, in this case, it would be worth either printing a message/warning when maxit > 1 or setting maxit = 1 internally. This would apply only when predictorMatrix has a single row with non-zero values.

Printing a message is the approach taken in the Stata implementation of mice, which reports the following (see section 6.2 of the relevant JSS article): "Only 1 variable to be imputed, therefore no cycling needed"

I think it could be an informative message for those new to imputation (and save a little computation time), but I also understand if you choose to leave it as is.

Thanks!
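
Editor's sketch of the user-side workaround implied by the request above: count the incomplete columns and pass maxit = 1 when there is only one. The helper suggest_maxit() is hypothetical and not part of mice; it assumes the default where and blocks arguments, and it reuses the Stata wording quoted above for its message.

```r
# Hypothetical helper (NOT part of mice): choose maxit from the number of
# incomplete columns. Assumes the default `where` and `blocks` arguments.
suggest_maxit <- function(data, default = 5L) {
  n_incomplete <- sum(colSums(is.na(data)) > 0)
  if (n_incomplete == 1L) {
    message("Only 1 variable to be imputed, therefore no cycling needed")
    return(1L)
  }
  default
}

# Usage with the example from this issue (skipped if mice is not installed)
if (requireNamespace("mice", quietly = TRUE)) {
  dat <- subset(x = mice::nhanes2, select = c("age", "bmi"))
  imps <- mice::mice(data = dat, maxit = suggest_maxit(dat), printFlag = FALSE)
  print(imps[["iteration"]])
}
```

Passing maxit explicitly leaves the package defaults untouched, so the check stays under the caller's control.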

stefvanbuuren commented 3 years ago

Thanks for your suggestion. More generally, mice does not need to iterate when imputing a monotone missing data pattern.
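
Editor's sketch of what checking for a monotone pattern could look like. The function is_monotone() is a hypothetical base-R helper, not a mice function: it orders the columns from most to least observed and tests that, within each row, no observed value follows a missing one. The mice call afterwards uses the documented visitSequence = "monotone" keyword together with maxit = 1, assuming the default where and blocks arguments.

```r
# Hypothetical helper (NOT part of mice): TRUE if the missing data pattern
# of `data` is monotone after ordering columns by number of observed values.
is_monotone <- function(data) {
  r <- !is.na(data)  # TRUE where a value is observed
  r <- r[, order(colSums(r), decreasing = TRUE), drop = FALSE]
  # In a monotone pattern the observed flags never switch from FALSE back
  # to TRUE when moving left to right within a row.
  all(apply(r, 1, function(row) all(diff(as.integer(row)) <= 0)))
}

# Usage with the example from this issue (skipped if mice is not installed)
if (requireNamespace("mice", quietly = TRUE)) {
  dat <- subset(x = mice::nhanes2, select = c("age", "bmi"))
  if (is_monotone(dat)) {
    imps <- mice::mice(dat, visitSequence = "monotone", maxit = 1,
                       printFlag = FALSE)
  }
}
```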

A complication is that a user-specified where argument to the mice() function could break the monotone pattern, even if the missing data occur in only one variable. Also, a user-specified blocks argument complicates checking. So, while I am sympathetic to the idea, checking the conditions under which exactly the message should be printed is less trivial than it appears. Since "iterating too much" does not hurt in any way (apart from a slight increase in calculation time), I will leave it as-is.
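
Editor's sketch of the where complication, using base R only. The default where in mice is is.na(data); a user may also flag observed cells for imputation, in which case counting columns with NA values no longer tells you how many variables are imputed. The toy data and the helper n_imputed_vars() are illustrative assumptions, not part of mice.

```r
# Hypothetical helper (NOT part of mice): number of variables that have at
# least one cell flagged for imputation in a `where` matrix.
n_imputed_vars <- function(where) sum(colSums(where) > 0)

# Toy data: only bmi has missing values
dat <- data.frame(age = c(1, 2, 3, 1), bmi = c(22.1, NA, 27.4, NA))

where_default <- is.na(dat)       # mice's default: impute missing cells only
where_custom <- where_default
where_custom[1, "age"] <- TRUE    # additionally flag an observed age value

n_imputed_vars(where_default)     # one variable flagged
n_imputed_vars(where_custom)      # two variables flagged, despite one with NAs
```

A check based on the data alone would therefore miss the second case; any automatic message would need to inspect where (and blocks) as well.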

edbonneville commented 3 years ago

Understandable - thanks for considering.