amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

`method` argument of `mice` ignores variable names? #632

Closed: santikka closed this issue 3 months ago

santikka commented 3 months ago

Perhaps this is intended behavior, but the `method` argument of `mice()` seems to ignore its names. Because `predictorMatrix` uses named rows and columns, I would expect that `method` could also be a named vector, so that the column order of the data would not matter. Apparently this is not the case:

set.seed(0)
library("mice")

n <- 100
d <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  z = rnorm(n)
)
d$y[sample(n, size = n %/% 2)] <- NA
d$x[sample(n, size = n %/% 2)] <- NA
pred_mat <- matrix(
  c(0, 0, 0, 0, 0, 0, 1, 1, 0),
  3, 3,
  dimnames = list(c("x", "y", "z"), c("x", "y", "z"))
)

method <- c(x = "norm", y = "norm", z = "")
out <- mice(d, m = 1, predictorMatrix = pred_mat, method = method)
#> 
#>  iter imp variable
#>   1   1  x  y
#>   2   1  x  y
#>   3   1  x  y
#>   4   1  x  y
#>   5   1  x  y

# y gets imputed
sum(is.na(complete(out, 1)$y))
#> [1] 0

method2 <- c(x = "norm", z = "", y = "norm")
out2 <- mice(d, m = 1, predictorMatrix = pred_mat, method = method2)
#> 
#>  iter imp variable
#>   1   1  x
#>   2   1  x
#>   3   1  x
#>   4   1  x
#>   5   1  x

# y does not get imputed
sum(is.na(complete(out2, 1)$y))
#> [1] 50

Created on 2024-03-27 with reprex v2.1.0

stefvanbuuren commented 3 months ago

This is not a bug. The documentation of the `mice()` function states:

"For the j'th column, mice() calls the first occurrence of paste('mice.impute.', method[j], sep = '') in the search path."

Thus, `method` is matched to the columns of the data by position, not by name, so changing the order of the vector changes the imputation model.
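
Since the entries of `method` are matched by position, one way to guard against an order mismatch is to reorder a named vector by the column names of the data before passing it to `mice()`. The following is a minimal sketch in base R, assuming the names of the vector are exactly the column names of the data; the small data frame and the name `method_ordered` are made up for illustration and are not part of the package.

```r
# Toy data with the same column order as in the report: x, y, z
d <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, 2, 3),
  z = c(1, 2, 3)
)

# Named method vector whose order does not match the columns of d
method2 <- c(x = "norm", z = "", y = "norm")

# Fail early if the names do not cover exactly the columns of d
stopifnot(setequal(names(method2), names(d)))

# Reorder by name so that entry j corresponds to column j of d
# (method_ordered is a hypothetical name used only in this sketch)
method_ordered <- method2[names(d)]
print(method_ordered)
```

The reordered vector can then be supplied as `method = method_ordered` in the `mice()` call, so that the `"norm"` entries line up with `x` and `y` regardless of how the vector was originally written.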

santikka commented 3 months ago

@stefvanbuuren thank you for the clarification.