amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

ampute.discrete failing when input data set contains character/categorical variables #611

Closed imazubi closed 7 months ago

imazubi commented 11 months ago

Describe the bug

This is not a bug, but I feel it is quite an important enhancement.

After some fairly deep debugging, I found the following:

The ampute() function contains the line data <- as.data.frame(sapply(data, as.numeric)), which converts the variables to numeric. If a variable is a character (e.g. treatment arm), its values are converted to NA; a factor is instead replaced by its integer level codes.
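The effect of that coercion can be seen on a toy data frame. This is a minimal sketch in base R only; the column names are made up for illustration and nothing here calls mice. Note that base R coerces a factor to its integer level codes, while a character column whose values are not numbers becomes NA.

```r
# Toy data; the column names are hypothetical.
dat <- data.frame(
  age     = c(34, 51, 47),
  arm_chr = c("placebo", "active", "placebo"),
  arm_fct = factor(c("placebo", "active", "placebo")),
  stringsAsFactors = FALSE
)

# Same coercion as the line quoted above. suppressWarnings() hides the
# "NAs introduced by coercion" warning raised for the character column.
converted <- suppressWarnings(as.data.frame(sapply(dat, as.numeric)))

converted$arm_chr  # every value is NA
converted$arm_fct  # integer codes of the (alphabetically sorted) levels
```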

I was using ampute.discrete (cont = FALSE) because I wanted to work with the odds parameter. There, the line scores <- apply(candidates, 1, function(x) weights[i, ] %*% x) generates a vector of NAs even when the weight for this variable is 0, because 0 * NA is still NA in R. The following screenshot shows the resulting scores output (of length 2, as I had two missing data patterns).

[Screenshot: scores output]

ampute.discrete then throws the following error, which is hard to interpret unless you step through the mice internals.

Error in if (scores[[i]][[1]] == 0) { : missing value where TRUE/FALSE needed
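The same kind of failure can be reproduced outside mice. The sketch below is base R only and uses made-up objects (weights, candidates) that mimic the shapes described above; in the real function scores is a nested list, which is simplified here to a plain vector.

```r
# One pattern, two variables; the second variable has weight 0.
weights <- matrix(c(1, 0), nrow = 1)

# Hypothetical candidates: the second column is all NA after coercion.
candidates <- data.frame(x = c(1.2, 0.7), arm = c(NA_real_, NA_real_))

i <- 1
scores <- apply(candidates, 1, function(x) weights[i, ] %*% x)
scores  # both scores are NA, because 0 * NA is NA

# Using an NA comparison as an if() condition raises an error;
# tryCatch() captures its message instead of stopping the script.
res <- tryCatch(
  if (scores[[1]] == 0) "zero" else "nonzero",
  error = function(e) conditionMessage(e)
)
res
```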

What about adding an assertion to make sure there is no character or factor variable in the input data set when trying to use ampute.discrete?
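Such an assertion could look roughly like the sketch below. The helper name assert_all_numeric and its error message are hypothetical; this is not part of mice, only an illustration of the kind of check being proposed.

```r
# Hypothetical helper, not part of mice: stop early with a readable
# message when any column is not numeric.
assert_all_numeric <- function(data) {
  data <- as.data.frame(data)
  bad <- names(data)[!vapply(data, is.numeric, logical(1))]
  if (length(bad) > 0) {
    stop(
      "Data should contain only numeric variables; non-numeric: ",
      paste(bad, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Usage: passes silently for numeric data, fails clearly otherwise.
ok <- assert_all_numeric(data.frame(age = c(34, 51)))
msg <- tryCatch(
  assert_all_numeric(data.frame(age = c(34, 51), arm = c("a", "b"))),
  error = function(e) conditionMessage(e)
)
msg
```

A user could run a check like this before calling ampute, without any change to the package itself.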

With ampute.continuous(), the condition else if (length(unique(scores.temp)) == 1) inside the function is TRUE because all scores are NA, as in the screenshot above. This produces the following warning, but the function does not throw an error:

warning(paste("The weighted sum scores of all candidates in pattern", i, "are the same, they will be amputed with probability", prop), call. = FALSE)
        probs <- prop
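That branch is taken because unique() collapses repeated NAs into a single value, so an all-NA score vector looks like "all candidates have the same score". A small base R illustration follows; the stricter check at the end is only a suggestion, not what mice does.

```r
# All scores are NA, as in the situation described above.
scores.temp <- c(NA_real_, NA_real_, NA_real_)

# unique() returns a single NA, so the condition holds.
all_same <- length(unique(scores.temp)) == 1

# A stricter (hypothetical) variant that does not treat NAs as equal scores.
all_same_and_valid <- !anyNA(scores.temp) && length(unique(scores.temp)) == 1
```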

My suggestion would be to prevent the user from adding non-numerical variables to the mice::ampute function.

@stefvanbuuren

Thank you for working on this amazing package!

stefvanbuuren commented 7 months ago

I don't think we should prevent users from doing something, even if that something is an undocumented action. So let's continue to make errors, and learn from that.