amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Biased results under MCAR using MICE #469

Closed ChristosPol closed 2 years ago

ChristosPol commented 2 years ago

Hi all, I am writing here as we are experiencing some unexpected behavior of MICE in our simulations.

The setup goes as follows:

I have a composite score that consists of 4 components. That score is simply the average of the 4 components. Our goal is to multiply impute the missing component values in order to have an unbiased and precise estimate for the score. Our simulation's estimand is the expected value of this score.

In our project, we draw nsim random samples from the original fully observed dataset. Having done that, I introduce missing values in a monotone missing pattern. Simply put, if an entry is missing, it is missing from all components (see the code below for how this is done):

The underlying mechanism is MCAR as these missing values are introduced completely at random.

Once we have the imputed datasets, we use the pooling process to get our estimates.

The code below is a reproducible example with the iris dataset. In reality, the score I am referring to is a patient-reported disease activity score consisting of 6 questions.

rm(list=ls())

library(mice)
library(data.table)
set.seed(1500)

# Dataset to use
data(iris);setDT(iris)

# Remove Species. Not needed
iris[, Species := NULL]

# Calculation of the score
iris[, Score := rowMeans(iris)]

# Number of simulated datasets
nsim <- 2000

# Size of simulated datasets to be drawn from the original complete iris df
df_size <- 100

# Number of rows set to missing (70 of the 100 sampled rows = 70%)
n_miss <- 70

# Let's draw the indices from the original complete dataset to be used
samples <- list()
for (i in 1:nsim){
  samples[[i]] <- sample(1:nrow(iris),
                         df_size,
                         replace = TRUE)
}

# Realize the dataframes
frames <- lapply(X = samples, function(x){ iris[x, ]})

# Introduce missing entries
samples_NA <- list()
for (i in 1:nsim){
  tmp <- copy(frames[[i]])
  tmp[sample(x = 1:nrow(tmp), n_miss), c("Sepal.Length", "Petal.Length", "Petal.Width")] <- NA
  tmp[, Score := NULL]
  samples_NA[[i]] <- tmp
}

# Function to use within the pool function of mice
calc_score <- function(x1, x2, x3, x4) {
  # Intercept-only model for the composite score; its coefficient estimates the score mean
  score <- (x1 + x2 + x3 + x4)/4
  fit <- lm(score ~ 1)
  fit
}

# Mice imputation, settings default
theta_i_imp <- c()
for (i in 1:length(samples_NA)){

  tmp <- copy(samples_NA[[i]])
  # Default mice settings
  x_imp <- mice(tmp,
                printFlag = F)

  fit <- with(x_imp, calc_score(x1 = Sepal.Length,
                                x2 = Sepal.Width,
                                x3 = Petal.Length,
                                x4 = Petal.Width))
  pooled <- pool(fit)
  theta_i_imp[i] <- pooled$pooled$estimate
  print(i)
}

# Calculation of bias and mcse of bias according to Morris et al. (2019), also confidence intervals:
bias_imp <- sum(theta_i_imp-mean(iris$Score))/nsim
bias_imp_SE <- sqrt(sum((theta_i_imp - mean(theta_i_imp))^2) / (nsim*(nsim-1)))
high_imp <- bias_imp + 1.96*bias_imp_SE
low_imp <- bias_imp - 1.96*bias_imp_SE
print(data.frame(low_imp, bias_imp, high_imp))

Bias estimate: 0.05558585, 95% CI [0.04816874, 0.06300296]

Given the results above, I can infer that the estimate of the expected value of the score is biased. This is unexpected, and we cannot understand it. Under MCAR, both complete-case analysis and multiple imputation should be unbiased.

I understand that the dataset size is small (100), and I set 70% of it missing, but it should still be unbiased. Possibly with larger CIs, but still unbiased. I also use the default settings for m and maxit in the mice function.

Obviously, we are missing something and we would appreciate it very much if you could give us some help.

Thank you and best, Chris

gerkovink commented 2 years ago

It seems that you are imputing the composite score as just another variable. It would be correct to impute the deterministic relation via passive imputation. We have a vignette that details passive imputation.

ChristosPol commented 2 years ago

@gerkovink Thank you for your comment.

I did try passive imputation, but I am still getting biased results. It must be something else I am missing.

theta_i_imp <- c()
for (i in 1:length(samples_NA)){

  data <- copy(samples_NA[[i]])
  # Recompute the (partially missing) score so that it can be imputed passively
  data$score <- (data$Sepal.Length + data$Sepal.Width + data$Petal.Length + data$Petal.Width)/4
  # Passive imputation: derive score from its components instead of imputing it directly
  meth <- make.method(data)
  meth["score"] <- "~I((Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)/4)"
  # Do not use score as a predictor for its own components
  pred <- make.predictorMatrix(data)
  pred[c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"), "score"] <- 0
  imp.pas <- mice(data, meth = meth, pred = pred,
                  print = FALSE)
  fit <- with(imp.pas, (Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)/4)
  # Average the score over observations and imputations
  theta_i_imp[i] <- mean(unlist(lapply(fit$analyses, mean)))
  print(i)
}

stefvanbuuren commented 2 years ago

Perhaps you could simplify the problem, and create missing data in only one instead of three variables at a time.

ChristosPol commented 2 years ago

@stefvanbuuren Thank you for your comment. In our simulations we cover a large range of missing-data "scenarios". We test different missing proportions with different numbers of missing variables and sample sizes. We also rotate which variables are missing. Think of it as a fully factorial design. So we do have cases where things are simpler; I just chose one case where I was able to reproduce the problem easily and consistently with the iris dataset.

Regarding your feedback above, why would it make a difference if there are missing values in one variable vs. three?

Thanks again, Christos

stefvanbuuren commented 2 years ago

Just want to make sure that maxit is high enough. It would be good to know whether the problem also occurs with one incomplete variable and with maxit = 1. If things are fine in that case, then you might have a convergence issue.
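
A minimal sketch of that check, for reference, reusing the objects from the opening post (frames, n_miss), with only Sepal.Length made incomplete and maxit = 1; other settings are the defaults:

# Sketch: one incomplete variable, a single iteration of the Gibbs sampler
for (i in 1:length(frames)){
  tmp <- copy(frames[[i]])
  # Only Sepal.Length is set to missing
  tmp[sample(x = 1:nrow(tmp), n_miss), "Sepal.Length"] <- NA
  tmp[, Score := NULL]
  x_imp <- mice(tmp, maxit = 1, printFlag = F)
  # ... followed by the same with()/pool() steps as in the loop above ...
}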

ChristosPol commented 2 years ago

Hi @stefvanbuuren
Same code as above with maxit = 1 (3 variables missing): 0.0323509 [0.02528761, 0.03941419].

If I set only Sepal.Length to NA with maxit = 1 (1 variable missing): -0.0035049 [-0.006931213, -0.00007858704]. So something is still odd.

When I tried only Petal.Width, the estimate came out unbiased.

I have tried increasing m and maxit, but it did not help with my current issue. Inspecting visually, I don't see a problem with convergence either.
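
For reference, the standard visual check would be something along these lines, assuming x_imp is one of the mids objects from the loop above:

# Trace plots of the chain means and SDs of the imputed values across iterations
plot(x_imp)
# Compare the distributions of observed and imputed values
densityplot(x_imp)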

Thanks, Christos

stefvanbuuren commented 2 years ago

The mean of Petal.Width in the complete data is 1.2, so the percent bias (PB) is equal to 100*0.0035/1.2 ≈ 0.3%. One rule of thumb is that we speak of bias when PB > 5%. Here it is clearly lower.

http://stefvanbuuren.name/fimd/sec-evaluation.html#sec:evaluationcriteria
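
As a quick worked check in R, using the numbers from the comment above:

# Percent bias (PB): 100 * |bias| / |true value|
100 * abs(-0.0035) / 1.2
# roughly 0.29%, well under the 5% rule of thumb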

ChristosPol commented 2 years ago

@stefvanbuuren I see, thank you for the reference and the reply. Indeed, the magnitude of bias is extremely small in this case. For us it is very important to understand why, and what could in theory explain this behaviour, especially if you want to trust your analysis and publish some findings.

[image]

MICE vs. complete-case analysis should yield very similar results when we resample thousands of times completely at random in a controlled simulation setting. This was the hypothesis we had even before we started coding. Any additional help understanding this would be greatly appreciated! Thanks again, Christos
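
A complete-case benchmark on the same replicates could be sketched roughly as follows, assuming the samples_NA list and the iris Score from the opening post:

# Sketch: complete-case estimate of the score mean per replicate
theta_i_cc <- sapply(samples_NA, function(d){
  cc <- d[complete.cases(d), ]
  mean((cc$Sepal.Length + cc$Sepal.Width + cc$Petal.Length + cc$Petal.Width)/4)
})
# Bias of the complete-case estimator against the full-data score mean
bias_cc <- mean(theta_i_cc) - mean(iris$Score)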

stefvanbuuren commented 2 years ago

I guess it's a combination of a small sample, predictive mean matching and a skewed distribution that causes the (tiny) bias. Tim Morris and Florian Meinfelder made detailed studies of PMM for regions with sparse data.

Life's not perfect, and I would be happy with a bias of this size. I would be much more worried if we didn't find proper coverage of the confidence interval. Did you look at that also?
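
A minimal coverage check could be bolted onto the existing loop roughly like this, assuming the samples_NA list and calc_score() from the opening post (summary() on the pooled object returns a t-based 95% interval):

theta_true <- mean(iris$Score)
covered <- logical(length(samples_NA))
for (i in 1:length(samples_NA)){
  x_imp <- mice(copy(samples_NA[[i]]), printFlag = F)
  fit <- with(x_imp, calc_score(x1 = Sepal.Length, x2 = Sepal.Width,
                                x3 = Petal.Length, x4 = Petal.Width))
  s <- summary(pool(fit), conf.int = TRUE)
  covered[i] <- s$`2.5 %` <= theta_true & theta_true <= s$`97.5 %`
}
# Proportion of replicates whose 95% interval contains the true score mean
coverage <- mean(covered)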

ChristosPol commented 2 years ago

Hi @stefvanbuuren, thank you, I will have a look at the studies you mentioned. Yes, we look into coverage as well as empirical variance and model variance. We noticed some odd results there as well.

Running our simulation 2000 times, with passive imputation, only 1 variable missing (Sepal.Length), and default settings, I have:

Bias

[image]

Empirical variance

[image]

Looks OK; as expected, the empirical standard error of MICE is larger than the true variance of the complete dataset (red line, analytically calculated).

Model variance

[image]

Now this is a bit problematic. mice's model variance seems to underestimate the true variance. That could indicate that we might have problems with coverage. I would expect a model variance higher than the true variance, resulting in a relative % error in ModSE of around 0%.
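
For concreteness, the two standard-error summaries in Morris et al. (2019) terms, assuming theta_i_imp holds the pooled point estimates and se_i_imp the corresponding pooled standard errors (an extra vector, e.g. collected as sqrt(pooled$pooled$t) inside the loop):

# Empirical SE: spread of the point estimates over the nsim replicates
emp_se <- sqrt(sum((theta_i_imp - mean(theta_i_imp))^2) / (nsim - 1))
# Average model-based SE: root mean square of the per-replicate pooled SEs
mod_se <- sqrt(mean(se_i_imp^2))
# Relative % error in ModSE; values near 0% mean the model SE tracks the empirical SE
100 * (mod_se / emp_se - 1)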

Coverage

[image]

All performance metrics are calculated based on Morris et al. (2019).

[image]

Do you have any input on the model variance and coverage issues? When more variables are missing and we impute with MICE, these problems are amplified.

Thanks again, Christos

stefvanbuuren commented 2 years ago

I suspect coverage will improve if you replace z = 1.96, which assumes infinite samples, with the appropriate quantile of the t-distribution.
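
For example, inside the simulation loop, roughly (using the pooled total variance and the Barnard-Rubin degrees of freedom that pool() returns):

p   <- pool(fit)$pooled        # a single row here: intercept-only model for the score
est <- p$estimate
se  <- sqrt(p$t)               # total variance -> pooled standard error
ci  <- est + c(-1, 1) * qt(0.975, p$df) * se   # t-based interval instead of 1.96 * se
# summary(pool(fit), conf.int = TRUE) returns this same t-based interval directly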