amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Mode of the m = n multiple imputations #304

Closed by tswiebold 3 years ago

tswiebold commented 3 years ago

Hello,

I am new to using the function and I have a couple of questions:

  1. Is it appropriate to summarize the multiple imputations for each missing value by taking the mode (for categorical variables) or the mean or median (for continuous variables)?

  2. Given limited computing power, is it more beneficial to maximize "m" or "maxit"? Ideally, I would like to make both as high as possible, but I am limited by my hardware.

Any help will be greatly appreciated!

Have a great day, tswiebold

gerkovink commented 3 years ago
  1. Summarizing the cells of the multiply imputed data sets into a single set defeats the purpose of multiple imputation. Stef van Buuren explains in detail why you should never do this in Flexible Imputation of Missing Data (FIMD), second edition, section 5.1.2.
  2. Instead of optimising m or maxit, it is better to put effort into a good imputation/nonresponse model. If the analysis/imputation/nonresponse models are congenial, even a moderate m and maxit would yield valid inferences in most scenarios. E.g., in my research I often find that with m=5 and maxit=10 the mice algorithm has reached a converged state that yields valid inferences, even for scenarios with large amounts of missingness (i.e. > 50%). Moreover, if your models are wrong, even a large m and maxit will never yield valid inferences.

That said, too few iterations may leave the algorithm in a state of non-convergence. The m draws from the posterior predictive distribution of the missing data may then be redundant, and asking how many redundant draws you would need is not a sensible question.
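To make point 1 concrete, here is a minimal Python sketch of Rubin's rules, the standard way to combine results from m imputed data sets. It is an illustration only, not mice's own code: the function name `pool_rubin` and all numbers are hypothetical. In mice itself, the corresponding workflow is to fit the analysis on each imputed data set with `with()` and combine the fits with `pool()`.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their squared standard errors.

    Hypothetical helper for illustration; not part of mice.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates, each with a variance")
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, u_bar, b, t


# Made-up results of the same analysis run on m = 5 imputed data sets.
estimates = [10.0, 12.0, 11.0, 9.0, 13.0]
variances = [4.0, 4.0, 4.0, 4.0, 4.0]

q_bar, u_bar, b, t = pool_rubin(estimates, variances)
print(f"pooled estimate: {q_bar:.2f}")
print(f"within variance: {u_bar:.2f}, between variance: {b:.2f}")
print(f"total variance:  {t:.2f}")
```

The between-imputation variance `b` carries the uncertainty due to the missing data. Collapsing the m data sets into one (by mode, mean, or median) leaves a single analysis with no `b` term, so the resulting standard errors are too small.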

All the best,

Gerko

gerkovink commented 3 years ago

Don't get me wrong: a higher m or a larger maxit will never hurt. But at a certain point, you're not getting better inferences. You're then using your computer as a room heater.
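The diminishing return from a larger m can be sketched with Rubin's (1987) approximation for the efficiency of m imputations relative to infinitely many, 1 / (1 + γ/m), where γ is the fraction of missing information. This is a rough illustration under stated assumptions: the function name `relative_efficiency` and the value γ = 0.5 are hypothetical, and the formula covers only the efficiency of the point estimate. It says nothing about maxit or about convergence.

```python
def relative_efficiency(m, fmi):
    """Approximate efficiency of m imputations relative to m -> infinity.

    fmi is the fraction of missing information (gamma), between 0 and 1.
    Hypothetical helper for illustration; not part of mice.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if not 0 <= fmi <= 1:
        raise ValueError("fmi must lie between 0 and 1")
    return 1 / (1 + fmi / m)


# Assumed fraction of missing information; chosen only for illustration.
fmi = 0.5
for m in (5, 10, 20, 50, 100):
    print(f"m = {m:3d}: relative efficiency = {relative_efficiency(m, fmi):.3f}")
```

Under this assumption, most of the attainable efficiency is already reached at small m, and each further increase in m buys less.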

tswiebold commented 3 years ago

@gerkovink Thank you very much for your response! I will have to read over the source you provided!

gerkovink commented 3 years ago

Have a look at the mice vignettes. They may be helpful when you are starting out with the mice algorithm.