amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Add ranger backend for `mice.impute.rf` #431

Closed: prockenschaub closed this 3 years ago

prockenschaub commented 3 years ago

Changes

As described in #264, this adds a ranger backend for imputation via random forest. The new implementation has been checked on the same tasks used in:

Doove, L. L., S. Van Buuren, and E. Dusseldorp. 2014. “Recursive Partitioning for Missing Data Imputation in the Presence of Interaction Effects.” Computational Statistics & Data Analysis 72: 92–104.

Comment

Since the randomForest package (current default) and the ranger package fit the same model class and draw from it in the same way, I have made the choice of backend a new parameter `rfPackage` in `mice.impute.rf`, with randomForest kept as the default for backwards compatibility. If this is not wanted, the same implementation can easily be pulled out into its own `mice.impute.ranger`.
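For illustration, a minimal sketch of how the backend could be selected from a `mice()` call. It assumes a mice version that includes the `rfPackage` argument, that both the ranger and randomForest packages are installed, and that `rfPackage` is forwarded from `mice()` to `mice.impute.rf` through `...` like other method arguments. The backend is named explicitly in both calls so the example does not depend on which one is the default.

```r
library(mice)  # assumption: a mice version that provides rfPackage in mice.impute.rf

# nhanes is a small example dataset with missing values that ships with mice.

# Random forest imputation using the ranger backend.
imp_ranger <- mice(nhanes, method = "rf", rfPackage = "ranger",
                   m = 5, seed = 123, printFlag = FALSE)

# The same imputation model class, fitted with the randomForest backend.
imp_rf <- mice(nhanes, method = "rf", rfPackage = "randomForest",
               m = 5, seed = 123, printFlag = FALSE)

# Extract the first completed dataset and check that no missing values remain.
completed <- complete(imp_ranger, 1)
anyNA(completed)
```

Because the two backends use different random number streams, the imputed values should not be expected to match between `imp_ranger` and `imp_rf`, even with the same seed.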

stefvanbuuren commented 3 years ago

Thanks a lot.

I would prefer the faster method (ranger) as default, and switch to randomForest only if needed for backward compatibility. In that way, future users will get the faster method without further ado.

Would you be able to make ranger the default?

prockenschaub commented 3 years ago

Changed the default to ranger and updated the help pages accordingly.