stekhoven / missForest

missForest is a nonparametric, mixed-type imputation method for basically any type of data for the statistical software R.
http://stat.ethz.ch/CRAN/web/packages/missForest/index.html

missForest::missForest cannot handle Inf values #35

Closed blueskypie closed 8 months ago

blueskypie commented 8 months ago

If xmis in missForest::missForest contains Inf, the call crashes with the following error:

Error in randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, : NA/NaN/Inf in foreign function call (arg 1)

It would be helpful to handle this case and avoid the crash, e.g. by changing Inf to NA.

stekhoven commented 8 months ago

The crash is provoked by randomForest(), not missForest(). However, having Inf in your data is generally not advisable, as no regression method will be able to handle it (at least not without some tweaking or extra settings). Replacing it with NA will also not help, as this would either provoke the exact same error from randomForest() or break the imputation process of missForest.

Bottom line: avoid having Inf in your data.
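
One way to act on that advice is to check for Inf before imputation starts. The following is a minimal base-R sketch, not part of missForest: the helpers `inf_columns()` and `assert_no_inf()` and the toy data frame are hypothetical names made up for illustration. It only reports where the infinite values are; what to do with them is left to the user.

```r
# Hypothetical helper (not part of missForest): names of the numeric columns
# that contain Inf or -Inf. NA entries are not counted as infinite.
inf_columns <- function(df) {
  has_inf <- vapply(
    df,
    function(col) is.numeric(col) && any(is.infinite(col)),
    logical(1)
  )
  names(df)[has_inf]
}

# Hypothetical guard: stop with a readable message before imputation starts.
assert_no_inf <- function(df) {
  bad <- inf_columns(df)
  if (length(bad) > 0) {
    stop("Inf/-Inf found in column(s): ", paste(bad, collapse = ", "))
  }
  invisible(df)
}

# Toy data, made up for illustration: column a holds an Inf.
xmis <- data.frame(
  a = c(1, Inf, 3, NA),
  b = c(2, 4, NA, 8),
  c = factor(c("x", "y", NA, "x"))
)

bad_cols <- inf_columns(xmis)

# Capture the error message instead of aborting the script.
guard_result <- tryCatch(
  assert_no_inf(xmis),
  error = function(e) conditionMessage(e)
)
```

Once `assert_no_inf(xmis)` passes without an error, the data can be handed to `missForest::missForest(xmis)` as usual.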