sylvaticus closed this issue 2 years ago
Hi @sylvaticus, what happens when you re-run your code after installing randomForest and/or ranger? Depending on your version of mice, the random forest engine comes from one of these two packages.
I installed randomForest (packageVersion('randomForest') returns ‘4.6.14’), but I still have the problem (even after restarting R).
I then removed ranger, and when trying again I get:
iter imp variable
1 1 V1Package ranger needed. Install from CRAN? (Yes/no/cancel)
It then installs https://cloud.r-project.org/src/contrib/ranger_0.13.1.tar.gz and compiles it (I'm on Linux), but I still get the same error:
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (ranger)
The downloaded source packages are in
‘/tmp/Rtmp4A4q3H/downloaded_packages’
Error in nodes_mis[, i] : incorrect number of dimensions
>
Is there a testPackage("ranger") sort of function in R? I did try the example on the homepage of the ranger package and it works great.
I wasn't able to reproduce the error with mice version 3.13, but after updating to 3.14 I got it too! @stefvanbuuren do you know what might be the problem? Could it have to do with mice:::install.on.demand()?
Version 3.14.0 changes the default package for method rf from "randomForest" to "ranger" (#431). It seems that there is an integration issue with "ranger" that we did not discover earlier.
My reprex yields:
library(mice, warn.conflicts = FALSE)
data <- matrix(c(1.0, 10.5, 1.5, 13.2, 1.8, 8.0, 1.7, 15.0, 23.0, 40.0,
                 2.0, 21.0, 3.3, 38.0, 4.5, -2.3, NA, -2.4),
               nrow = 9, ncol = 2, byrow = TRUE)
df <- data.frame(data)
# In 3.14, ranger is the default
mice.impute.rf(y = df$X1, ry = !is.na(df$X1), x = df[, "X2", drop = FALSE],
               rfPackage = "ranger")
#> Error in nodes_mis[, i]: incorrect number of dimensions
# The "old" randomForest still works
mice.impute.rf(y = df$X1, ry = !is.na(df$X1), x = df[, "X2", drop = FALSE],
               rfPackage = "randomForest")
#> [1] 1.5
Created on 2021-11-29 by the reprex package (v2.0.1)
As a temporary fallback, add the rfPackage argument as mice(..., rfPackage = "randomForest").
@prockenschaub Could you have a look at what might cause the problem, and perhaps add a test file?
The problem arises when there is only a single missing value. In my original code, I didn't account for R's automatic conversion to vector when selecting a single row of a matrix. I submitted a pull request #448 that fixes this behaviour.
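The dimension drop described here can be reproduced in base R without mice. The following is only a minimal sketch: the matrix `nodes` and the logical index `mis` are hypothetical stand-ins, not the actual objects inside mice.impute.rf, and `drop = FALSE` is shown as one common remedy rather than as a statement of what pull request #448 changes.

```r
# Hypothetical stand-in for a matrix of terminal-node ids:
# rows are observations, columns are trees (not the real mice internals).
nodes <- matrix(1:12, nrow = 4, ncol = 3)

# Exactly one row is flagged as missing, as in the reprex earlier in the thread.
mis <- c(FALSE, FALSE, TRUE, FALSE)

# Default subsetting drops the result to a plain vector when one row is selected.
dropped <- nodes[mis, ]
is.null(dim(dropped))  # TRUE: the dim attribute is gone

# Indexing that vector like a matrix raises the error reported in this issue.
msg <- tryCatch(dropped[, 1], error = function(e) conditionMessage(e))
print(msg)

# drop = FALSE keeps a 1 x 3 matrix, so column indexing still works.
kept <- nodes[mis, , drop = FALSE]
dim(kept)
kept[, 1]
```

With two or more missing rows, `nodes[mis, ]` stays a matrix, which would explain why the error only shows up when a single value is missing.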
@prockenschaub Thanks a lot. Yes, I know this glitch too well... :-)
mice 3.14.2 solves the problem.
@sylvaticus Thanks for reporting. @hanneoberman @prockenschaub Thanks for solving.
When I try the random forest imputation I get an "Error in nodes_mis[, i] : incorrect number of dimensions" error. This doesn't happen if I use other imputation methods or if I slightly change the matrix with the input data. Also, which package is mice using for the actual random forest prediction? The documentation says randomForest, but randomForest is not installed (though I remember it did ask me to install a package when I first tried meth='rf' ...).