suiji / Arborist

Scalable decision tree training and inference.

Rborist crashing when x has column names #44

Closed rafalab closed 5 years ago

rafalab commented 5 years ago

I want to use the caret package to train a random forest. caret requires the predictor matrix to have column names, but Rborist crashes when the matrix has them. I am pretty sure this used to work.

library(Rborist)
Rborist 0.1-17
Type RboristNews() to see new features/changes/bug fixes.
x <- matrix(rnorm(1000*2), 1000, 2)
y <- factor(sample(c(0,1), 1000, replace = TRUE))
fit <- Rborist(x, y) ##gives no error
colnames(x) <- c("a", "b")
fit <- Rborist(x, y) #gives error

The second call gives this error:

libc++abi.dylib: terminating with uncaught exception of type Rcpp::not_compatible: Not compatible with STRSXP: [type=NULL].
Abort trap: 6

The issue seems to come from PreFormat. The following crashes:

library(Rborist)
x <- matrix(rnorm(1000*2), 1000, 2)
colnames(x) <- c("a", "b")
PreFormat(x)

Here is the session info:

sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.3

suiji commented 5 years ago

Thank you for distilling this into a simple test case. It did indeed "work" in the past, but likely by accident. A fix has been placed on GitHub.

Are you able to install a hot fix from GitHub? If not, would you like one sent to you, or would you prefer to wait until the repaired version appears on CRAN?

rafalab commented 5 years ago

I can install from GitHub. Thanks!