grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

Error in t.default(T) : argument is not a matrix #309

Closed ZhangMengxia closed 6 years ago

ZhangMengxia commented 6 years ago

Dear grf developers,

I keep getting the following error:

Error in t.default(T) : argument is not a matrix

after I run:

tau.forest.centered = causal_forest(X, Y_centered, W_centered, tune.parameters = TRUE)

I checked all the inputs X, Y_centered, and W_centered. They all look normal and contain no NaN values. X is 57914 by 48; some of its columns are continuous real numbers and some are binary dummy variables. Y_centered and W_centered are each 57914 by 1.

Y_centered and W_centered are the dependent variable and the treatment, respectively, centered as Y_centered = Y - Y.hat and W_centered = W - W.hat. Y.hat is computed with the following code, and W.hat similarly:

forest.Y = regression_forest(X, Y, tune.parameters = TRUE)
Y.hat = predict(forest.Y)$predictions

The following is the output of sessionInfo():

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] survival_2.42-3 plm_1.6-6 Formula_1.2-3 grf_0.10.0

loaded via a namespace (and not attached):
 [1] bdsmatrix_1.3-3 Rcpp_0.12.18 lattice_0.20-35 zoo_1.8-3 lmtest_0.9-36 MASS_7.3-50 grid_3.5.1 nlme_3.1-137 miscTools_0.6-22 Matrix_1.2-14 sandwich_2.4-0
[12] splines_3.5.1 tools_3.5.1 compiler_3.5.1 DiceKriging_1.5.5 maxLik_1.3-4

Best regards, Mengxia.

ZhangMengxia commented 6 years ago

I still get the same error when I replace X with an all-zero vector and re-run:

tau.forest.centered = causal_forest(X, Y_centered, W_centered, tune.parameters = TRUE)

So I think there is something wrong with Y_centered and/or W_centered.

To check Y_centered and W_centered, I also tried lm(Y_centered ~ W_centered), and the least-squares fit runs normally.

Regards, Mengxia.
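
The input checks described in the comments above (dimensions, missing values, whether a column is constant) could be gathered into one small helper. The following is a minimal sketch in base R only; `describe_input` is a hypothetical name and not part of grf, and the simulated `Y_centered` and `W_centered` are small stand-ins for the 57914-by-1 inputs in the report. These checks only rule out malformed input; they are not claimed to resolve the error reported here.

```r
# Hypothetical helper (not part of grf): summarise properties of an input
# that are worth ruling out before fitting a forest.
describe_input <- function(x) {
  m <- as.matrix(x)                     # works for vectors, matrices, data frames
  list(
    class      = class(x)[1],
    dims       = dim(x),                # NULL for a plain vector
    n_missing  = sum(is.na(m)),         # is.na() counts both NA and NaN
    n_infinite = sum(is.infinite(m)),
    constant   = length(unique(as.vector(m))) == 1
  )
}

# Small simulated stand-ins for the inputs described in the report.
set.seed(1)
n <- 200
Y_centered <- matrix(rnorm(n), ncol = 1)   # n-by-1 matrix, as in the report
W_centered <- matrix(rnorm(n), ncol = 1)

str(describe_input(Y_centered))
str(describe_input(W_centered))

# as.numeric() drops the dim attribute, giving plain vectors of length n.
Y_vec <- as.numeric(Y_centered)
W_vec <- as.numeric(W_centered)
```

Comparing the summaries for a one-column matrix and for the plain vector makes it easy to see which form is actually being passed to the fitting function.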

ZhangMengxia commented 6 years ago

When I remove "tune.parameters = TRUE", the problem goes away.

halflearned commented 6 years ago

Hi @ZhangMengxia, thanks a lot for the feedback! This is a known bug (see #252), and is caused by our tuning algorithm. We are currently working towards using alternative methods that are more robust. Until then, you'll unfortunately have to use traditional cross-validation methods for tuning.
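
As one possible reading of "traditional cross-validation methods", the sketch below runs a plain k-fold search over a few `min.node.size` values instead of passing `tune.parameters = TRUE`. The helper `cv_mse`, the candidate grid, and the simulated data are all assumptions made for illustration, not part of grf. It tunes a `regression_forest` (the kind used for Y.hat and W.hat in the report) because held-out squared error is a natural criterion there; a causal forest would need a different criterion, which this sketch does not attempt.

```r
# Hypothetical helper (not part of grf): k-fold cross-validated mean squared
# error for any model given as a fit function and a predict function.
cv_mse <- function(X, Y, fit_fn, predict_fn, k = 5) {
  n <- nrow(X)
  fold <- sample(rep(seq_len(k), length.out = n))   # random fold labels
  fold_mse <- numeric(k)
  for (j in seq_len(k)) {
    test <- fold == j
    model <- fit_fn(X[!test, , drop = FALSE], Y[!test])
    pred <- predict_fn(model, X[test, , drop = FALSE])
    fold_mse[j] <- mean((Y[test] - pred)^2)
  }
  mean(fold_mse)
}

# Usage with grf, run only if the package is installed.
if (requireNamespace("grf", quietly = TRUE)) {
  set.seed(1)
  n <- 500; p <- 5
  X <- matrix(rnorm(n * p), n, p)      # simulated data, for illustration only
  Y <- X[, 1] + rnorm(n)

  candidates <- c(5, 20, 50)           # illustrative min.node.size values
  scores <- sapply(candidates, function(s) {
    cv_mse(
      X, Y,
      fit_fn = function(x, y) {
        grf::regression_forest(x, y, min.node.size = s, num.trees = 500)
      },
      predict_fn = function(f, x) predict(f, x)$predictions
    )
  })
  print(data.frame(min.node.size = candidates, cv_mse = scores))
  best <- candidates[which.min(scores)]  # value with the lowest held-out error
}
```

Passing the fit and predict steps as functions keeps the fold logic independent of grf, so the same loop can be reused for other parameters such as `mtry`.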

jtibshirani commented 6 years ago

I'm closing this in favor of #252, but thanks again for the detailed report and apologies about this instability.