grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

best_linear_projection returns NaN values #1415

Closed pakmasha99 closed 1 month ago

pakmasha99 commented 1 month ago

Hi, I am using the grf package for my research, and I'm currently experiencing an issue with the best_linear_projection function.

I'm using a simulated dataset containing 500 observations and 1000 covariates to check how grf performs on high-dimensional datasets. So far, there have been no issues with lower-dimensional datasets, and there are no missing values.

What could be the reason for the best_linear_projection function to return NaN values? Thank you!

erikcs commented 1 month ago

Hi @pakmasha99, best_linear_projection fits a linear model using OLS, and for that you need more samples than covariates. Fitting the best linear projection on a chosen subset of your X's should work as long as dim(X.subset) < n.
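
For illustration, here is a minimal sketch of that suggestion on simulated data of the same shape (500 observations, 1000 covariates). The data-generating process, the choice of 10 covariates, and the use of variable_importance to pick them are assumptions made for this example, not something prescribed in this thread; any subset with fewer columns than observations should do.

```r
library(grf)

set.seed(1)
n <- 500
p <- 1000
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
# Hypothetical outcome model: the treatment effect depends on X[, 1] only.
Y <- pmax(X[, 1], 0) * W + X[, 2] + rnorm(n)

# The forest itself can be trained on all 1000 covariates.
forest <- causal_forest(X, Y, W)

# Assumption for this sketch: keep the 10 covariates the forest split on most.
imp <- variable_importance(forest)
top <- order(imp, decreasing = TRUE)[1:10]
X.subset <- X[, top]

# 10 covariates plus an intercept is far fewer than n = 500,
# so the OLS step inside best_linear_projection is well defined.
blp <- best_linear_projection(forest, X.subset)
print(blp)
```

Because the subset here is chosen using the same data, the reported standard errors are probably best read as descriptive; a subset fixed in advance avoids that concern.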

pakmasha99 commented 1 month ago

Thank you!