swager / randomForestCI

This package is DEPRECATED. Please use the packages `grf` or `ranger` instead, which have built-in confidence intervals.
https://github.com/swager/grf
MIT License

fix error integer overflow with large dataset #10

Closed brianstock closed 7 years ago

brianstock commented 7 years ago

Hello,

I was able to use randomForestInfJack to calculate variances for your toy example and a 16000 row dataset, but ran into an error using it on a 217000 row dataset:

Error in if (B^2 > n * nrow(pred)) { :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In n * nrow(pred) : NAs produced by integer overflow

After looking up the error on Stack Overflow, I found it was a simple fix: `n` and `nrow(pred)` are integers, so their product overflows, and they just need to be converted to class numeric. Adding `as.numeric` on lines 29 and 42 fixed the problem for me.
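Here is a minimal sketch of the overflow and the fix, not the package's actual code. The values of `n`, `n_pred` (a stand-in for `nrow(pred)`), and `B` are hypothetical, chosen only so the integer product exceeds `.Machine$integer.max`:

```r
# Hypothetical sizes; both are integers, as nrow() and length() return.
n <- 217000L
n_pred <- 217000L   # stand-in for nrow(pred)
B <- 1000           # hypothetical number of trees

# integer * integer beyond .Machine$integer.max yields NA with a warning,
# which is what makes the if() condition fail.
overflowed <- suppressWarnings(n * n_pred)
print(is.na(overflowed))

# Converting to numeric (double) first avoids the overflow.
safe <- as.numeric(n) * as.numeric(n_pred)
print(B^2 > safe)   # a valid TRUE/FALSE that if() can use
```

With the conversion in place, the comparison returns a proper logical instead of `NA`, so the `if` statement no longer errors.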

Thanks for your work - I'm pretty excited to see variance estimates for RF.

-Brian

swager commented 7 years ago

Thank you!