ecpolley / SuperLearner

Current version of the SuperLearner R package

Superlearner very slow #118

xhw197 closed this issue 5 years ago

xhw197 commented 5 years ago

Hi,

I am using SuperLearner to build a predictive model. I have about 700,000 rows and 60 columns, but for some reason it runs very slowly. I am running it in parallel on 24 cores. Does anyone have any ideas for improving the speed? Thank you!

ecpolley commented 5 years ago

Hi, the SuperLearner R package isn't optimized for very large datasets. One option is to subsample your dataset and train on the smaller sample. Another is to check out the H2O stacked ensemble implementation by @ledell (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/stacked-ensembles.html).
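
For the subsampling route, a rough sketch could look like the following. This is only an illustration under assumptions, not a tuned recipe: the data are simulated stand-ins for the real 700,000 x 60 table, and the subsample size `n_sub`, the two-learner library, and `V = 5` folds are all placeholder choices to adjust against your own time budget.

```r
# Simulated stand-in for the real data (hypothetical; 60 predictors as in the question)
set.seed(1)
n <- 20000
p <- 60
X <- as.data.frame(matrix(rnorm(n * p), nrow = n))  # columns are named V1..V60
Y <- X$V1 - 0.5 * X$V2 + rnorm(n)

# Draw a simple random subsample of rows, without replacement
n_sub <- 2000  # assumed size; increase until the runtime becomes unacceptable
idx <- sample.int(n, n_sub)
X_sub <- X[idx, , drop = FALSE]
Y_sub <- Y[idx]

# Fit only if the package is installed, so the sketch runs either way
if (requireNamespace("SuperLearner", quietly = TRUE)) {
  library(SuperLearner)
  fit <- SuperLearner(
    Y = Y_sub, X = X_sub,
    family = gaussian(),
    SL.library = c("SL.mean", "SL.glm"),  # small, cheap library (assumed choice)
    cvControl = list(V = 5)               # fewer folds than the default of 10
  )
  # The ensemble fitted on the subsample can still score every row
  pred <- predict(fit, newdata = X, onlySL = TRUE)$pred
}
```

Timing the fit at a few increasing values of `n_sub` gives a feel for how the runtime scales before committing to a full-size run. Keeping the library small and lowering `V` also cuts the number of model fits, since each learner is refit once per fold.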