zachmayer / caretEnsemble

caret models all the way down :turtle:
http://zachmayer.github.io/caretEnsemble/

Speed up extractBestPreds 100x #151

Closed zachmayer closed 9 years ago

zachmayer commented 9 years ago

Try it! microbenchmark(extractBestPreds(x))

I'm seeing a ~115x speedup. I had to make a small change to makePredObsMatrix to use list indexes rather than data.frame indexes, but it doesn't really change anything, since data.frames are lists!
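
To illustrate the list-vs-data.frame indexing point, here is a minimal sketch in base R. The data and column names are made up for illustration and are not taken from makePredObsMatrix; it only shows that both indexing styles return the same column from a data.frame.

```r
# Hypothetical toy data; not from caretEnsemble
df <- data.frame(obs = c(1, 0, 1), pred = c(0.9, 0.2, 0.7))

# A data.frame is a list of equal-length columns
df_is_list <- is.list(df)

# data.frame-style indexing: goes through `[.data.frame`
by_df <- df[, "pred"]

# list-style indexing: pulls the column out of the underlying list
by_list <- df[["pred"]]

# Both styles give back the same numeric vector
same_column <- identical(by_df, by_list)
print(same_column)
```

List-style indexing with `[[` also works on a data.table, so code written this way should not need to care which of the two it receives.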

jknowles commented 9 years ago

Wow, that's genius!

zachmayer commented 9 years ago

Haha, thanks! I had a dataset that was too big for caretEnsemble, and that was making me sad. So I thought I'd try a data.table, and now I can't believe we didn't try this sooner!

I think both caret and caretEnsemble could benefit from using data.table more heavily to avoid copying datasets during cross-validation and ensembling. That could save a lot of memory and time.
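
As a rough sketch of what avoiding copies means here, the snippet below adds a column to a data.table by reference with `:=` and checks that the table's memory address is unchanged. It assumes the data.table package is installed; the data and column names are hypothetical and this is not code from caret or caretEnsemble.

```r
library(data.table)

# Hypothetical toy data; not from caret or caretEnsemble
dt <- data.table(obs = c(1, 0, 1), pred = c(0.9, 0.2, 0.7))

# Record where the table lives in memory before modifying it
addr_before <- address(dt)

# `:=` adds the column in place instead of building a modified copy
dt[, resid := obs - pred]

# Compare the address after the update with the one recorded earlier
same_object <- identical(address(dt), addr_before)
print(same_object)
```

With a plain data.frame, an equivalent `df$resid <- ...` assignment may allocate a modified copy of the object, which is the kind of overhead that adds up across resamples.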