zachmayer closed this 9 years ago
Wow, that's genius!
Hahah thanks! I had a dataset that was too big for caretEnsemble, and that was making me sad. So I thought I'd try a data.table, and now I can't believe we didn't try this sooner!
I think both caret and caretEnsemble could benefit from using data.table more heavily to prevent copying of datasets during cross-validation and ensembling. Could save a lot of memory and time.
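For anyone wondering what "prevent copying" looks like in practice, here is a minimal sketch. It assumes the data.table package is installed, and the toy columns (obs, pred, resid) are made up for illustration; this is not code from caret or caretEnsemble.

```r
library(data.table)

# Toy predictions table (hypothetical data, not from caretEnsemble).
dt <- data.table(obs = c(1, 0, 1), pred = c(0.9, 0.2, 0.6))

# Record where the table lives in memory before modifying it.
before <- address(dt)

# `:=` adds the column by reference, so the table itself is not copied.
dt[, resid := obs - pred]

# Record the address again after the update.
after <- address(dt)

# Compare the two addresses to check that the same object was updated.
same_object <- identical(before, after)
print(same_object)
```

The same kind of update on a plain data.frame (df$resid <- ...) goes through a replacement function that builds a new object, which is where the extra memory tends to go during resampling.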
Try it: microbenchmark(extractBestPreds(x))
I’m seeing a ~115x speedup. I had to make a small change to makePredObsMatrix to use list indexes rather than data.frame indexes, but it doesn’t really change anything, since data.frames are lists!
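To illustrate the list-index point, here is a rough base-R sketch. The preds data frame and its column names are hypothetical stand-ins, not the actual objects inside makePredObsMatrix, and the timing loop is only a crude comparison (microbenchmark would give better numbers).

```r
# Toy data frame standing in for a table of predictions (hypothetical data).
set.seed(1)
preds <- data.frame(obs = rnorm(1000), m1 = rnorm(1000), m2 = rnorm(1000))

# data.frame-style indexing: goes through `[.data.frame`, which handles
# row/column arguments and dropping dimensions on every call.
col_df <- preds[, "m1"]

# List-style indexing: a data.frame is a list of columns, so `[[` pulls
# the column out as a list element.
col_list <- preds[["m1"]]

# Both routes return the same vector.
print(is.list(preds))
print(identical(col_df, col_list))

# Crude timing of repeated extraction; results vary by machine.
n <- 10000
t_df <- system.time(for (i in seq_len(n)) preds[, "m1"])[["elapsed"]]
t_list <- system.time(for (i in seq_len(n)) preds[["m1"]])[["elapsed"]]
cat("data.frame index:", t_df, "s; list index:", t_list, "s\n")
```

Because the extracted values are identical, swapping one form for the other should not change results, only how much method-dispatch overhead is paid per call.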