shane-kercheval / oo-learning

Python machine learning library based on Object Oriented design principles; the goal is to allow users to quickly explore data and search for top machine learning algorithm candidates for a given dataset
MIT License

Utilize same splits for all models in Searcher/Tuner/Stacker/etc. #1

Open shane-kercheval opened 6 years ago

shane-kercheval commented 6 years ago

In some of the model_aggregation objects we are doing repeated cross-validation across multiple models, and I'm wondering if there is an opportunity to improve performance. Currently, for example, the ModelStacker (and the Tuner, and therefore the Searcher) uses a separate Resampler object for each model. Rather than looping through the splits hundreds of times (depending on the number of models), we could take advantage of the same splitting (e.g., per repeat/fold index) and transform/train/predict across all models once per fold. This would probably involve refactoring the Resampler to take multiple models or ModelInfos.
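
A minimal sketch of the idea, not the oo-learning API: the function name `resample_shared_splits` and the `models` dict of factory callables are hypothetical stand-ins for a Resampler that accepts multiple models/ModelInfos, and scikit-learn's `RepeatedKFold` is used here only to illustrate reusing identical repeat/fold indices for every candidate model.

```python
# Hypothetical sketch: score several models over the SAME repeated CV splits,
# so per-fold indexing/splitting work happens once instead of once per model.
import numpy as np
from sklearn.model_selection import RepeatedKFold


def resample_shared_splits(models, X, y, scorer, n_splits=5, n_repeats=5, seed=42):
    """`models` maps name -> zero-arg factory returning an unfitted estimator
    with fit/predict (a stand-in for the library's ModelInfo objects)."""
    splitter = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    scores = {name: [] for name in models}
    for train_idx, test_idx in splitter.split(X):
        # The split (and any shared transformations) is computed once per fold...
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        # ...then every candidate model is fit and evaluated on that same fold.
        for name, make_model in models.items():
            model = make_model()  # fresh instance per fold
            model.fit(X_train, y_train)
            scores[name].append(scorer(y_test, model.predict(X_test)))
    return {name: float(np.mean(vals)) for name, vals in scores.items()}
```

With this shape, the outer loop runs once over the repeat/fold indices regardless of how many models are being compared, which is the performance win described above; the trade-off is that the Resampler (or its replacement) has to manage multiple models' results within a single pass.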