ClimbsRocks / machineJS

[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
https://github.com/ClimbsRocks/auto_ml

reduce number of times we train each model #159

Closed ClimbsRocks closed 8 years ago

ClimbsRocks commented 8 years ago

There doesn't seem to be much improvement in a given model across different runs of RandomizedSearchCV. Accuracy might improve by a few percentage points, but not dramatically.

I think the most valuable thing we can do by default is to surface which models are useful, and which ones just aren't. And then, of course, to let ensembling take over.
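A minimal sketch of what surfacing the useful models could look like, assuming each model's best cross-validation score has already been collected from the search. The function name `rank_models`, the `tolerance` parameter, and the example scores are all hypothetical, not machineJS code.

```python
def rank_models(best_scores, tolerance=0.05):
    """Split models into useful / dropped by distance from the best score.

    best_scores: dict mapping model name -> best cross-validation score
                 (higher is better). Hypothetical input format.
    tolerance:   a model counts as useful if its score is within this
                 distance of the top model. Illustrative default only.
    """
    if not best_scores:
        return [], []
    # Sort from best to worst so the first entry is the top model.
    ranked = sorted(best_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_score = ranked[0][1]
    useful = [name for name, score in ranked if top_score - score <= tolerance]
    dropped = [name for name, score in ranked if top_score - score > tolerance]
    return useful, dropped


# Made-up scores, purely for illustration.
example_scores = {"RandomForest": 0.86, "LogisticRegression": 0.84, "KNN": 0.71}
useful, dropped = rank_models(example_scores)
```

Only the models in `useful` would then be handed on to ensembling or to further tuning; the cutoff itself is a judgment call.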

Gains of a few percentage points in accuracy are huge in some circumstances, but at that point we'll let the engineers go in and focus that level of tuning on only the models that matter to them. It's an easy adjustment to make. Maybe write a blog post about this.

ClimbsRocks commented 8 years ago

Bumped down to only train each model 3 separate times. This should dramatically shorten overall training times.
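A rough sketch of the idea, assuming the number of search runs per model is a single setting. `N_SEARCH_RUNS`, `repeated_search`, and `fake_search` are hypothetical names; in practice `run_search` could wrap a RandomizedSearchCV fit with `random_state=seed` and return its `best_score_`, but that wiring is left out here.

```python
import random

N_SEARCH_RUNS = 3  # hypothetical setting name; the thread only says "3 separate times"


def repeated_search(run_search, n_runs=N_SEARCH_RUNS, base_seed=0):
    """Call run_search once per seed and return (best_score, all_scores).

    run_search: callable taking a seed and returning a score
                (higher is better).
    """
    scores = [run_search(base_seed + i) for i in range(n_runs)]
    return max(scores), scores


def fake_search(seed):
    """Stand-in for a real search: a base score plus small seeded noise."""
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.02, 0.02)


best, scores = repeated_search(fake_search)
spread = max(scores) - min(scores)  # how much the runs disagree
```

If `spread` stays small across runs, the extra runs are buying little, and total search time grows roughly in proportion to `n_runs`.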