potash / drain

pipeline library
MIT License

Streamline exploration #32

Closed potash closed 7 years ago

potash commented 7 years ago

A few changes to make exploration more streamlined.

First and foremost, use a global step cache. This means that a given step's results are loaded only once (unless the cache is reset), which makes exploring multiple, overlapping workflows much more efficient. It is also useful for reloading steps from failed runs.

Minor changes:

The point is you can now do:

>>> import drain
>>> from drain import model
>>> from test_drain import n_estimators_search
>>> drain.explore(n_estimators_search)\
         .dapply(model.precision, k=[100,200,300])
k              100    200       300
n_estimators                       
1             0.81  0.840  0.836667
2             0.82  0.845  0.713333
3             0.90  0.815  0.723333

And future calls to explore(n_estimators_search) will not reload those results.