EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0
9.72k stars 1.57k forks

Documentation - TPOT vs sklearn - coverage #204

Open mglowacki100 opened 8 years ago

mglowacki100 commented 8 years ago

It would be nice to have a table comparing TPOT operators with sklearn operators. AFAIK not all sklearn operators are included in TPOT. It could be used as:

rhiever commented 8 years ago

Good idea. I've added this to the enhancements list.

danthedaniel commented 8 years ago

This is our current coverage:

westonplatter commented 8 years ago

@teaearlgraycold for someone coming brand new into the project, is there a good example of what needs to be done for the list you just posted?

danthedaniel commented 8 years ago

@westonplatter - here's the code I used

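# Star-import sklearn so its submodules load and every estimator subclass is discoverable
from sklearn import *
import sklearn
import tpot

def all_subclasses(cls):
    """Recursively collect every direct and indirect subclass of cls."""
    return cls.__subclasses__() + [g for s in cls.__subclasses__() for g in all_subclasses(s)]

# Names of the operators TPOT currently wraps
tpot_estimators = {x.__name__ for x in tpot.operators.Operator.inheritors()}
# Public sklearn estimators, skipping private classes and abstract "Base*" classes
sklearn_estimators = {x.__name__ for x in all_subclasses(sklearn.base.BaseEstimator)
    if not x.__name__.startswith(("_", "Base"))}

# Print a checklist in alphabetical order, marking the estimators TPOT already covers
for est in sorted(sklearn_estimators):
    marker = 'X' if est in tpot_estimators else ' '
    print("- [{}] {}".format(marker, est))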
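
For anyone who wants to see how the subclass walk and the checklist fit together without installing sklearn or the development version of TPOT, here is a minimal sketch on a toy class hierarchy. Every class name in it (ToyEstimator, BaseToyForest, ToyForest, ToyLinear, _ToyHelper) and the coverage_checklist helper are hypothetical stand-ins, not part of sklearn or TPOT; only the `__subclasses__()` recursion and the name filter mirror the script above.

```python
def all_subclasses(cls):
    """Return every direct and indirect subclass of cls, without duplicates."""
    found = []
    for sub in cls.__subclasses__():
        if sub not in found:
            found.append(sub)
        for deeper in all_subclasses(sub):
            if deeper not in found:
                found.append(deeper)
    return found


def coverage_checklist(candidates, covered):
    """Build one checklist line per candidate name, marking the covered ones."""
    lines = []
    for name in sorted(candidates):
        marker = "X" if name in covered else " "
        lines.append("- [{}] {}".format(marker, name))
    return lines


# Hypothetical stand-ins for a library's estimator hierarchy.
class ToyEstimator:
    pass


class BaseToyForest(ToyEstimator):  # abstract-style "Base*" class, filtered out
    pass


class ToyForest(BaseToyForest):  # indirect subclass, found by the recursion
    pass


class ToyLinear(ToyEstimator):
    pass


class _ToyHelper(ToyEstimator):  # private class, filtered out
    pass


# Keep only public, non-"Base*" class names, as in the script above.
library_names = {
    c.__name__
    for c in all_subclasses(ToyEstimator)
    if not c.__name__.startswith(("_", "Base"))
}

# Hypothetical: the names the wrapping tool already exposes.
wrapped_names = {"ToyForest"}

checklist = coverage_checklist(library_names, wrapped_names)
print("\n".join(checklist))
```

Note that `__subclasses__()` only sees classes that have already been defined or imported, which is why the script above star-imports sklearn before walking the hierarchy.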

westonplatter commented 8 years ago

@teaearlgraycold thanks for the example. I'll take a look.

danthedaniel commented 8 years ago

@westonplatter, that code requires the version of TPOT that is currently in the development branch (0.5).