EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

Class that scales both X and Y? #824

Open agsci2017 opened 5 years ago

agsci2017 commented 5 years ago

Hello,

In the source code of TPOT and sklearn, I can't find any class that transforms both X and y (scaling the target is necessary for the Multi-layer Perceptron regressor).

How should a class that scales both X and y be organized? Does TransformerMixin support this?

weixuanfu commented 5 years ago

I think this is related to this issue, and I don't think transforming both X and y is supported by the sklearn API.
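To illustrate the limitation, here is a minimal sketch with made-up toy data: a standard sklearn transformer accepts `y` in `fit`, but `transform` only ever returns a transformed `X`, so the target has to be scaled in a separate step.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration only.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y = np.array([100.0, 200.0, 300.0])

# y may be passed to fit/fit_transform, but StandardScaler ignores it
# and the return value contains the transformed X only.
x_scaler = StandardScaler()
X_scaled = x_scaler.fit_transform(X, y)

# Scaling the target needs its own transformer, applied by hand.
# StandardScaler expects 2D input, hence the reshape.
y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

# Predictions made on the scaled target must be mapped back manually.
y_restored = y_scaler.inverse_transform(y_scaled.reshape(-1, 1)).ravel()
```

Because the second scaler lives outside the pipeline, nothing in a plain `Pipeline` applies or inverts it automatically.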

MisterVulcan commented 5 years ago

sklearn.compose.TransformedTargetRegressor is an example of how this may work as a meta-estimator compliant with the sklearn API; another option might be Pipegraph, with a TPOTRegressor's y input/output connected to a separate transformer "node." I don't know whether TPOT classes are compatible with Pipegraph, though; I am planning to try this out eventually.
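A minimal sketch of the meta-estimator approach, assuming plain sklearn rather than TPOT and using synthetic data invented for illustration: the inner pipeline scales X, while `TransformedTargetRegressor` scales y before fitting and inverts the scaling on predictions.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with features and target on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 50.0, 1000.0])
y = 1e4 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(size=200)

# The inner pipeline handles X: scale features, then fit the MLP.
inner = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
)

# The wrapper handles y: it is standardized before fitting, and
# predictions are passed through the inverse transform.
model = TransformedTargetRegressor(regressor=inner, transformer=StandardScaler())
model.fit(X, y)

# Predictions come back on the original scale of y.
pred = model.predict(X[:5])
```

The fitted target scaler is available afterwards as `model.transformer_`, and the fitted inner pipeline as `model.regressor_`. Whether this wrapper can be searched over by TPOT is not something this sketch addresses.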

AFAIK seglearn is the only sklearn-related project that extends sklearn.pipeline.Pipeline to explicitly allow transformers to modify targets without the use of meta-estimators; it has its own custom Pype class that allows this. You might follow their schema to write a custom transformer, with the understanding that the resulting class will not work in a vanilla Pipeline. Depending on your use case, this might not be what you want.