EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

StackingEstimator if template is used #1134

Open hanshupe opened 3 years ago

hanshupe commented 3 years ago

Is the following behaviour expected? I start TPOT with the fixed template "Selector-Transformer-Regressor", but the final pipeline contains a StackingEstimator step. I am not completely sure, but I don't think I saw this behaviour before the latest TPOT update.

The final pipeline is:

[('rfe',
  RFE(estimator=LassoCV(alphas=None, copy_X=True, cv=5, eps=0.01,
                        fit_intercept=True, max_iter=1121, n_alphas=100,
                        n_jobs=None, normalize=True, positive=False,
                        precompute='auto', random_state=None, selection='cyclic',
                        tol=0.01, verbose=False),
      n_features_to_select=6, step=0.11000000000000001, verbose=0)),
 ('stackingestimator',
  StackingEstimator(estimator=Earth(allow_linear=None, allow_missing=False,
                                    check_every=None, enable_pruning=True,
                                    endspan=None, endspan_alpha=None, fast_K=None,
                                    fast_h=None, feature_importance_type=None,
                                    max_degree=1, max_terms=None,
                                    min_search_points=None, minspan=None,
                                    minspan_alpha=None, penalty=None, smooth=None,
                                    thresh=None, use_fast=None, verbose=0,
                                    zero_tol=None))),
 ('elasticnetcv',
  ElasticNetCV(alphas=None, copy_X=True, cv=4, eps=0.1, fit_intercept=False,
               l1_ratio=0.73, max_iter=1232, n_alphas=100, n_jobs=None,
               normalize=True, positive=False, precompute='auto',
               random_state=None, selection='random', tol=0.01, verbose=0))]

weixuanfu commented 3 years ago

Hmm, Earth can act as both a regressor and a transformer (also see this notebook), which is very unusual. I will refine the logic in TPOT so that it is treated as a regressor instead of a transformer (the regressor type will take priority over the transformer type).
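
The priority rule described above can be sketched roughly as follows. This is only an illustration, not TPOT's actual code: `RegressorMixin` and `TransformerMixin` are stand-in classes that mimic the scikit-learn mixins of the same name, and `operator_type`, `EarthLike`, and `ScalerLike` are hypothetical names introduced for the example.

```python
# Stand-ins for scikit-learn's mixins of the same name, defined here so the
# sketch runs without any third-party dependency.
class RegressorMixin:
    """Marks an estimator that can be used as a regressor."""


class TransformerMixin:
    """Marks an estimator that can be used as a transformer."""


class EarthLike(RegressorMixin, TransformerMixin):
    """Hypothetical estimator that, like Earth, carries both mixins."""


class ScalerLike(TransformerMixin):
    """Hypothetical pure transformer."""


def operator_type(cls):
    """Classify an estimator class, checking the regressor role first.

    Because the regressor check comes first, a class that carries both
    mixins is reported as a regressor, never as a transformer.
    """
    if issubclass(cls, RegressorMixin):
        return "Regressor"
    if issubclass(cls, TransformerMixin):
        return "Transformer"
    return "Unknown"


print(operator_type(EarthLike))   # Regressor
print(operator_type(ScalerLike))  # Transformer
```

If the two checks ran in the opposite order, `EarthLike` would be classified as a transformer. That would be consistent with the pipeline reported above, where Earth fills the Transformer slot of the template wrapped in a StackingEstimator.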

hanshupe commented 3 years ago

Okay, strange. I don't know why it should be a transformer.

weixuanfu commented 3 years ago

I patched this issue in v0.11.6.post1. Please try that version.

hanshupe commented 3 years ago

Thx!