alteryx / evalml

EvalML is an AutoML library written in Python.
https://evalml.alteryx.com
BSD 3-Clause "New" or "Revised" License

Add ccp_alpha for pruning to Tree based estimators #2007

Open ParthivNaresh opened 3 years ago

ParthivNaresh commented 3 years ago

As of sklearn version 0.22, ccp_alpha has been added as a pruning parameter for Decision Trees, Extra Trees, and Random Forests.

Adding this as a hyperparameter would give AutoML an additional parameter to search over and could help prevent overfitting, a common issue with trees that grow too large.

https://scikit-learn.org/stable/modules/tree.html#minimal-cost-complexity-pruning
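For anyone unfamiliar with the parameter, here is a minimal sketch of what ccp_alpha does, using scikit-learn directly rather than EvalML's estimator wrappers. The dataset and the value 0.01 are arbitrary choices for illustration, not proposed defaults.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Small bundled dataset, used here only for illustration.
X, y = load_breast_cancer(return_X_y=True)

# ccp_alpha defaults to 0.0, which means no pruning.
unpruned = DecisionTreeClassifier(random_state=0).fit(X, y)

# A positive ccp_alpha prunes subtrees whose complexity cost
# outweighs the impurity reduction they provide.
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X, y)

print("unpruned nodes:", unpruned.tree_.node_count)
print("pruned nodes:  ", pruned.tree_.node_count)

# The pruning path lists the effective alphas at which the tree changes,
# which could inform a sensible search range for the hyperparameter.
path = unpruned.cost_complexity_pruning_path(X, y)
ccp_alphas = path.ccp_alphas
print("candidate alphas:", len(ccp_alphas))
```

The same keyword is accepted by the Extra Trees and Random Forest estimators, so the hyperparameter could in principle be exposed the same way for each.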

This could be broken up into 3 issues, one for each estimator class.

dsherry commented 3 years ago

> This could be broken up into 3 issues, one for each estimator class.

Let's start with one of our tree-based estimators and demonstrate a performance improvement. Then we can file issues and get to the others.
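One possible way to check for a performance improvement before touching EvalML itself is a cross-validated comparison in plain scikit-learn. This is only a sketch: the synthetic dataset, the alpha grid, and the accuracy metric are all assumptions chosen for illustration, and a real evaluation would use EvalML's own benchmark datasets and objectives.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (assumption) so an unpruned tree is likely to overfit.
X, y = make_classification(
    n_samples=500,
    n_features=20,
    n_informative=5,
    flip_y=0.1,
    random_state=0,
)

# Fixed folds so every ccp_alpha value is scored on identical splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# 0.0 is the unpruned baseline; the other values are arbitrary candidates.
param_grid = {"ccp_alpha": [0.0, 0.001, 0.005, 0.01, 0.02]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid=param_grid,
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)

# Map each alpha to its mean cross-validated score.
scores = {
    params["ccp_alpha"]: score
    for params, score in zip(
        search.cv_results_["params"], search.cv_results_["mean_test_score"]
    )
}
baseline_score = scores[0.0]

print("baseline (no pruning):", round(baseline_score, 4))
print("best alpha:", search.best_params_["ccp_alpha"])
print("best score:", round(search.best_score_, 4))
```

Because 0.0 is part of the grid, the best score can never be lower than the baseline; whether the gap is large enough to justify the extra search dimension is the question the comparison would need to answer on real data.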