h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0
6.81k stars 1.99k forks

Automatic Interaction Detection for GBM #10105

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

For linear models, force an additive model and compare it with the unconstrained model. For GBM, use Friedman's H-statistic.
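A minimal sketch of the pairwise H-statistic, for readers who have not met it: H² for features j and k is the share of the variance of the joint partial dependence that is not explained by the two single-feature partial dependences, with all three centered to mean zero. This is not H2O code. The class `HStatistic`, its method names, and the use of a plain `ToDoubleFunction<double[]>` as a stand-in for a trained model are assumptions made for illustration, and the brute-force partial dependence costs O(n²) predictions.

```java
import java.util.function.ToDoubleFunction;

// Illustrative only: hypothetical helper, not part of h2o-3.
final class HStatistic {

    // Partial dependence on the features in idx, evaluated at the values of row `at`:
    // the average prediction over all rows after overwriting those features.
    static double partialDependence(ToDoubleFunction<double[]> f, double[][] X,
                                    double[] at, int... idx) {
        double sum = 0;
        for (double[] row : X) {
            double[] x = row.clone();
            for (int j : idx) x[j] = at[j];
            sum += f.applyAsDouble(x);
        }
        return sum / X.length;
    }

    // Subtract the mean in place so the partial dependence has mean zero.
    static void center(double[] v) {
        double mean = 0;
        for (double d : v) mean += d;
        mean /= v.length;
        for (int i = 0; i < v.length; i++) v[i] -= mean;
    }

    // Friedman's H^2 for the feature pair (j, k), evaluated on the rows of X.
    static double pairwiseH2(ToDoubleFunction<double[]> f, double[][] X, int j, int k) {
        int n = X.length;
        double[] pdJ = new double[n], pdK = new double[n], pdJK = new double[n];
        for (int i = 0; i < n; i++) {
            pdJ[i] = partialDependence(f, X, X[i], j);
            pdK[i] = partialDependence(f, X, X[i], k);
            pdJK[i] = partialDependence(f, X, X[i], j, k);
        }
        center(pdJ);
        center(pdK);
        center(pdJK);
        double num = 0, den = 0;
        for (int i = 0; i < n; i++) {
            double d = pdJK[i] - pdJ[i] - pdK[i];
            num += d * d;
            den += pdJK[i] * pdJK[i];
        }
        return den == 0 ? 0 : num / den; // a constant joint effect means no interaction
    }

    public static void main(String[] args) {
        double[][] X = {{-1, -1}, {-1, 1}, {1, -1}, {1, 1}};
        System.out.println(pairwiseH2(x -> x[0] + 2 * x[1], X, 0, 1)); // additive model
        System.out.println(pairwiseH2(x -> x[0] * x[1], X, 0, 1));     // pure interaction
    }
}
```

On the small grid in `main`, the additive function has an H² of 0 and the pure product has an H² of 1, which are the two extremes of the statistic.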

More detail to be provided on an as-needed basis after further conversation with CapitalOne.

exalate-issue-sync[bot] commented 1 year ago

Venkatesh Yadav commented: Andy confirmed that this is something they are interested in, with Medium priority.

exalate-issue-sync[bot] commented 1 year ago

Arno Candel commented: Just dumping this here (WIP):
{code}
--- a/h2o-algos/src/main/java/hex/tree/SharedTree.java
+++ b/h2o-algos/src/main/java/hex/tree/SharedTree.java
@@ -475,6 +475,11 @@ public abstract class SharedTree<M extends SharedTreeModel<M,P,O>, P extends Sha
 DTree.Split s = dn._split; // Accumulate squared error improvements per variable
 float improvement = (float)(s.pre_split_se()-s.se());
 assert(improvement>=0);

exalate-issue-sync[bot] commented 1 year ago

Arno Candel commented: Similar to xgbfi, which does it via post-processing. We would want to do this during runtime, to show higher-order importances in real-time. p=2 might be enough.
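One way to picture accumulating second-order importances while trees are built is sketched below. It rests on assumptions: `Node` is a hypothetical stand-in for a split node (not H2O's `DTree` classes), and the crediting rule, which adds a child's squared-error improvement to the (parent feature, child feature) pair, is a simplification chosen for illustration. It is neither xgbfi's exact metric nor the WIP patch above.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hypothetical classes, not part of h2o-3 or xgbfi.
final class PairwiseGain {

    // Minimal tree node; a leaf is marked by feature == -1.
    static final class Node {
        final int feature;   // index of the split column
        final double gain;   // squared-error improvement of this split
        final Node left, right;

        Node(int feature, double gain, Node left, Node right) {
            this.feature = feature;
            this.gain = gain;
            this.left = left;
            this.right = right;
        }

        static Node leaf() { return new Node(-1, 0, null, null); }

        boolean isLeaf() { return feature < 0; }
    }

    // Walk the tree once; for each split whose parent split used a different feature,
    // credit the child's gain to the unordered feature pair (key "a|b" with a < b).
    static void accumulate(Node node, Node parent, Map<String, Double> out) {
        if (node == null || node.isLeaf()) return;
        if (parent != null && parent.feature != node.feature) {
            int a = Math.min(parent.feature, node.feature);
            int b = Math.max(parent.feature, node.feature);
            out.merge(a + "|" + b, node.gain, Double::sum);
        }
        accumulate(node.left, node, out);
        accumulate(node.right, node, out);
    }

    static Map<String, Double> pairwiseGains(Node root) {
        Map<String, Double> out = new TreeMap<>();
        accumulate(root, null, out);
        return out;
    }
}
```

Because the map is updated with a single merge per split, the same bookkeeping could run as each tree is finished rather than as a post-processing pass, and summing the maps over trees would give a running pairwise importance.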

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-3183
Assignee: Arno Candel
Reporter: Venkatesh Yadav
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A