h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

GAM predict: Return each component of the linear predictor #7458

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

In the R mgcv package the predict.gam method with type="terms" returns each component of the linear predictor separately.

If that were also possible in (Python) H2O GAM, it would be very useful. Prediction with a model would be simpler, because each term could be implemented separately and then all terms added to the intercept. To get the final response prediction, apply the inverse of the link function to that sum.

There is a small mgcv example.

[^predict.gam_terms.R]

The matrix _b1pred has three columns, corresponding to each term in the formula. That is the split I am looking for.

I would like to get such an H2OFrame out of H2O GAM too. Or is that already possible, and have I overlooked something that is already there?
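To make the request concrete, here is a minimal Python sketch of the arithmetic described above: the per-term components are added to the intercept, and the inverse link is applied to the sum. It does not use any H2O API (the feature does not exist yet); the function name `predict_from_terms` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def predict_from_terms(intercept, term_values, inverse_link):
    """Combine per-term components of the linear predictor into a prediction.

    intercept    -- model intercept on the link scale
    term_values  -- one value per model term for a single row (link scale)
    inverse_link -- function mapping the linear predictor to the response scale
    """
    eta = intercept + sum(term_values)  # linear predictor
    return inverse_link(eta)

# Hypothetical values for one row of a model with three terms.
intercept = 0.5
terms = [0.2, -0.1, 0.3]

# Identity link: the prediction is the linear predictor itself.
mu_identity = predict_from_terms(intercept, terms, lambda eta: eta)

# Log link: the inverse link is exp.
mu_log = predict_from_terms(intercept, terms, math.exp)
```

With a per-term output frame, each row of that frame would play the role of `terms` here.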

exalate-issue-sync[bot] commented 1 year ago

Michal Kurka commented: [~accountid:557058:1f01b471-f37b-40af-bae9-a18b38e24549] please take a look

exalate-issue-sync[bot] commented 1 year ago

Wendy commented: Hi Geir Inge Sandnes:

We do not have this feature at the moment. However, it should not be difficult to add.

Wendy

exalate-issue-sync[bot] commented 1 year ago

Michal Kurka commented: I think this would be a good roadmap item for 3.36, added to epic https://h2oai.atlassian.net/browse/PUBDEV-8200

exalate-issue-sync[bot] commented 1 year ago

Geir Inge Sandnes commented: [~accountid:557058:04659f86-fbfe-4d01-90c9-146c34df6ee6] I hope you can soon add the predict_contributions method to H2O GAM and GLM. I work for the largest insurance company in Norway. We use R mgcv::gam for price modelling, where we can extract the contribution of each term to the final prediction. The price model can then be implemented in our customer sales system as a product of factors, one for each modelling term.

We would like to replace mgcv::gam with H2O GAM (or GLM), which supports regularized regression. We cannot do this until you provide the predict_contributions method. We cannot implement a big black-box model that gives only the total prediction and not the contributions from the individual terms. Such a black-box GAM is not explainable enough and is also technically too heavy to use for predictions. I hope you can help us.
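The "product of factors" implementation mentioned above can be sketched as follows, assuming a log link (an assumption; the link function is not stated in this comment). Under a log link, exponentiating each additive term gives a multiplicative factor, and the prediction is the base level times the product of those factors. The function name `term_factors` and the numbers are hypothetical; no H2O API is used.

```python
import math

def term_factors(intercept, term_values):
    """Turn additive link-scale terms into multiplicative factors (log link)."""
    base = math.exp(intercept)                    # base level from the intercept
    factors = [math.exp(t) for t in term_values]  # one factor per model term
    return base, factors

# Hypothetical per-term contributions for one customer (link scale).
intercept = 5.0
terms = [0.10, -0.25, 0.05]

base, factors = term_factors(intercept, terms)

# Price as a product of factors, as it could be computed in a sales system.
price = base * math.prod(factors)

# The same price computed directly from the linear predictor.
price_direct = math.exp(intercept + sum(terms))
```

Both computations agree up to floating-point rounding, which is why per-term contributions are sufficient to reproduce the model outside the modelling platform.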

h2o-ops commented 1 year ago

JIRA Issue Details

Jira Issue: PUBDEV-8193
Assignee: Wendy
Reporter: Geir Inge Sandnes
State: Open
Fix Version: N/A
Attachments: Available (Count: 1)
Development PRs: N/A

h2o-ops commented 1 year ago

Attachments From Jira

Attachment Name: predict.gam_terms.R
Attached By: Geir Inge Sandnes
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8193/predict.gam_terms.R