h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Create Custom Loss Metric for GBM #10966

Closed exalate-issue-sync[bot] closed 1 year ago

exalate-issue-sync[bot] commented 1 year ago

ref - https://support.h2o.ai/helpdesk/tickets/90967

custom loss metric:

We don’t want H2O to implement the metrics in its code, because tomorrow we might have a different metric. Also, the implementation of the metric will change if we choose a different target function (for instance log vs. ratio, etc.). It would be best if you could expose endpoints for us to code these custom evaluation metrics ourselves as needed.

notes: this allows for more specialized business metrics that help with reporting needs

The user would need to provide the first and second derivatives of the loss function.
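The derivative requirement above can be sketched in plain numpy. This is a hedged illustration, not an H2O API: the function name and the (gradient, hessian) contract are assumptions modeled on how boosting libraries generally fit each new tree against the per-row derivatives of the loss with respect to the current prediction.

```python
import numpy as np

def squared_error_objective(y_actual, y_predicted):
    """Illustrative custom objective: per-row gradient and hessian of
    0.5 * (pred - actual)^2 with respect to the prediction, the pair a
    boosting library needs to fit the next tree."""
    grad = y_predicted - y_actual        # first derivative of the loss
    hess = np.ones_like(y_predicted)     # second derivative is constant
    return grad, hess
```

For a different target function (log, ratio, etc.) only the two derivative lines would change, which is the point of letting users plug in their own.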

exalate-issue-sync[bot] commented 1 year ago

Michal Malohlava commented: It would be nice to know more details and have an example (Python/R code) (CC: [~accountid:557058:eac185dd-5a5c-46e9-bb5a-13217ee9c218] [~accountid:557058:78faca06-cede-4cd5-8617-26bd94ec504c] )

exalate-issue-sync[bot] commented 1 year ago

Lauren DiPerna commented: Here's an example of how it works for xgboost, which does it well:

Python pseudocode:

{code}
import numpy as np
import xgboost as xgb

dtrain = xgb.DMatrix('train.txt')

# custom evaluation metric: xgboost calls feval(preds, dtrain)
# and expects a (name, value) pair back
def rmsle(y_predicted, dtrain):
    y_actual = dtrain.get_label()
    error = np.sqrt(np.mean(np.power(np.log1p(y_actual) - np.log1p(y_predicted), 2)))
    return 'rmsle', error

...

param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
num_round = 10

bst = xgb.train(param, dtrain, num_round, feval=rmsle)
{code}
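Independent of any library, the RMSLE formula in the pseudocode above can be sanity-checked on toy data with plain numpy (no xgboost or training data needed):

```python
import numpy as np

def rmsle(y_actual, y_predicted):
    # root mean squared logarithmic error, as in the pseudocode above
    return np.sqrt(np.mean(np.power(np.log1p(y_actual) - np.log1p(y_predicted), 2)))

# identical predictions give zero error
print(rmsle(np.array([1.0, 10.0, 100.0]), np.array([1.0, 10.0, 100.0])))  # 0.0
```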

Similarly, pseudocode for R. Another option could be to allow the model and the test data to be passed in, instead of the actual and predicted y:

{code}
require(xgboost)
train <- dataset

rmsle <- function(model, test_data) {
  y <- model$y
  y.pred <- predict(model, test_data)
  return(sqrt(1/length(y) * sum((log(y.pred + 1) - log(y + 1))^2)))
}

xgboost(data = train$data, label = train$label, feval = rmsle, nround = 2)
{code}

exalate-issue-sync[bot] commented 1 year ago

Javier Recasens commented: This feature would be greatly appreciated. I need to create a custom loss function that penalizes under-forecasting heavily (compared to over-forecasting).
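One standard way to penalize under-forecasting more heavily than over-forecasting is quantile (pinball) loss. This is only an illustration of the shape such a loss could take, not the requester's exact function or an H2O API; the gradient/hessian pair follows the derivative contract discussed earlier in the thread.

```python
import numpy as np

def pinball_grad_hess(y_actual, y_predicted, alpha=0.9):
    """Quantile (pinball) loss derivatives. With alpha > 0.5,
    under-forecasting (pred < actual) is penalized alpha/(1-alpha)
    times more heavily than over-forecasting."""
    residual = y_actual - y_predicted
    # d/dpred of: alpha*max(r, 0) + (1-alpha)*max(-r, 0)
    grad = np.where(residual > 0, -alpha, 1.0 - alpha)
    # pinball loss is piecewise linear, so the true second derivative
    # is 0; boosting libraries typically substitute a small constant
    hess = np.full_like(y_predicted, 1.0)
    return grad, hess
```

With `alpha = 0.9`, a prediction 5 units too low gets a gradient nine times larger in magnitude than one 5 units too high, pushing the model toward over-forecasting.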

exalate-issue-sync[bot] commented 1 year ago

Nidhi Mehta commented: #90967 (https://support.h2o.ai/helpdesk/tickets/90967) - Create Custom evaluation metric and Loss Metric

exalate-issue-sync[bot] commented 1 year ago

Nidhi Mehta commented: #92656 (https://support.h2o.ai/helpdesk/tickets/92656) - customized loss function

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-4076
Assignee: Veronika Maurerová
Reporter: Lauren DiPerna
State: Resolved
Fix Version: 3.26.0.1
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

https://github.com/h2oai/h2o-3/pull/3653
https://github.com/h2oai/h2o-3/pull/3493
https://github.com/h2oai/h2o-3/pull/3509