elephaint / pgbm

Probabilistic Gradient Boosting Machines
Apache License 2.0
Get error metrics for each trained tree #11

Closed ivan-marroquin closed 1 year ago

ivan-marroquin commented 2 years ago

Is your feature request related to a problem? Please describe.

I think it would benefit the PGBM package to expose the error metric measured after each trained tree. One could then plot these values to assess the performance of the tree ensemble and diagnose whether it is overfitting, underfitting, or doing well on the train/validation data sets.

Describe the solution you'd like

To give an example, I use the xgboost regressor model (https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn). It lets you follow the performance of the trained trees with the following statements:

evaluation_set= [(gral_train_inputs, gral_train_targets), (test_inputs, test_targets)]

best_trained_model.fit(X= gral_train_inputs, y= gral_train_targets, eval_metric= ['rmse', 'mae'], eval_set= evaluation_set, verbose= False)

performance= best_trained_model.evals_result()

Then, "performance" is a dictionary that contains the measured error metrics on both datasets for all trained trees.
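As a minimal sketch of how such a dictionary could be used, the snippet below works on a hand-written dictionary shaped like the one xgboost returns (`{dataset name: {metric name: [one value per tree]}}`). The numbers are made up for illustration, and the helpers `best_iteration` and `diagnose` are hypothetical names, not part of xgboost or PGBM.

```python
# Illustrative values only: shaped like the result of evals_result(), where
# "validation_0" is the first entry of eval_set (train) and "validation_1"
# the second (test/validation). Each list holds one value per trained tree.
performance = {
    "validation_0": {"rmse": [1.00, 0.70, 0.50, 0.40, 0.33, 0.28]},
    "validation_1": {"rmse": [1.05, 0.80, 0.66, 0.62, 0.64, 0.69]},
}


def best_iteration(values):
    """Return the 0-based index of the tree with the lowest error."""
    return min(range(len(values)), key=values.__getitem__)


def diagnose(valid_errors):
    """Crude heuristic (an assumption, not a standard rule):
    if the validation error bottomed out before the last tree and
    has risen since, flag the ensemble as overfitting."""
    best = best_iteration(valid_errors)
    if best < len(valid_errors) - 1 and valid_errors[-1] > valid_errors[best]:
        return "overfitting"
    return "still improving or converged"


valid_rmse = performance["validation_1"]["rmse"]
print("best tree index:", best_iteration(valid_rmse))
print("diagnosis:", diagnose(valid_rmse))
```

The same per-tree lists can be passed directly to a plotting library to draw the train and validation curves side by side.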

Kind regards,

Ivan

elephaint commented 2 years ago

Hi Ivan,

Thanks for your request! Apologies for the delay in answering your question. This is a relatively easy fix and I will try to incorporate this in the next update scheduled for Feb. 2022. Most likely the error metrics will be added as an attribute to the fitted learner.

Best,

Olivier
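To illustrate what "error metrics as an attribute of the fitted learner" could look like, here is a toy gradient-boosting loop in plain Python. It is a sketch under stated assumptions, not PGBM's implementation: the class `ToyBooster`, its `eval_set` argument, and the `metrics_` attribute are hypothetical names, and the learner uses one-split stumps on 1-D inputs with distinct values.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals (1-D inputs)."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # assumes at least two distinct x values
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    return best[1:]


class ToyBooster:
    """Hypothetical toy learner; not PGBM's API."""

    def __init__(self, n_trees=20, learning_rate=0.3):
        self.n_trees = n_trees
        self.learning_rate = learning_rate

    def fit(self, x, y, eval_set=None):
        base = sum(y) / len(y)  # initial prediction: mean of the targets
        self.metrics_ = {"train": [], "valid": []}  # one entry per tree
        pred = [base] * len(y)
        if eval_set is not None:
            x_val, y_val = eval_set
            pred_val = [base] * len(y_val)
        for _ in range(self.n_trees):
            # Squared-error boosting: fit the next stump to the residuals.
            residuals = [t - p for t, p in zip(y, pred)]
            thr, lm, rm = fit_stump(x, residuals)
            step = lambda xi: self.learning_rate * (lm if xi <= thr else rm)
            pred = [p + step(xi) for xi, p in zip(x, pred)]
            self.metrics_["train"].append(rmse(y, pred))
            if eval_set is not None:
                pred_val = [p + step(xi) for xi, p in zip(x_val, pred_val)]
                self.metrics_["valid"].append(rmse(y_val, pred_val))
        return self


# Toy data: y = x^2 on a small grid, validation points in between.
x_train = [float(i) for i in range(10)]
y_train = [xi ** 2 for xi in x_train]
x_valid = [i + 0.5 for i in range(9)]
y_valid = [xi ** 2 for xi in x_valid]

model = ToyBooster(n_trees=20).fit(x_train, y_train, eval_set=(x_valid, y_valid))
print(model.metrics_["train"][:3], model.metrics_["valid"][:3])
```

After fitting, `model.metrics_` holds one error value per tree for each dataset, which is the shape needed for the diagnostic plots described in the request.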

ivan-marroquin commented 2 years ago

Hi Olivier,

Many thanks for taking my enhancement request into consideration!

Ivan

ivan-marroquin commented 2 years ago

Hi Olivier,

Hoping all is well with you! I noticed that you released a new version of PGBM. Congratulations, and many thanks for actively maintaining this package. I was wondering whether you have had a chance to implement the enhancement request that I submitted.

Kind regards, Ivan

elephaint commented 2 years ago

Hi @ivan-marroquin,

Sorry for the late reply. I have been ill on and off for much of the past 6 months. I released a new version to fix a number of bugs, and I am working on your request, together with the ability to calculate Shapley values. The latter is turning out to be harder than expected, so I will skip that.

Kind regards,

Olivier

elephaint commented 1 year ago

Hi,

This is now included, following this pull request. Will be released later today.

ivan-marroquin commented 1 year ago

Hi @elephaint

Thanks so much!