Closed: ivan-marroquin closed this issue 1 year ago
Hi Ivan,
Thanks for your request! Apologies for the delay in answering your question. This is a relatively easy fix and I will try to incorporate this in the next update scheduled for Feb. 2022. Most likely the error metrics will be added as an attribute to the fitted learner.
Best,
Olivier
Hi Olivier,
Many thanks for taking into consideration my enhancement request!
Ivan
Hi Olivier,
Hoping all is well with you! I noticed that you released a new version of PGBM. Congratulations, and many thanks for actively maintaining this package. I was wondering if you had the chance to implement the enhancement request that I submitted.
Kind regards, Ivan
Hi @ivan-marroquin,
Sorry for the late reply. I have been ill on and off for much of the past 6 months. I released a new version to fix a number of bugs, and I am working on your request, together with the ability to calculate Shapley values. The latter is turning out to be harder than expected, so I will skip it.
Kind regards,
Olivier
Hi,
This is now included, following this pull request. Will be released later today.
Hi @elephaint
Thanks so much!
Is your feature request related to a problem? Please describe. I think it would be beneficial for the PGBM package to report the measured error metrics for each trained tree. One could then generate plots to assess the performance of the ensemble of trees and diagnose whether the ensemble is overfitting, underfitting, or doing well on the train/validation data sets.
Describe the solution you'd like. As an example, I use the xgboost regressor model (https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn). It lets you follow the performance of the trained trees with the following statements:
evaluation_set= [(gral_train_inputs, gral_train_targets), (test_inputs, test_targets)]
best_trained_model.fit(X= gral_train_inputs, y= gral_train_targets, eval_metric= ['rmse', 'mae'], eval_set= evaluation_set, verbose= False)
performance= best_trained_model.evals_result()
Then, "performance" is a dictionary that contains the measured error metrics on both datasets for all trained trees.
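To illustrate how such a dictionary could be used for the diagnosis described above, here is a minimal sketch in plain Python. It assumes the nested layout that xgboost's `evals_result()` returns (`validation_0`, `validation_1`, ... mapping to one list per metric). The helper name `summarize_eval_history` and the numbers are hypothetical and made up for illustration; nothing here is part of PGBM or xgboost.

```python
def summarize_eval_history(history, metric,
                           train_key="validation_0",
                           valid_key="validation_1"):
    """Summarize per-tree error curves for one metric (lower is better).

    `history` is assumed to look like the output of evals_result():
    {"validation_0": {"rmse": [...]}, "validation_1": {"rmse": [...]}}
    """
    train_curve = history[train_key][metric]
    valid_curve = history[valid_key][metric]

    # Index of the boosting round with the lowest validation error.
    best_round = min(range(len(valid_curve)), key=valid_curve.__getitem__)

    return {
        "best_round": best_round,
        "best_valid": valid_curve[best_round],
        "train_at_best": train_curve[best_round],
        # A large gap between validation and training error hints at overfitting.
        "gap_at_best": valid_curve[best_round] - train_curve[best_round],
        # Validation error ending above its minimum is another overfitting signal.
        "valid_rises_after_best": valid_curve[-1] > valid_curve[best_round],
    }


# Hypothetical error curves, one value per trained tree (not real results).
performance = {
    "validation_0": {"rmse": [1.00, 0.70, 0.50, 0.35, 0.25]},
    "validation_1": {"rmse": [1.10, 0.85, 0.75, 0.80, 0.90]},
}

summary = summarize_eval_history(performance, "rmse")
print(summary)
```

The same two lists can be passed directly to a plotting library to draw the train/validation curves per tree.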
Kind regards,
Ivan