How does LightGBM update score?

sunkun1997 commented 2 years ago

In the first iteration, init_score is calculated as follows:

double pavg = suml / sumw;
pavg = std::min(pavg, 1.0 - kEpsilon);
pavg = std::max<double>(pavg, kEpsilon);
double initscore = std::log(pavg / (1.0f - pavg)) / sigmoid_;
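For readers who want to run the calculation above outside of LightGBM, here is a minimal standalone sketch. `InitScoreSketch` is a hypothetical helper, not a LightGBM function. It assumes labels are 0 or 1, the input vectors are non-empty and of equal length, and `epsilon` stands in for `kEpsilon` (the default used here is an assumption, not taken from this thread).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LightGBM's API): computes the initial raw
// score for binary classification as the log-odds of the weighted mean label,
// divided by the sigmoid scaling factor.
// Assumes labels are 0/1 and both vectors are non-empty with equal length.
double InitScoreSketch(const std::vector<double>& labels,
                       const std::vector<double>& weights,
                       double sigmoid = 1.0,
                       double epsilon = 1e-15) {  // stand-in for kEpsilon
  double suml = 0.0;  // weighted sum of labels
  double sumw = 0.0;  // sum of weights
  for (std::size_t i = 0; i < labels.size(); ++i) {
    suml += labels[i] * weights[i];
    sumw += weights[i];
  }
  double pavg = suml / sumw;
  // Clip away from 0 and 1 so the log-odds stay finite.
  pavg = std::min(pavg, 1.0 - epsilon);
  pavg = std::max(pavg, epsilon);
  return std::log(pavg / (1.0 - pavg)) / sigmoid;
}
```

With one positive label out of four equally weighted rows, the weighted mean is 0.25, so the sketch returns log(0.25 / 0.75).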

I just want to know how LightGBM updates the score in the next iteration.

As far as I know, the score of data split into the same leaf node is exactly the same.

jameslamb commented 2 years ago

Thanks for using LightGBM.

LightGBM training produces a collection of trees, and the score for an observation is obtained by summing the initial score and the output produced for that observation by each of the trees.
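As a rough illustration of that summation (a sketch, not LightGBM's actual code), each tree can be modelled as a function from a row to a leaf output. `Row`, `TreeSketch`, and `RawScore` are hypothetical names introduced only for this example.

```cpp
#include <functional>
#include <vector>

// Hypothetical types for illustration only (not LightGBM's API).
using Row = std::vector<double>;
// A "tree" here is just any function mapping a row to a leaf output.
using TreeSketch = std::function<double(const Row&)>;

// Raw score of one observation: the initial score plus the output that
// each tree produces for that observation.
double RawScore(double init_score,
                const std::vector<TreeSketch>& trees,
                const Row& row) {
  double score = init_score;
  for (const auto& tree : trees) {
    score += tree(row);
  }
  return score;
}
```

Under this view, two rows that land in the same leaf of one tree receive the same increment from that tree, but their accumulated scores can still differ if they land in different leaves of other trees.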

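In each boosting round, LightGBM trains a new tree on the gradients of the training loss with respect to the current scores, so that the tree corrects the errors the model is still making (i.e., it reduces the training loss). The tree's output for each observation, scaled by the learning rate, is then added to that observation's score.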
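A single boosting round can be sketched roughly as follows for binary log loss. This is a simplified illustration under stated assumptions, not the code in `GBDT::TrainOneIter()`: labels are 0 or 1, the sigmoid factor is 1, the leaf each row falls into is passed in as `leaf_of_row` (finding the splits is omitted), and the leaf output uses only the basic Newton step with L2 regularization. `BoostOneRoundSketch` and its parameters are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch of one boosting round for binary log loss.
// labels[i] is 0 or 1; leaf_of_row[i] is the leaf index of row i in the
// newly grown tree (split finding is omitted); scores is updated in place.
void BoostOneRoundSketch(const std::vector<int>& labels,
                         const std::vector<int>& leaf_of_row,
                         int num_leaves,
                         double learning_rate,
                         double lambda_l2,
                         std::vector<double>* scores) {
  std::vector<double> sum_grad(num_leaves, 0.0);
  std::vector<double> sum_hess(num_leaves, 0.0);

  // Accumulate per-leaf gradient and Hessian sums at the current scores.
  for (std::size_t i = 0; i < scores->size(); ++i) {
    const double p = 1.0 / (1.0 + std::exp(-(*scores)[i]));
    sum_grad[leaf_of_row[i]] += p - labels[i];    // d(loss)/d(score)
    sum_hess[leaf_of_row[i]] += p * (1.0 - p);    // d2(loss)/d(score)^2
  }

  // Every row in a leaf gets the same increment: the shrunken leaf output.
  for (std::size_t i = 0; i < scores->size(); ++i) {
    const int leaf = leaf_of_row[i];
    const double leaf_output = -sum_grad[leaf] / (sum_hess[leaf] + lambda_l2);
    (*scores)[i] += learning_rate * leaf_output;
  }
}
```

For example, starting from scores of 0 with labels {1, 1, 0, 0}, leaves {0, 0, 1, 1}, a learning rate of 0.1, and no L2 penalty, the two positive rows move to 0.2 and the two negative rows move to -0.2.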

If you're new to using gradient boosting with trees, please see this excellent description of the process from our friends at XGBoost: https://xgboost.readthedocs.io/en/stable/tutorials/model.html#additive-training.

If you're looking for LightGBM specifics and are comfortable reading C++ code, see the implementation of GBDT::TrainOneIter(): https://github.com/microsoft/LightGBM/blob/865c126a1e3ccdd77ec205b9dde46e5f3c5b6b21/src/boosting/gbdt.cpp#L432

sunkun1997 commented 2 years ago

Thank you for providing such useful information. I will study how it works.

jameslamb commented 2 years ago

No problem, happy to help!

github-actions[bot] commented 1 year ago

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

How does LightGBM update score? #5436