microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

C API refitting #6430

Open eightreal opened 4 months ago

eightreal commented 4 months ago

Hello, I have seen the LGBM_BoosterRefit API in the C API. Is there a more detailed description of it? Example code would be even better. I'm trying to do continued learning on new data, but there is little documentation.

eightreal commented 4 months ago

In Python, I see example code like:

refit(data=x_test, label=y_test)

But how does it work in the C API? How can I get these inputs:

const int32_t* leaf_preds,
int32_t nrow,
int32_t ncol
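
For reference, here is a minimal sketch of how the leaf_preds buffer appears to be laid out, judging by how the Python package flattens its leaf-prediction matrix before calling LGBM_BoosterRefit: a row-major nrow x ncol array, where nrow is the number of rows in the new data and ncol is the number of trees in the model. The helper name leaf_pred_at is hypothetical and is not part of LightGBM; it only illustrates the indexing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a LightGBM function): returns the leaf index
 * that sample `row` falls into for tree `tree`, assuming `leaf_preds` is a
 * row-major buffer with `nrow` rows (samples) and `ncol` columns (trees). */
int32_t leaf_pred_at(const int32_t *leaf_preds, int32_t nrow, int32_t ncol,
                     int32_t row, int32_t tree) {
    assert(leaf_preds != NULL);
    assert(row >= 0 && row < nrow && tree >= 0 && tree < ncol);
    /* Row-major: all trees for sample 0 come first, then sample 1, ... */
    return leaf_preds[(size_t)row * (size_t)ncol + (size_t)tree];
}
```

With two samples and three trees, for example, the buffer holds six values, and the entry for sample 1, tree 0 sits at offset 3.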

eightreal commented 4 months ago

I read the Python source code. So I need to call LGBM_BoosterPredictForMat on my new dataset to get the leaf indices, then pass those leaf indices as the input to refit. Is this workflow correct? Also, should I call LGBM_BoosterResetTrainingData before calling refit?
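
One piece of glue this workflow needs is a type conversion: LGBM_BoosterPredictForMat writes its results into a double buffer even when predict_type is C_API_PREDICT_LEAF_INDEX, while LGBM_BoosterRefit takes const int32_t*. The sketch below shows one way to do that conversion safely. The function name leaf_doubles_to_int32 is hypothetical, not part of LightGBM, and the LightGBM calls themselves are left out so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a LightGBM function): converts leaf indices
 * stored as doubles into int32_t values.
 * Returns 0 on success, or -1 if any value is NaN, negative, too large
 * for int32_t, or not a whole number. */
int leaf_doubles_to_int32(const double *src, size_t len, int32_t *dst) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; ++i) {
        double v = src[i];
        /* The range check also rejects NaN, since comparisons with NaN fail. */
        if (!(v >= 0.0 && v <= (double)INT32_MAX)) {
            return -1;
        }
        int32_t as_int = (int32_t)v;
        /* Reject fractional values: a leaf index must be a whole number. */
        if ((double)as_int != v) {
            return -1;
        }
        dst[i] = as_int;
    }
    return 0;
}
```

After the conversion, the int32_t buffer can be passed to LGBM_BoosterRefit with nrow set to the number of rows predicted and ncol set to the output length divided by nrow. Note that Booster.refit() in the Python package does more than this: it builds a new Dataset from the new data, creates a new Booster on it, and calls LGBM_BoosterMerge to copy the old model in before calling LGBM_BoosterRefit.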

eightreal commented 4 months ago

Oh, I also see that you create a new booster and merge the old model into it.

eightreal commented 3 months ago

Is there any member who can help answer my question?

jameslamb commented 3 months ago

Please look through what Booster.refit() in the Python package does.

https://github.com/microsoft/LightGBM/blob/88cec4776e621ac93f9ba03aa0015035570545da/python-package/lightgbm/basic.py#L4747

eightreal commented 3 months ago

I can do a refit with this workflow, but I noticed that you create an empty model and merge the old model into it. Can I reset the training dataset directly instead?