DmitrySorda opened this issue 7 months ago
Hi, I will need to look at this a little more. Do you know, for this dataset, what the minimum and maximum memory usage is for LightGBM or XGBoost? For 20 XGBoost trees, is it also 2.9 GB? And for those other libraries, does memory increase and stay high, or does it go down after training?
My first hunch is that it has to do with how parallelism is happening, but that wouldn't explain why memory in those libraries keeps increasing with the number of iterations.
@DmitrySorda While I would still be curious to see the stats on this data compared to XGBoost/LightGBM, the more I think about it, the more what you are seeing with forust makes sense. The data is created once, and then references to that data are passed around during training. The trees themselves are quite small in memory, so it's not surprising that RAM doesn't change much between a large model and a small one: most of the RAM usage is just the initial data passed to the model.
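In rough terms, the picture is something like the sketch below (a minimal illustration of the idea, not forust's actual internals):

```rust
// Minimal sketch: the training data is allocated once, every tree only
// borrows it, and each tree struct is tiny compared to the data, so total
// memory barely changes whether you train 20 trees or 10,000.
struct Tree {
    // split thresholds / leaf values -- small compared to the data
    nodes: Vec<f64>,
}

fn train_tree(_data: &[Vec<f64>], _targets: &[f64]) -> Tree {
    // borrows the data; nothing is copied per boosting round
    Tree { nodes: vec![0.0; 64] }
}

fn main() {
    let data: Vec<Vec<f64>> = vec![vec![0.0; 100]; 10_000]; // allocated once
    let targets = vec![0.0; 10_000];
    let trees: Vec<Tree> = (0..10_000) // 20 vs 10_000: data memory is the same
        .map(|_| train_tree(&data, &targets))
        .collect();
    println!("trained {} trees", trees.len());
}
```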
Hi @DmitrySorda, any more thoughts or additional info you can share? Otherwise I'll likely close this. Thanks.
LightGBM might be keeping `grad` and `hess` stats of every boosting round. You can check whether the total increase is around `32 * 2 * n_rows * n_rounds`.
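As a back-of-envelope check (assuming the 32 means bits per float and that both `grad` and `hess` would be kept per row for every round; these are assumptions, not confirmed LightGBM internals):

```rust
// Rough estimate of the hypothesis above: 32-bit grad + 32-bit hess per row,
// retained for every boosting round, converted from bits to bytes.
fn grad_hess_bytes(n_rows: u64, n_rounds: u64) -> u64 {
    32 * 2 * n_rows * n_rounds / 8
}

fn main() {
    // e.g. 1,000,000 rows over 1,000 rounds would be ~8 GB if every round were retained
    println!("{} bytes", grad_hess_bytes(1_000_000, 1_000));
}
```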
Hi there!
I'm using the forust library and have noticed a curious behavior regarding memory consumption. Unlike popular libraries like XGBoost and LightGBM, where memory usage increases significantly with a higher number of trees (controlled by num_iterations or n_estimators), the memory footprint of GradientBooster seems to remain constant.
For example, setting `.set_iterations(20)` or `.set_iterations(10000)` results in the same memory usage (around 2.9 GB) on my dataset. Here's how I'm setting up the model:
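(simplified to the part that matters here; the data loading and `fit` call are omitted, and the crate path and constructor below are shown only for illustration)

```rust
// Representative builder-style setup -- the import and constructor are
// assumed for illustration; the call in question is `set_iterations`.
use forust_ml::GradientBooster;

fn build_model(iterations: usize) -> GradientBooster {
    GradientBooster::default().set_iterations(iterations)
}

fn main() {
    let _small = build_model(20);
    let _large = build_model(10_000);
    // peak memory is ~2.9 GB in both cases on my dataset
}
```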
Could you shed some light on why this is happening? Is there a specific mechanism within the library that manages memory differently compared to XGBoost and LightGBM?
I'm interested in understanding the underlying reasons for this behavior and any potential implications it might have on performance or scalability.