siboehm / lleaves

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
https://lleaves.readthedocs.io/en/latest/
MIT License

Compiled model sharing among processes #20

Closed shubh0508 closed 2 years ago

shubh0508 commented 2 years ago

I am using FastAPI with Gunicorn for my Python application. Inference with the standard Python lightgbm module takes around 1.5GB-2GB of memory across 3 Gunicorn processes, but lleaves is taking around 9GB for a single Gunicorn worker. If I want to use 3 Gunicorn workers, will I have to increase the instance size to 32GB?

Can I reduce the memory use somehow, or can I share a compiled model among multiple Gunicorn workers or Python processes?

Also, from what I understand, it seems that the model has to be recompiled every time the application restarts?
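
For reference, a minimal sketch of how recompilation on restart can be avoided, assuming the installed lleaves version supports the `cache` argument to `compile()` (file paths here are placeholders):

```python
import lleaves

# Hypothetical paths; adjust to your deployment layout.
MODEL_TXT = "model.txt"          # LightGBM text model dump
CACHE_FILE = "model_compiled.o"  # persisted compiled artifact

llvm_model = lleaves.Model(model_file=MODEL_TXT)

# If compile() is given a cache path, the compiled binary is written there
# on the first run and reloaded on later runs, so an application restart
# can skip the (slow) LLVM compilation step.
llvm_model.compile(cache=CACHE_FILE)

# preds = llvm_model.predict(X)  # X: 2D numpy array / pandas DataFrame
```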

shubh0508 commented 2 years ago

Closing this request. Found a way to resolve it with Gunicorn's preload feature.
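
A minimal sketch of that setup, assuming the lleaves model is loaded and compiled at module import time in the FastAPI app (module path and worker class are placeholders):

```python
# gunicorn.conf.py -- hypothetical config for a FastAPI app in app/main.py
#
# preload_app makes Gunicorn import the application (and therefore load and
# compile the lleaves model) once in the master process before forking.
# The workers then share the model's memory pages copy-on-write instead of
# each holding its own multi-GB copy.
preload_app = True
workers = 3
worker_class = "uvicorn.workers.UvicornWorker"  # typical for FastAPI
bind = "0.0.0.0:8000"
```

Started with something like `gunicorn -c gunicorn.conf.py app.main:app`.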

siboehm commented 2 years ago

Just a brief note here: