Closed: shubh0508 closed this issue 2 years ago
Closing this request. I found a way to resolve it with the preload feature in Gunicorn.
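For anyone landing here later, a minimal sketch of what that can look like, assuming a FastAPI app exposed as `app.main:app` (the module path is hypothetical). Gunicorn's config file is plain Python, so the preload setting can live there:

```python
# gunicorn.conf.py -- run with: gunicorn -c gunicorn.conf.py app.main:app
# preload_app loads the application (and any model compiled at import time)
# in the master process before forking, so the workers share those read-only
# pages via copy-on-write instead of each holding its own copy.
workers = 3
worker_class = "uvicorn.workers.UvicornWorker"  # typical worker class for FastAPI
preload_app = True
```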
Just a brief note here: the compile() function has a cache= parameter that takes a filepath for caching a previously compiled model. More info in the docs.
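A minimal sketch of how that can be used, assuming the usual lleaves.Model entry point (the model and cache paths are placeholders): the first run compiles the model and writes the cache file, and later runs load the compiled binary from that file instead of recompiling.

```python
import lleaves

# Paths are placeholders for illustration.
llvm_model = lleaves.Model(model_file="model.txt")

# First run: compiles the model and stores the result at the given path.
# Later runs (e.g. after an application restart): loads the previously
# compiled model from that file instead of recompiling.
llvm_model.compile(cache="lleaves_model.bin")
```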
I am using FastAPI with Gunicorn for my Python application. Inference with the standard Python lightgbm package takes around 1.5-2 GB for 3 Gunicorn processes, but lleaves is taking around 9 GB for a single Gunicorn worker. If I want to use 3 Gunicorn workers, will I have to increase the instance size to 32 GB?
Can I reduce the memory use somehow, or can I share a compiled model among multiple Gunicorn workers or Python processes?
Also, from what I understand, it seems that we are required to compile the model every time the application restarts?
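For reference, one way to combine the two points above, sketched under the assumption of a FastAPI app living in app/main.py (file names and paths are hypothetical): compile the model once at module import time with cache=, so a restart reuses the cached binary, and start Gunicorn with --preload so the workers are forked after the model is in memory and share it copy-on-write.

```python
# app/main.py (hypothetical layout)
import lleaves
import numpy as np
from fastapi import FastAPI

app = FastAPI()

# Module-level load: with gunicorn --preload this runs once in the master
# process, and the forked workers share the compiled model's memory pages.
# cache= means a restart reloads the compiled binary instead of recompiling.
MODEL = lleaves.Model(model_file="model.txt")
MODEL.compile(cache="lleaves_model.bin")

@app.post("/predict")
def predict(features: list[float]):
    # lleaves expects a 2D array of shape (n_samples, n_features).
    data = np.asarray([features], dtype=np.float64)
    return {"prediction": MODEL.predict(data).tolist()}
```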