Chainlit / chainlit

Build Conversational AI in minutes ⚡️
https://docs.chainlit.io
Apache License 2.0

Optimize server performance by utilizing multiple workers #579

Open fvaleye opened 7 months ago

fvaleye commented 7 months ago

Currently, the Chainlit server is initialized with a single worker, as seen in `chainlit/cli/__init__.py`. This setup does not take full advantage of the available CPU resources, particularly in a production environment.

To improve the server's performance and efficiency, it is proposed to configure the server to launch with multiple workers. This change aims to better utilize the underlying hardware's CPU capabilities, potentially enhancing the application's ability to handle concurrent requests and workloads.

Implementation Reference

For guidance on implementing this enhancement, the Uvicorn deployment documentation provides valuable information on how to run Uvicorn with Gunicorn, using multiple worker processes.
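A programmatic equivalent of the Gunicorn setup is Uvicorn's own `workers` parameter. A minimal sketch, assuming the app is importable as `app:app` (a placeholder import string) and using Gunicorn's commonly cited worker-count heuristic:

```python
import os


def default_worker_count() -> int:
    """Gunicorn's commonly cited heuristic: (2 x CPU cores) + 1."""
    return 2 * (os.cpu_count() or 1) + 1


# Guarded behind an env flag so the module can be imported without
# immediately starting a server.
if os.environ.get("RUN_SERVER"):
    import uvicorn

    # With multiple workers, uvicorn.run needs an import string
    # ("app:app" is a placeholder), not an app object, so that each
    # worker process can re-import the application itself.
    uvicorn.run(
        "app:app",
        host="0.0.0.0",
        port=8000,
        workers=default_worker_count(),
    )
```

On a 4-core machine the heuristic yields 9 workers; tune it to your workload rather than treating it as a hard rule.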

Expected Outcome

By addressing this issue, we anticipate a more scalable and responsive server that can maintain performance under higher loads. This optimization is particularly crucial for production deployments where maximizing resource utilization is essential.

deepzliu commented 4 months ago

uvicorn.Config has a 'workers' parameter; how about exposing it in Chainlit?
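A sketch of how such a pass-through could resolve the value before handing it to `uvicorn.Config`/`uvicorn.run`. The `CHAINLIT_WORKERS` env var name and the precedence order are assumptions for illustration, not part of Chainlit today:

```python
import os
from typing import Optional


def resolve_workers(cli_value: Optional[int], default: int = 1) -> int:
    """Hypothetical precedence: explicit CLI flag > env var > default.

    The resolved value would then be forwarded to uvicorn, e.g.
    uvicorn.Config(app, workers=resolve_workers(...)).
    """
    if cli_value is not None:
        return cli_value
    env = os.environ.get("CHAINLIT_WORKERS")  # hypothetical variable name
    if env is not None:
        return int(env)
    return default
```

Defaulting to 1 keeps current behavior unchanged for users who never set the option.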

Sm1Ling commented 1 month ago

Yep, I support this idea! Currently, Chainlit freezes while loading/operating a big LLM, and that causes further bugs with message sending.
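Until multiple workers are supported, one workaround for the freezing is to keep blocking calls (like a slow model load) off the event loop with the standard library's `asyncio.to_thread`, so message handling stays responsive. A minimal sketch; `load_model` is a stand-in for your real loading code:

```python
import asyncio
import time


def load_model(path: str) -> str:
    """Stand-in for an expensive, blocking LLM load."""
    time.sleep(0.1)  # simulate slow disk / GPU initialization
    return f"model loaded from {path}"


async def main() -> str:
    # Run the blocking load in a worker thread so the event loop
    # (and with it the server's websocket message handling) is not frozen.
    return await asyncio.to_thread(load_model, "/models/llm.bin")


print(asyncio.run(main()))
```

Note that more worker processes alone would not fix this: each worker would still block while loading its own copy of the model.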