lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Semaphore release issue in api_generate_stream function of vllm_worker #3389

Open coolbeevip opened 3 weeks ago

coolbeevip commented 3 weeks ago

I've been examining the api_generate_stream function in the fastchat/serve/vllm_worker.py file and I've noticed a potential issue related to the semaphore release.

In the current implementation, if an exception is raised while worker.generate_stream(params) runs

https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/vllm_worker.py#L205

then create_background_tasks(request_id) might never be called. As far as I can tell, the semaphore release is scheduled through those background tasks, so the semaphore acquired for the request could stay held and reduce the worker's available concurrency.
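To illustrate the concern, here is a minimal sketch using only asyncio. It is not FastChat's code: generate_stream, handle_unsafe, handle_safe, and demo are hypothetical stand-ins, and the chunks are collected eagerly rather than streamed to keep the example short. It compares releasing the semaphore only on the success path with releasing it in a finally block.

```python
import asyncio


async def generate_stream(fail: bool):
    """Hypothetical stand-in for a streaming generator that may raise."""
    yield "chunk-1"
    if fail:
        raise RuntimeError("generation failed")
    yield "chunk-2"


async def handle_unsafe(sem: asyncio.Semaphore, fail: bool):
    """Releases only on the success path, so an exception leaks the permit."""
    await sem.acquire()
    chunks = [c async for c in generate_stream(fail)]
    sem.release()  # never reached if generate_stream raises
    return chunks


async def handle_safe(sem: asyncio.Semaphore, fail: bool):
    """Releases in a finally block, so the permit is returned on any exit."""
    await sem.acquire()
    try:
        return [c async for c in generate_stream(fail)]
    finally:
        sem.release()


async def demo():
    """Run both handlers with a failing stream and report leaked permits."""
    results = {}
    for name, handler in (("unsafe", handle_unsafe), ("safe", handle_safe)):
        sem = asyncio.Semaphore(1)  # one permit per handler under test
        try:
            await handler(sem, fail=True)
        except RuntimeError:
            pass  # the failure itself is expected; we only inspect the semaphore
        # locked() is True when no permit is available, i.e. it was leaked
        results[name] = sem.locked()
    return results


leaked = asyncio.run(demo())
print(leaked)
```

In a real streaming endpoint the release would have to happen once the response has finished or failed, not when the handler returns, so a fix in the worker would likely need to cover both the setup path and the streaming path.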