lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Fix semaphore release issue in `api_generate_stream` function of vllm_worker #3390

Open coolbeevip opened 3 weeks ago

coolbeevip commented 3 weeks ago

Why are these changes needed?

This PR addresses a potential semaphore leak in the `api_generate_stream` function. If an exception occurs during the execution of `worker.generate_stream(params)`, `create_background_tasks(request_id)` might never be called. Because the semaphore would then not be released, later requests could be left waiting on a permit that is never returned.
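To illustrate the general idea, here is a minimal, self-contained sketch of releasing a semaphore in a `finally` block around a streaming generator. It is not the diff from this PR and does not use FastChat or vLLM: `fake_generate_stream`, `guarded_stream`, `run_once`, and `demo` are hypothetical names, and the semaphore limit of 1 is an assumption chosen to make the effect easy to observe.

```python
import asyncio


async def fake_generate_stream(params):
    """Hypothetical stand-in for worker.generate_stream(params)."""
    yield "chunk-1"
    if params.get("fail"):
        raise RuntimeError("simulated generation failure")
    yield "chunk-2"


async def guarded_stream(semaphore, params):
    """Hold one permit while streaming and always hand it back."""
    await semaphore.acquire()
    try:
        async for chunk in fake_generate_stream(params):
            yield chunk
    finally:
        # Runs on normal completion, on an exception, and when the
        # generator is closed early.
        semaphore.release()


async def run_once(params):
    """Consume the stream once; return (chunks, is the semaphore still locked)."""
    semaphore = asyncio.Semaphore(1)  # assumed limit, for illustration only
    chunks = []
    try:
        async for chunk in guarded_stream(semaphore, params):
            chunks.append(chunk)
    except RuntimeError:
        pass  # the simulated failure still reaches the caller
    return chunks, semaphore.locked()


def demo(fail):
    return asyncio.run(run_once({"fail": fail}))
```

In this sketch, `demo(True)` and `demo(False)` both report the semaphore as unlocked afterwards, because the release does not depend on any code that runs only on the success path.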

Related issue number (if applicable)

Closes #3389

Checks

coolbeevip commented 3 weeks ago

When you have a moment, could you please take a look at this PR? @merrymercy @infwinston @BabyChouSr