Closed Sushaokun closed 2 months ago
Thank you for reaching out, @Sushaokun! There was a bug causing CancellationToken to not work appropriately in streaming chat completions, which is fixed as of version 2.0.0-beta.13.
I downloaded the latest source code (version 2.0.0-beta.13) and tested it, but the problem still exists.
Service
vLLM
Describe the bug
When communicating with the vLLM service, there is no way to stop generation midway, as can be done in Python.
Steps to reproduce
When I execute stream.response.close() in Python code, vLLM outputs "Aborted request chat-xxxx." In C#, when I call cancelSource.Cancel(), the streaming loop in C# does stop. However, the vLLM server continues generating until all inference is complete and ends with the output "Finished request chat-xxxx".
Code snippets
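The code-snippets field was left empty; below is a minimal repro sketch, assuming the OpenAI .NET 2.0.0-beta ChatClient pointed at a local vLLM OpenAI-compatible endpoint. The endpoint URL, model name, and prompt are placeholders, not taken from the original report.

```csharp
using System;
using System.ClientModel;
using System.Threading;
using System.Threading.Tasks;
using OpenAI;
using OpenAI.Chat;

class Program
{
    static async Task Main()
    {
        // Placeholder endpoint/model for a local vLLM OpenAI-compatible server;
        // adjust to your deployment. The API key is unused by vLLM by default.
        var client = new ChatClient(
            model: "my-model",
            credential: new ApiKeyCredential("unused"),
            options: new OpenAIClientOptions { Endpoint = new Uri("http://localhost:8000/v1") });

        using var cancelSource = new CancellationTokenSource();
        cancelSource.CancelAfter(TimeSpan.FromSeconds(2)); // cancel mid-stream

        try
        {
            await foreach (StreamingChatCompletionUpdate update in
                client.CompleteChatStreamingAsync(
                    [new UserChatMessage("Write a very long story.")],
                    cancellationToken: cancelSource.Token))
            {
                foreach (var part in update.ContentUpdate)
                    Console.Write(part.Text);
            }
        }
        catch (OperationCanceledException)
        {
            // The client-side loop stops here, but the reported behavior is that
            // vLLM keeps generating until it logs "Finished request chat-xxxx".
            Console.WriteLine();
            Console.WriteLine("[stream cancelled client-side]");
        }
    }
}
```

The expected behavior would match Python's stream.response.close(), where vLLM logs "Aborted request chat-xxxx." and stops inference immediately.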
OS
Windows 11
.NET version
8.0
Library version
2.0.0-beta.13