Open KeithHenry opened 3 weeks ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.
Looks like same thing happens if your input text hits content filter and the response is 400.
@tkumpumak yeah - pretty much any response other than 2xx causes it to hang waiting on a stream that won't follow. Does my PromoteHttpStatusErrorsPipelineTransport hack work around it for you?
Yes, your hack works fine, thanks for the fix! I wired a CancellationToken into my CompleteChatStreamingAsync call and it cancels via that.
Library name and version
Azure.AI.OpenAI 2.0.0-beta.2 and 1.0.0-beta.16
Describe the bug
In 2.0.0, OpenAI.Chat.ChatClient.CompleteChatStreamingAsync hangs forever when the service returns 429 Too Many Requests. If you await it, the code never resumes.

In 1.0.0, Azure.AI.OpenAI.OpenAIClient.GetChatCompletionsStreamingAsync hangs in the same way, but I'll stick to v2.0.0 for the examples below.

This is a very common occurrence: any system using this API needs to handle rate limits.
The workaround for it is to inject a custom pipeline transport:
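(A minimal sketch of the wiring, assuming System.ClientModel's Transport option on the client options; the endpoint, key, and PromoteHttpStatusErrorsPipelineTransport name mirror the hack mentioned above, with placeholder values.)

```csharp
using System;
using System.ClientModel;
using Azure.AI.OpenAI;

// Sketch: route every request through a custom transport so non-2xx responses
// can be surfaced instead of leaving the caller awaiting a stream that never comes.
var options = new AzureOpenAIClientOptions
{
    Transport = new PromoteHttpStatusErrorsPipelineTransport() // defined in the next snippet
};

var client = new AzureOpenAIClient(
    new Uri("https://my-resource.openai.azure.com/"), // placeholder endpoint
    new ApiKeyCredential("<api-key>"),                // placeholder credential
    options);
```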
And in that custom pipeline, override OnReceivedResponse to handle the 429.
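A sketch of one possible implementation, assuming the OnReceivedResponse hook on System.ClientModel's HttpClientPipelineTransport; throwing HttpRequestException is an illustrative choice, not the library's own error type:

```csharp
using System.ClientModel.Primitives;
using System.Net.Http;

// Sketch of the PromoteHttpStatusErrorsPipelineTransport hack: fail fast on any
// non-success status instead of letting the caller await a stream that never arrives.
public class PromoteHttpStatusErrorsPipelineTransport : HttpClientPipelineTransport
{
    protected override void OnReceivedResponse(PipelineMessage message, HttpResponseMessage httpResponse)
    {
        base.OnReceivedResponse(message, httpResponse);

        if (!httpResponse.IsSuccessStatusCode)
        {
            // OnReceivedResponse is synchronous, so the response body (which holds the
            // error detail) cannot be read here without a blocking .Result - see below.
            throw new HttpRequestException(
                $"OpenAI request failed with status {(int)httpResponse.StatusCode} ({httpResponse.ReasonPhrase}).");
        }
    }
}
```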
When !httpResponse.IsSuccessStatusCode, the error detail is in httpResponse.Content, but every way of reading it is async (and OnReceivedResponse is not). That would mean using a blocking .Result, and that cascades the failure: a busy server returning lots of 429s means lots of blocked threads, which means the whole service hangs.

Expected behavior
CompleteChatStreamingAsync should not hang. When IsSuccessStatusCode == false, that should be handled and the response read to provide error context. When no response is being streamed, the result should not stall an await.

This is how the implementation in the chat playground in Azure AI Studio behaves:
This could be an exception being thrown, or a combined result of the streaming collection plus the error context when nothing is streamed.
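For example, a minimal sketch of how calling code could consume that, assuming the client surfaced a System.ClientModel ClientResultException for non-2xx responses (not the current behavior); chatClient and messages are placeholders:

```csharp
using System.ClientModel;

try
{
    // Assumed desired behavior: the stream either yields updates or throws.
    await foreach (var update in chatClient.CompleteChatStreamingAsync(messages))
    {
        // consume streamed tokens
    }
}
catch (ClientResultException ex) when (ex.Status == 429)
{
    // Back off and retry, or surface the rate-limit detail to the caller.
}
```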
If this can't be fixed in this library (say it's a bug in the underlying implementation), then OnReceivedResponse needs to be async and called with await so that the response body can be read without blocking.
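A rough sketch of that suggestion, purely hypothetical since System.ClientModel has no such awaitable hook today:

```csharp
using System.ClientModel.Primitives;
using System.Net.Http;
using System.Threading.Tasks;

public class AsyncAwareTransport : HttpClientPipelineTransport
{
    // Proposed/assumed signature - not part of the current library surface.
    protected virtual async ValueTask OnReceivedResponseAsync(
        PipelineMessage message, HttpResponseMessage httpResponse)
    {
        if (!httpResponse.IsSuccessStatusCode)
        {
            // An async hook could read the error detail without blocking a thread.
            string detail = await httpResponse.Content.ReadAsStringAsync();
            throw new HttpRequestException(
                $"OpenAI request failed ({(int)httpResponse.StatusCode}): {detail}");
        }
    }
}
```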
Actual behavior

await chatClient.CompleteChatStreamingAsync never resolves when the endpoint returns an error HTTP status code.

Reproduction Steps
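A minimal sketch, assuming the AzureOpenAIClient wiring shown in the workaround section above; the deployment name and prompt are placeholders:

```csharp
using System;
using OpenAI.Chat;

// Drive the deployment past its rate limit so the service answers 429,
// then start a streaming completion.
ChatClient chatClient = client.GetChatClient("my-deployment"); // placeholder deployment

var updates = chatClient.CompleteChatStreamingAsync(new UserChatMessage("Hello"));

// With a 2xx response this yields tokens; with a 429 the await never completes.
await foreach (StreamingChatCompletionUpdate update in updates)
{
    foreach (ChatMessageContentPart part in update.ContentUpdate)
    {
        Console.Write(part.Text);
    }
}
```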
Environment
Tested in VS Code 1.92.2 and Visual Studio Professional 2022 17.10.3