Streaming doesn't work properly in Blazor WASM

MoienTajik commented 2 weeks ago

There is a problem with streaming using IAsyncEnumerable in Blazor WASM. When consuming the stream in a console app or any environment other than Blazor WASM, the items are processed one by one as they are fetched. However, in Blazor WASM, the stream behaves differently, and the response is only available after all items are fetched from the remote resource.

The issue affects both ChatClient.CompleteChatStreamingAsync and AssistantClient.CreateRunStreamingAsync methods (and potentially any other streaming methods).

Steps to reproduce:

1. Set up a Blazor WASM project.
2. Use the following code snippet to fetch streaming updates from a remote resource.
3. Observe the behavior difference compared to a console app.

Code:

var messageContent = MessageContent.FromText(prompt);

var streamingUpdates = assistantClient.CreateRunStreamingAsync(
    thread.Id,
    assistant.Id,
    new()
    {
        AdditionalMessages = { new([messageContent]) }
    }
);

await foreach (var streamingUpdate in streamingUpdates)
{
    switch (streamingUpdate.UpdateKind)
    {
        case StreamingUpdateReason.MessageCreated:
            Console.WriteLine($"Message created: {DateTimeOffset.Now:O}");
            break;

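        // The pattern variable needs its own name: reusing messageContent would clash with the local declared above (CS0136).
        case StreamingUpdateReason.MessageUpdated when streamingUpdate is MessageContentUpdate contentUpdate:
            Console.WriteLine($"Message updated: {contentUpdate.Text} -- {DateTimeOffset.Now:O}");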
            break;
    }
}

Expected Behavior:

Each item should be processed as it is fetched from the remote resource, similar to the behavior observed in a console app.

Actual Behavior:

In Blazor WASM, the stream processes all items only after they are completely fetched, rather than one by one.

Environment:

Blazor WASM on .NET 8
OpenAI NuGet package version: 2.0.0-beta.5

Additional Information:

The problem is detailed in this blog post. To get consistent streaming behavior between Blazor WASM and other environments, the library needs to call SetBrowserResponseStreamingEnabled(true) on the HttpRequestMessage and pass HttpCompletionOption.ResponseHeadersRead when sending it.
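For reference, here is a minimal sketch of what those two settings look like on a plain HttpClient call in a Blazor WASM app targeting .NET 8. The BrowserStreaming class, the ReadLinesAsync method, and the url parameter are hypothetical names used only for illustration; they are not part of the OpenAI package, and the library's own request pipeline may be structured differently.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components.WebAssembly.Http; // SetBrowserResponseStreamingEnabled

// Hypothetical helper for illustration only; not part of the OpenAI package.
public static class BrowserStreaming
{
    // Yields response lines as they arrive rather than after the whole body is buffered.
    public static async IAsyncEnumerable<string> ReadLinesAsync(
        HttpClient httpClient,
        string url, // placeholder endpoint supplied by the caller
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, url);

        // Opt in to response streaming on the browser's fetch-based handler.
        request.SetBrowserResponseStreamingEnabled(true);

        // Complete the send as soon as the headers are read, without buffering the content.
        using var response = await httpClient.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead, cancellationToken);
        response.EnsureSuccessStatusCode();

        await using var stream = await response.Content.ReadAsStreamAsync(cancellationToken);
        using var reader = new StreamReader(stream);

        // ReadLineAsync returns null at end of stream, which ends the loop.
        while (await reader.ReadLineAsync(cancellationToken) is { } line)
        {
            yield return line;
        }
    }
}
```

Consuming the result with await foreach should then surface each line as the server sends it, assuming the server itself flushes incrementally.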

KrzysztofCwalina commented 1 week ago

@annelo-msft, we need to allow setting/calling HttpRequestMessage.SetBrowserResponseStreamingEnabled(true) on HttpClient's messages to fix this problem.

annelo-msft commented 6 days ago

Tracking with https://github.com/Azure/azure-sdk-for-net/issues/44706

annelo-msft commented 6 days ago

@KrzysztofCwalina, I believe we may already have the method we need on HttpClientPipelineTransport in the OnSendingRequest method.

For example, following the repro steps above, I am able to create a sample project and add a sample transport implementation that overrides OnSendingRequest:

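using System.ClientModel.Primitives;
using System.Net.Http;
using Microsoft.AspNetCore.Components.WebAssembly.Http;

public class BlazorHttpClientTransport : HttpClientPipelineTransport
{
    protected override void OnSendingRequest(PipelineMessage message, HttpRequestMessage httpRequest)
    {
        // Ask the browser's HTTP handler to stream the response instead of buffering it.
        httpRequest.SetBrowserResponseStreamingEnabled(true);
    }
}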

The default SCM transport already passes HttpCompletionOption.ResponseHeadersRead to HttpClient.Send, so I believe that requirement should be addressed without any additional customization to the transport implementation.

That can then be added to the client by passing an instance of OpenAIClientOptions as follows:

OpenAIClientOptions options = new();
options.Transport = new BlazorHttpClientTransport();
ChatClient client = new(model: "gpt-4o", apiKey, options);

If this doesn't address the issue as you were thinking @KrzysztofCwalina, let me know.

Or @MoienTajik, if this doesn't address the problem you're seeing, I'm happy to dig further into what's needed here.

Thanks!

MoienTajik commented 5 days ago

Thanks, Anne, for investigating this! I can confirm that this works and solves the problem.
