GetStreamingChatMessageContentsAsync not working

Kmasterrr commented 6 months ago

Hi there,

Thanks so much for the connector. Please could you describe how to get message streaming working?

I have set up the project accordingly, but when I attempt to stream I receive an error:

result = chatCompletionService.GetStreamingChatMessageContentsAsync( history, executionSettings: _openAIPromptExecutionSettings, kernel: kernel);

System.Text.Json.JsonException
HResult=0x80131500
Message='{' is invalid after a single JSON value. Expected end of data. Path: $ | LineNumber: 1 | BytePositionInLine: 0.
Source=System.Text.Json
StackTrace:
   at System.Text.Json.ThrowHelper.ReThrowWithPath(ReadStack& state, JsonReaderException ex)
   at System.Text.Json.Serialization.JsonConverter1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadFromSpan[TValue](ReadOnlySpan1 utf8Json, JsonTypeInfo1 jsonTypeInfo, Nullable1 actualByteCount)
   at System.Text.Json.JsonSerializer.ReadFromSpan[TValue](ReadOnlySpan1 json, JsonTypeInfo1 jsonTypeInfo)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](String json, JsonSerializerOptions options)
   at Codeblaze.SemanticKernel.Connectors.Ollama.OllamaChatCompletionService.d2.MoveNext()
   at Codeblaze.SemanticKernel.Connectors.Ollama.OllamaChatCompletionService.d2.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at xxx.d2.MoveNext() in xxx
   at xxx.d2.MoveNext() in xxx
   at Program.<$>d__0.MoveNext() in xxxx

This exception was originally thrown at this call stack: [External Code]

Inner Exception 1: JsonReaderException: '{' is invalid after a single JSON value. Expected end of data. LineNumber: 1 | BytePositionInLine: 0.

william-daconceicao commented 4 months ago

Can you make sure that when you provide the Ollama API URL, you're not using a trailing slash in the URL?

mbaske commented 4 months ago

Hi - Same here. There seems to be an issue with the jsonResponse variable content in https://github.com/BLaZeKiLL/Codeblaze.SemanticKernel/blob/main/dotnet/Codeblaze.SemanticKernel.Connectors.Ollama/ChatCompletion/OllamaChatCompletionService.cs#L67

When I try to run a streaming chat completion, the variable contains all response chunks:

{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.607016404Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.618778759Z","message":{"role":"assistant","content":" sky"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.630561283Z","message":{"role":"assistant","content":" appears"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.642207649Z","message":{"role":"assistant","content":" blue"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.653852956Z","message":{"role":"assistant","content":" because"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.665474375Z","message":{"role":"assistant","content":" of"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.677132728Z","message":{"role":"assistant","content":" a"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.688783164Z","message":{"role":"assistant","content":" phenomenon"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.700451403Z","message":{"role":"assistant","content":" called"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.712002743Z","message":{"role":"assistant","content":" Ray"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.72352729Z","message":{"role":"assistant","content":"leigh"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.735108166Z","message":{"role":"assistant","content":" scattering"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.746673503Z","message":{"role":"assistant","content":","},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.758123132Z","message":{"role":"assistant","content":" where"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.769549156Z","message":{"role":"assistant","content":" shorter"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.780985197Z","message":{"role":"assistant","content":" blue"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.792404703Z","message":{"role":"assistant","content":" wavelengths"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.803854295Z","message":{"role":"assistant","content":" scatter"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.815322862Z","message":{"role":"assistant","content":" more"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.826758498Z","message":{"role":"assistant","content":" than"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.838206235Z","message":{"role":"assistant","content":" longer"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.849689916Z","message":{"role":"assistant","content":" ones"},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.861138488Z","message":{"role":"assistant","content":"."},"done":false}
{"model":"llama3:latest","created_at":"2024-06-14T08:19:40.872654676Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":497213323,"load_duration":2548022,"prompt_eval_count":27,"prompt_eval_duration":98577000,"eval_count":24,"eval_duration":265580000}

which can't be deserialized to a single OllamaChatResponseMessage instance.
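
For illustration, here is a minimal sketch of parsing such a payload chunk by chunk instead of as one document. It assumes Ollama sends one JSON object per line (newline-delimited JSON). `NdjsonChunks` and `ReadContents` are hypothetical names for this example and are not part of the connector, and it reads only the `message.content` field rather than mapping to `OllamaChatResponseMessage`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical helper, not the connector's code: treats the payload as
// newline-delimited JSON and parses every non-empty line on its own.
public static class NdjsonChunks
{
    public static IEnumerable<string> ReadContents(string payload)
    {
        using var reader = new StringReader(payload);
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            // Skip blank separator lines between chunks.
            if (string.IsNullOrWhiteSpace(line)) continue;

            // Each line is a complete JSON document, so parsing it alone
            // avoids the "invalid after a single JSON value" error.
            using var doc = JsonDocument.Parse(line);
            var root = doc.RootElement;

            if (root.TryGetProperty("message", out var message) &&
                message.TryGetProperty("content", out var content))
            {
                yield return content.GetString() ?? string.Empty;
            }
        }
    }
}
```

Joining the yielded strings in order should give the full assistant reply; the final chunk (`"done":true`) carries an empty `content`, so it adds nothing to the text.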

magols commented 4 months ago

I have an improved version in #13, based on @william-daconceicao's work in #12, so that the streamed content of the response can be read as soon as the HTTP headers have been received.
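
For readers who have not opened the PR, the general technique can be sketched roughly as follows. This is not the code from #13: `StreamingLines`, `SendAndReadLinesAsync`, and `ReadLinesAsync` are hypothetical names, and cancellation and error handling are left out for brevity. The key piece is `HttpCompletionOption.ResponseHeadersRead`, which lets `HttpClient.SendAsync` return before the body has been buffered.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical sketch of header-first streaming; not the connector's code.
public static class StreamingLines
{
    // Sends the request and yields response lines as they arrive.
    public static async IAsyncEnumerable<string> SendAndReadLinesAsync(
        HttpClient client, HttpRequestMessage request)
    {
        // ResponseHeadersRead completes the call once headers are in,
        // instead of waiting for the whole body to download.
        using var response = await client.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead);
        response.EnsureSuccessStatusCode();

        using var stream = await response.Content.ReadAsStreamAsync();
        await foreach (var line in ReadLinesAsync(stream))
        {
            yield return line;
        }
    }

    // Reads a stream line by line, skipping empty lines.
    public static async IAsyncEnumerable<string> ReadLinesAsync(Stream stream)
    {
        using var reader = new StreamReader(stream);
        while (true)
        {
            var line = await reader.ReadLineAsync();
            if (line == null) break;          // end of stream
            if (line.Length > 0) yield return line;
        }
    }
}
```

A caller would consume this with `await foreach` and deserialize each yielded line as its own chunk, so the first tokens can be shown while the model is still generating.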

See before/after here https://www.youtube.com/watch?v=toSgN-HUOIQ

BLaZeKiLL commented 4 months ago

Thanks for this. I have been working on getting the Ollama plugin merged into the source repository, so I haven't tested this for a while, but I have merged #13.

BLaZeKiLL / Codeblaze.SemanticKernel

GetStreamingChatMessageContentsAsync not working #11