microsoft / semantic-kernel


.Net: IAutoFunctionInvocationFilter termination response isn't consistent with the none termination #6810

Closed EdenTanami closed 2 months ago

EdenTanami commented 3 months ago

Describe the bug
The Azure OpenAI chat completion appends the assistant's tool-call messages and the tool response messages to the chat history (the same applies to streaming and non-streaming). In a non-termination use case, the given chat history is appended with those messages and the response is the assistant's final answer. When AutoFunctionInvocationContext.Terminate is set to true, the chat history is appended in the same way, but since there is no final assistant answer (due to the termination), the response is the last tool call result. In that case we get the last result twice: as the last message of the given chat history and also in the response.

We would expect the response to be null or empty, or to get some indication that the pipeline was terminated, so that we know not to append the response to the chat history (and end up with it twice).

To Reproduce
Steps to reproduce the behavior:

  1. Call the Azure OpenAI chat completion GetChatMessageContentsAsync or GetStreamingChatMessageContentsAsync with messages that trigger an auto function call.
  2. Register a filter that terminates after the function invocation (a runnable sketch of these steps follows the list): `public async Task OnAutoFunctionInvocationAsync(AutoFunctionInvocationContext context, Func<AutoFunctionInvocationContext, Task> next) { await next(context); context.Terminate = true; }`
  3. Examine the response of the chat completion and the chat history that was passed in and manipulated.
  4. See that the content of the last chat history message and the response's content are the same.
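
A minimal repro sketch of the steps above (the connection values, the TimePlugin, and the TerminatingFilter are illustrative placeholders; written against the SK 1.14-era API, where some of the filter abstractions may still be marked experimental depending on the exact package version):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Wire up the kernel; the deployment, endpoint and key are placeholders.
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion("<deployment>", "https://<resource>.openai.azure.com/", "<api-key>");
builder.Services.AddSingleton<IAutoFunctionInvocationFilter, TerminatingFilter>();
builder.Plugins.AddFromType<TimePlugin>();
Kernel kernel = builder.Build();

var settings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var chatService = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("What time is it right now?");

ChatMessageContent response =
    await chatService.GetChatMessageContentAsync(history, settings, kernel);

// With Terminate = true there is no final assistant answer, so (per this issue)
// the returned content duplicates the tool result already appended to the history.
Console.WriteLine(response.Content == history.Last().Content);

// Hypothetical plugin so the model has a function to auto-invoke.
public sealed class TimePlugin
{
    [KernelFunction]
    public string GetCurrentTime() => DateTime.UtcNow.ToString("O");
}

// The filter from step 2: run the function, then terminate the auto-invocation loop.
public sealed class TerminatingFilter : IAutoFunctionInvocationFilter
{
    public async Task OnAutoFunctionInvocationAsync(
        AutoFunctionInvocationContext context,
        Func<AutoFunctionInvocationContext, Task> next)
    {
        await next(context);
        context.Terminate = true;
    }
}
```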

Expected behavior
Do not duplicate the last tool call response. Provide a built-in way to detect that a termination occurred, in both the streaming and non-streaming cases.

Screenshots: two screenshots attached in the original issue.

Platform
- OS: Windows
- IDE: Visual Studio
- Language: C#
- Source: SK NuGet package version 1.14.0

dmytrostruk commented 3 months ago

@EdenTanami Thanks for reporting this issue.

> We would expect the response to be null or empty

I'm not sure we want to return a null response in case of termination, because if you are using IChatCompletionService.GetChatMessageContentsAsync directly you have access to the ChatHistory object, but that's not the case when using kernel.InvokePromptAsync, so returning the assistant message or function result on termination is important.

I think it would be better if, in the non-termination scenario (even without using a filter), we appended the final assistant response to the chat history in all cases and also returned it as the final result from the GetChatMessageContentsAsync method. In that case it wouldn't be necessary to manually update the chat history every time, since it would already contain all processed messages. However, I'm not sure it's possible to apply such logic at the current stage, because it would be a breaking change for users who update the chat history with the final result manually (they would end up with the same assistant message in the chat history twice).

Quick overview of current behavior:

- Without termination, the returned result is the assistant's final answer; the tool-call and tool-result messages are appended to the chat history, but the final assistant answer is not.
- With termination, the returned result is the last function (tool) result, which is already the last message in the chat history, and its role is Tool rather than Assistant.

Based on that, you need to update your chat history only when the resulting chat message content has the Assistant role, and you have access to the Role property to check that.
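
For example, a minimal sketch of that check (ChatHistoryHelper and GetAndAppendAsync are hypothetical names, not part of SK; it uses the single-message GetChatMessageContentAsync extension, but the same applies to the items returned by GetChatMessageContentsAsync):

```csharp
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public static class ChatHistoryHelper // hypothetical helper, not part of SK
{
    // Appends the result to the history only when it is a final assistant answer.
    public static async Task<ChatMessageContent> GetAndAppendAsync(
        IChatCompletionService chatService,
        ChatHistory chatHistory,
        PromptExecutionSettings settings,
        Kernel kernel)
    {
        ChatMessageContent result =
            await chatService.GetChatMessageContentAsync(chatHistory, settings, kernel);

        // Assistant role: a final model answer that is not yet in the history.
        // Any other role (e.g. Tool after a filter termination): already the last
        // message in the history, so skip the append to avoid the duplication.
        if (result.Role == AuthorRole.Assistant)
        {
            chatHistory.Add(result);
        }

        return result;
    }
}
```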

Let me know if this covers your scenario. We could always add a metadata property to identify that a termination took place, but I think for now chatMessageContent.Role can work as the identifier. Thanks!

EdenTanami commented 3 months ago

Hey @dmytrostruk, thanks for the quick response. Splitting my answer so it will be easier to address each part:

  1. kernel.InvokePromptAsync isn't related to the termination, right? Since the IAutoFunctionInvocationFilter is only relevant to the auto-invocation loop? I'm not familiar with kernel.InvokePromptAsync, so maybe I'm wrong here.

  2. In general, I think the usage should be consistent with the non-termination case and avoid the need for a termination check when appending the response to the history. If adding the assistant message to the chat history in the non-termination case is breaking, does that mean it's not an option, or just something that takes more time?

  3. Either way, checking the role is a good solution for now. The problem is that when using streaming we don't get back a message object, only a stream of the content. How do you advise checking whether the response should be appended to the history in the streaming scenario?

thanks!

dmytrostruk commented 3 months ago

> kernel.InvokePromptAsync isn't related to the termination, right? Since the IAutoFunctionInvocationFilter is only relevant to the auto-invocation loop? I'm not familiar with kernel.InvokePromptAsync, so maybe I'm wrong here.

@EdenTanami The kernel.InvokePromptAsync method is another entry point and can also be used for function calling. In this case filters are triggered as well, so the termination logic is also available; here is an example: https://github.com/microsoft/semantic-kernel/blob/745e64a0265923c926c26a3e61127433f6506ba1/dotnet/samples/Concepts/Filtering/AutoFunctionInvocationFiltering.cs#L22-L35
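
A short sketch along those lines (PromptEntryPointDemo and RunAsync are hypothetical names; it assumes a kernel with a plugin and an auto function invocation filter already registered, as in the repro sketch above):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

public static class PromptEntryPointDemo // hypothetical helper, not part of SK
{
    // Invokes via a prompt; auto function invocation filters registered on the
    // kernel (including one that sets context.Terminate = true) are triggered here too.
    public static async Task RunAsync(Kernel kernel)
    {
        var settings = new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
        };

        FunctionResult result = await kernel.InvokePromptAsync(
            "What time is it right now?",
            new KernelArguments(settings));

        // There is no ChatHistory to inspect on this path, which is why the
        // terminated result still needs to be surfaced as the return value.
        Console.WriteLine(result);
    }
}
```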

> If adding the assistant message to the chat history in the non-termination case is breaking, does that mean it's not an option, or just something that takes more time?

It is breaking, and most probably we won't change this behavior at this stage unless it blocks some scenario or something is not achievable without it.

> The problem is that when using streaming we don't get back a message object, only a stream of the content.

In streaming, the result type is StreamingChatMessageContent, which is similar to the ChatMessageContent type in the non-streaming scenario; it also has Content and Role properties. Here is a usage example: https://github.com/microsoft/semantic-kernel/blob/745e64a0265923c926c26a3e61127433f6506ba1/dotnet/samples/Concepts/ChatCompletion/OpenAI_ChatCompletionStreaming.cs#L146-L158

We also return Role and Content in the streaming termination scenario, so it should be possible to perform a similar Role check as in the non-streaming case: https://github.com/microsoft/semantic-kernel/blob/745e64a0265923c926c26a3e61127433f6506ba1/dotnet/src/Connectors/Connectors.OpenAI/AzureSdk/ClientCore.cs#L842-L853
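
A hedged sketch of that check in the streaming case (StreamingChatHistoryHelper and StreamAndAppendAsync are hypothetical names; it accumulates the streamed chunks and uses the first streamed Role to decide whether the content should be appended):

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public static class StreamingChatHistoryHelper // hypothetical helper, not part of SK
{
    // Streams the answer, then appends it to the history only when the
    // streamed role indicates a final assistant answer.
    public static async Task<string> StreamAndAppendAsync(
        IChatCompletionService chatService,
        ChatHistory chatHistory,
        PromptExecutionSettings settings,
        Kernel kernel)
    {
        var buffer = new StringBuilder();
        AuthorRole? role = null;

        await foreach (StreamingChatMessageContent chunk in
            chatService.GetStreamingChatMessageContentsAsync(chatHistory, settings, kernel))
        {
            role ??= chunk.Role; // the role is expected on the first chunk(s)
            buffer.Append(chunk.Content);
        }

        // Tool role (termination case): the tool result is already in the history, so skip it.
        if (role == AuthorRole.Assistant)
        {
            chatHistory.AddAssistantMessage(buffer.ToString());
        }

        return buffer.ToString();
    }
}
```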

Let me know if this is helpful. Thanks!

EdenTanami commented 2 months ago

thanks!