microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel

.Net: Bug: Function calling does not work in GetStreamingChatMessageContentsAsync() when using Ollama #9752

Closed. Vold-Hal closed this issue 4 days ago.

Vold-Hal commented 5 days ago

Describe the bug

When using GetStreamingChatMessageContentsAsync() with Ollama models, the response is returned as a JSON object containing the function invocation details (name and parameters), but the tool function is not automatically invoked. GetChatMessageContentAsync() usually triggers the plugin function. The issue may be caused by the way responses are segmented and handled in streaming mode, potentially leading to function calls that are never executed.

To Reproduce

using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

class Program
{
    static async Task Main()
    {
        IKernelBuilder builder = Kernel.CreateBuilder()
                .AddOpenAIChatCompletion(modelId: "llama3.2", apiKey: "ollama", endpoint: new Uri("http://localhost:11434/v1"));

        //Plugins
        builder.Plugins.AddFromType<EmbeddingPlugin>();

        Kernel kernel = builder.Build();
        IChatCompletionService chatService = kernel.GetRequiredService<IChatCompletionService>();

        ChatHistory chatHistory = new ChatHistory();
        chatHistory.AddUserMessage("Can you get vector from string \"example text\"");
        var completion = chatService.GetStreamingChatMessageContentsAsync(
            chatHistory,
            executionSettings: new OpenAIPromptExecutionSettings()
            {
                ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
            },
            kernel: kernel
        );

        string answer = "";
        Console.WriteLine();
        await foreach (var content in completion)
        {
            answer += content.Content;
            Console.Write(content.Content);
        }
    }
    public class EmbeddingPlugin
    {
        public EmbeddingPlugin()
        {
        }

        [KernelFunction("get_vector_from_string")]
        [Description("embeds text into a vector using ai model and returns vector.")]
        [return: Description("Array created based on vectorised data")]
        public string GetVectorFromString(string text)
        {
            return "[.56, .84, .69, .18, .41]";
        }

    }
}

Expected behavior

A function should be called; instead, the chat just outputs {"name": "EmbeddingPlugin-get_vector_from_string", "parameters": {"text": "example text"}}.
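For comparison, a rough sketch of the non-streaming call that does invoke the plugin in the same setup (this reuses the kernel, chatService, and chatHistory from the repro above):

// Non-streaming variant: with the same execution settings, this usually
// auto-invokes get_vector_from_string instead of printing the raw JSON.
var result = await chatService.GetChatMessageContentAsync(
    chatHistory,
    executionSettings: new OpenAIPromptExecutionSettings()
    {
        ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
    },
    kernel: kernel
);
Console.WriteLine(result.Content);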

Platform

Vold-Hal commented 5 days ago

I probably should have added this: I am running Ollama locally through WSL.

RogerBarreto commented 5 days ago

@Vold-Hal, using Ollama through the OpenAI connector is not a supported scenario. Please use our Ollama connector.

I also suggest shifting to the new function-calling abstractions:

executionSettings: new OllamaPromptExecutionSettings { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() }
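For reference, a rough end-to-end sketch of that switch (this assumes the prerelease Microsoft.SemanticKernel.Connectors.Ollama package and its AddOllamaChatCompletion extension; check the exact names against the version you install):

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.Ollama;

// Point the Ollama connector at the native Ollama endpoint (no /v1 suffix).
IKernelBuilder builder = Kernel.CreateBuilder()
    .AddOllamaChatCompletion(modelId: "llama3.2", endpoint: new Uri("http://localhost:11434"));
builder.Plugins.AddFromType<EmbeddingPlugin>();
Kernel kernel = builder.Build();

IChatCompletionService chatService = kernel.GetRequiredService<IChatCompletionService>();
ChatHistory chatHistory = new ChatHistory();
chatHistory.AddUserMessage("Can you get vector from string \"example text\"");

// Non-streaming call with the new function choice behavior; the connector
// should invoke EmbeddingPlugin.get_vector_from_string automatically.
var result = await chatService.GetChatMessageContentAsync(
    chatHistory,
    executionSettings: new OllamaPromptExecutionSettings { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() },
    kernel: kernel);
Console.WriteLine(result.Content);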

Let me know if that worked for you.

Thanks!

RogerBarreto commented 4 days ago

@Vold-Hal Function invocation is currently not supported by Ollama for streaming APIs. Once this feature becomes available and is added to the OllamaSharp library implementation, our connector will be able to trigger function calling for streaming outputs.

See: