microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License
21.91k stars · 3.27k forks

.Net Investigate OpenAI Connector with Latest Ollama OpenAI Compatible Endpoints #5327

Closed RogerBarreto closed 6 months ago

RogerBarreto commented 8 months ago

It seems that when using the same approach as in PR #4753, consuming the Ollama API breaks with the error below:

Microsoft.SemanticKernel.HttpOperationException: json: cannot unmarshal array into Go struct field Message.messages.content of type string
Status: 400 (Bad Request)

Content:
{"error":{"message":"json: cannot unmarshal array into Go struct field Message.messages.content of type string","type":"invalid_request_error","param":null,"code":null}}

Headers:
Date: Tue, 05 Mar 2024 21:31:57 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 169

---> Azure.RequestFailedException: json: cannot unmarshal array into Go struct field Message.messages.content of type string
Status: 400 (Bad Request)

Content:
{"error":{"message":"json: cannot unmarshal array into Go struct field Message.messages.content of type string","type":"invalid_request_error","param":null,"code":null}}

Headers:
Date: Tue, 05 Mar 2024 21:31:57 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 169

   at Azure.Core.HttpPipelineExtensions.ProcessMessageAsync(HttpPipeline pipeline, HttpMessage message, RequestContext requestContext, CancellationToken cancellationToken)
   at Azure.AI.OpenAI.OpenAIClient.GetChatCompletionsAsync(ChatCompletionsOptions chatCompletionsOptions, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.RunRequestAsync[T](Func`1 request)
   --- End of inner exception stack trace ---
   at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.RunRequestAsync[T](Func`1 request)
   at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.GetChatMessageContentsAsync(ChatHistory chat, PromptExecutionSettings executionSettings, Kernel kernel, CancellationToken cancellationToken)
   at 

Some further investigation is needed to understand these new APIs and eventually support them in the new Ollama connector as well.
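For context, the Go error (`cannot unmarshal array into Go struct field Message.messages.content of type string`) indicates that Ollama's server expects each message's `content` to be a plain string, while the request it received carried `content` as an array of content parts. A request body shaped roughly like this (a sketch inferred from the error message, not captured from the wire) would reproduce the 400:

```json
{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Why is the sky blue?" }
      ]
    }
  ]
}
```

Sending `content` as a plain string is what Ollama's OpenAI-compatible endpoint accepts.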

luisquintanilla commented 8 months ago

Thanks for opening this issue @RogerBarreto.

Here's some code I wrote to confirm that Ollama's OpenAI-compatible API works:

Source

open System
open System.Net.Http
open System.Net.Http.Json

let client = new HttpClient()

// Ollama's default local endpoint
client.BaseAddress <- new Uri("http://localhost:11434")

let body = 
    {|
        model="llama2"
        messages=[|
            {|role="system" ; content="You are a helpful AI assistant that knows science facts"|}
            {|role="user" ; content="Why is the sky blue?"|}
        |]
        stream=false
    |}

let req = 
    task {
        let! res = client.PostAsJsonAsync("/v1/chat/completions", body)
        let! content = res.Content.ReadFromJsonAsync<{|choices:{| index:int; message: {|role:string;content:string|} |} array |}>()
        return content
    }

req 
|> Async.AwaitTask
|> Async.RunSynchronously
|> printfn "%A"

Response

{ choices =                                                                                                 
   [|{ index = 0                                                                                            
       message =                                                                                            
        { content =                                                                                         
           "Ah, an excellent question! The reason why the sky appears blue is due to a phenomenon called Rayleigh scattering. This is named after Lord Rayleigh, who first described the process in the late 19th century.                                                                                                          

Rayleigh scattering occurs when light from the sun travels through the Earth's atmosphere and encounters tiny molecules of gases such as nitrogen and oxygen. These molecules act like tiny mirrors, reflecting the light in all directions. The shorter wavelengths of light, such as blue and violet, are scattered more than the longer wavelengths, such as red and orange. This is why we see the sky as a blue color during the daytime.  

The reason why the Earth's atmosphere scatters light in this way is due to its composition. The air molecules are much smaller than the wavelength of light, so they can't absorb it. Instead, they bounce the light around like tiny mirrors, causing the sky to appear blue.                                                      

Interestingly, the color of the sky can also be affected by other factors such as pollution, dust, and water vapor in the atmosphere. For example, during sunrise and sunset, when the sun's rays pass through more of the Earth's atmosphere, the sky can take on hues of orange and red due to the scattering of light by larger molecules.                                                                                                   

I hope that helps you understand why the sky is blue! Is there anything else you would like to know?"       
          role = "assistant" } }|] }                                                                        

luisquintanilla commented 8 months ago

Source code that causes the failure

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;
using System.Text.Json.Serialization;

var kernel =
    Kernel
        .CreateBuilder()
        .AddOllamaChatCompletion()
        .Build();

var prompt = "Why is the sky blue?";
var response = await kernel.InvokePromptAsync(prompt);

Console.WriteLine(response);

public sealed class OllamaHttpMessageHandler : HttpClientHandler
{
    private string _modelServiceUrl ="http://localhost:11434";

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request.RequestUri != null && request.RequestUri.Host.Equals("api.openai.com", StringComparison.OrdinalIgnoreCase))
        {
            request.RequestUri = new Uri($"{_modelServiceUrl}{request.RequestUri.PathAndQuery}");
        }

        return base.SendAsync(request, cancellationToken);
    }
}

public static class KernelBuilderExtensions
{
    public static IKernelBuilder AddOllamaChatCompletion(this IKernelBuilder builder)
    {
        var client = new HttpClient(new OllamaHttpMessageHandler());

        // Ollama (like LM Studio) ignores these values by default, so placeholders
        // are passed for the model id and API key.
        builder.AddOpenAIChatCompletion("local-model", "local-api-key", httpClient: client);
        return builder;
    }
}

JadynWong commented 8 months ago

I missed it. Seems to be related to #5337.

asabla commented 8 months ago

Seems like this has been partially solved. At least the non-streaming example now works (see output and code example below). Streaming, however, still has some weird behavior attached to it (the output is garbled), but that could also be Ollama not exposing things correctly (I haven't investigated this too much yet).

ollama version: 0.1.29
Microsoft.SemanticKernel NuGet version: 1.6.2

Output

(screenshot of the console output omitted)

Code

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var builder = Kernel.CreateBuilder();

builder.AddOllamaChatCompletion(modelId: "llama2:13b");

var kernel = builder.Build();

var prompt = @"{{$input}}

One line TLDR with the fewest words.";

var summarize = kernel.CreateFunctionFromPrompt(
        promptTemplate: prompt,
        executionSettings: new OpenAIPromptExecutionSettings
        {
            MaxTokens = 100
        });

string text1 = @"
1st Law of Thermodynamics - Energy cannot be created or destroyed.
2nd Law of Thermodynamics - For a spontaneous process, the entropy of the universe increases.
3rd Law of Thermodynamics - A perfect crystal at zero Kelvin has zero entropy.";

string text2 = @"
1. An object at rest remains at rest, and an object in motion remains in motion at constant speed and in a straight line unless acted on by an unbalanced force.
2. The acceleration of an object depends on the mass of the object and the amount of force applied.
3. Whenever one object exerts a force on another object, the second object exerts an equal and opposite force on the first.

Console.WriteLine(await kernel.InvokeAsync(summarize, new() { ["input"] = text1 }));
Console.WriteLine(await kernel.InvokeAsync(summarize, new() { ["input"] = text2 }));

// --------------- Extensions and handlers -----------------------
public sealed class OllamaHttpMessageHandler : HttpClientHandler
{
    private string _modelServiceUrl = "http://localhost:11434";

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request.RequestUri != null && request.RequestUri.Host.Equals("api.openai.com", StringComparison.OrdinalIgnoreCase))
        {
            request.RequestUri = new Uri($"{_modelServiceUrl}{request.RequestUri.PathAndQuery}");
        }

        return base.SendAsync(request, cancellationToken);
    }
}

public static class KernelBuilderExtension
{
    public static IKernelBuilder AddOllamaChatCompletion(
        this IKernelBuilder builder,
        string modelId)
    {
        var httpClient = new HttpClient(new OllamaHttpMessageHandler());

        builder.AddOpenAIChatCompletion(
            modelId: modelId,
            apiKey: "ollama",
            httpClient: httpClient);

        return builder;
    }
}
// --------------- Extensions and handlers -----------------------

RogerBarreto commented 6 months ago

With the recently added support for custom endpoints in the OpenAI connector, this seems to be resolved.
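For readers hitting this today, a minimal sketch of the endpoint-based setup (assuming a Semantic Kernel version that exposes the `endpoint` overload of `AddOpenAIChatCompletion`; exact parameter names may differ between versions):

```csharp
using Microsoft.SemanticKernel;

// Point the OpenAI connector directly at Ollama's OpenAI-compatible base URL
// instead of rewriting requests with a custom HttpClientHandler.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "llama2",                              // any model pulled into Ollama
        endpoint: new Uri("http://localhost:11434/v1"), // Ollama's OpenAI-compatible endpoint
        apiKey: "ollama")                               // ignored by Ollama
    .Build();

Console.WriteLine(await kernel.InvokePromptAsync("Why is the sky blue?"));
```

Depending on the SK version, the endpoint may need to include or omit the `/v1` suffix; check the connector's release notes if requests 404.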