Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.

[QUERY] Using system messages and message history together with Cognitive Search extension #39443

Open johannesmols opened 1 year ago

johannesmols commented 1 year ago

Library name and version

Azure.AI.OpenAI 1.0.0-beta.8

Query/Question

I am attempting to build a service combining Azure OpenAI and Cognitive Search. I followed the example of using your own data, and I can extract my data this way just fine.

There is, however, this note at the top:

NOTE: The concurrent use of Chat Functions and Azure Chat Extensions on a single request is not yet supported. Supplying both will result in the Chat Functions information being ignored and the operation behaving as if only the Azure Chat Extensions were provided. To address this limitation, consider separating the evaluation of Chat Functions and Azure Chat Extensions across multiple requests in your solution design.

My system instructions and message history are ignored due to this. In the official Demo that uses the Python SDK, this seems to work fine. As far as I understand the note, only chat functions are supposed to be ignored, not all types of messages?

A few questions:

  1. When is it planned that this will be possible? Perhaps in the first non-beta release? If so, is there an ETA for that?
  2. What is the proper way of separating them, as the note suggests? I'm assuming I will have to query Cognitive Search myself and then add the most relevant results as a system message? If so, I will also have to call OpenAI myself to get an embedding for the search query to be able to perform vector search.
  3. Is it possible to modify the instructions given by the chat extension? If I want to, for example, have it cite sources in a different format.
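
Regarding question 2, a minimal sketch of the manual separation might look like the following. This is only an illustration, not an officially documented pattern: it assumes the Azure.Search.Documents package for the search call, a "content" field in the index, and plain keyword search (a vector search would additionally require an embeddings call, e.g. GetEmbeddingsAsync, to build the query vector).

```csharp
// Sketch: query Cognitive Search first, then pass the top hits to the model
// as grounding context in an ordinary system message (no chat extensions).
// Assumes Azure.Search.Documents and the usual usings; the "content" field
// name and the grounding prompt wording are placeholders for your own schema.
public async Task<ChatCompletions> GetCompletionsManually(
    string indexName, string prompt, IEnumerable<ChatMessage> messageHistory)
{
    var searchClient = new SearchClient(
        new Uri(_searchConfig.Value.Endpoint), indexName,
        new AzureKeyCredential(_searchConfig.Value.Key));

    // Plain keyword search over the top 5 documents.
    SearchResults<SearchDocument> results =
        await searchClient.SearchAsync<SearchDocument>(prompt, new SearchOptions { Size = 5 });

    var context = new StringBuilder();
    await foreach (SearchResult<SearchDocument> result in results.GetResultsAsync())
    {
        context.AppendLine(result.Document["content"]?.ToString());
    }

    var options = new ChatCompletionsOptions
    {
        Messages =
        {
            new ChatMessage(ChatRole.System, _assistantConfig.Value.SystemInstruction),
            new ChatMessage(ChatRole.System, $"Answer using only these sources:\n{context}")
        }
    };
    foreach (var message in messageHistory) options.Messages.Add(message);
    options.Messages.Add(new ChatMessage(ChatRole.User, prompt));

    var client = new OpenAIClient(_endpoint, _credential);
    Response<ChatCompletions> completions =
        await client.GetChatCompletionsAsync(_config.Value.ChatDeploymentName, options);
    return completions.Value;
}
```

Since no extensions are supplied, the system message and history are honored normally; the trade-off is that you own the retrieval quality and the citation formatting yourself (which also answers question 3 for this approach).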

For completeness, this is the code I'm using (although it is practically identical to the sample linked above):

public async Task<ChatCompletions> GetCompletions(string indexName, string prompt, IEnumerable<ChatMessage> messageHistory)
{
    var client = new OpenAIClient(_endpoint, _credential);
    var chatCompletionsOptions = new ChatCompletionsOptions
    {
        Messages =
        {
            new ChatMessage(ChatRole.System, _assistantConfig.Value.SystemInstruction)
        },
        AzureExtensionsOptions = new AzureChatExtensionsOptions
        {
            Extensions =
            {
                new AzureCognitiveSearchChatExtensionConfiguration
                {
                    SearchEndpoint = new Uri(_searchConfig.Value.Endpoint),
                    SearchKey = new AzureKeyCredential(_searchConfig.Value.Key),
                    IndexName = indexName
                }
            }
        }
    };

    foreach (var message in messageHistory)
    {
        chatCompletionsOptions.Messages.Add(message);
    }
    chatCompletionsOptions.Messages.Add(new ChatMessage(ChatRole.User, prompt));

    Response<ChatCompletions>? completions = await client.GetChatCompletionsAsync(_config.Value.ChatDeploymentName, chatCompletionsOptions);
    return completions.Value;
}

Environment

No response

sjwaight commented 1 year ago

Going to chime in and say that it appears that as soon as you add the AzureCognitiveSearchChatExtensionConfiguration, the resulting GPT interactions effectively ignore some or all of the system message and user interaction context.

Interestingly, Azure OpenAI Studio does not exhibit the same behaviour. Perhaps that is because it is tied to an older API version (2023-06-01-preview)?

jmemax commented 1 year ago

I too am facing this same issue. Once I enable AzureCognitiveSearchChatExtensionConfiguration the system message is ignored.

I am seeing the same behavior: in Azure OpenAI Studio the system message works with Cognitive Search, but once we attempt to code against it with the SDK, it's ignored.

Again, also reiterating that the Demo works for me as well, and I'm merely trying to reproduce what I've done with the Demo.

adamalfredsson commented 1 year ago

FWIW, I'm experiencing the same issue using the JavaScript SDK @azure/openai, though it seems to work when using the sample Python app. I notice the system message is passed as a roleInformation parameter to Azure Cognitive Search; maybe that could have an effect?

Also, I'm having this issue in the Azure OpenAI Studio as well, where the system message is ignored.

jmemax commented 1 year ago

I finally had a chance to circle back and look into this, and I have a workaround that appears to work. The issue is as @adamalfredsson described: the serialization of AzureCognitiveSearchChatExtensionConfiguration maps to dataSources; however, it does not include a RoleInformation property or mapping, which is included in the Python library that works.

Using this solution https://github.com/Azure/azure-sdk-for-net/issues/38966#issuecomment-1747916330 from @trrwilson, I was able to resolve this by building a custom parameter object that includes RoleInformation in the request:

AzureExtensionsOptions = new AzureChatExtensionsOptions
{
    Extensions = {
        new AzureChatExtensionConfiguration()
        {
            Type = "AzureCognitiveSearch",
            Parameters = BinaryData.FromObjectAsJson(new
            {
                QueryType = "semantic",
                SemanticConfiguration = "default",
                Endpoint = agent.CognitiveSearchEndpoint,
                IndexName = agent.CognitiveSearchIndexName,
                Key = new AzureKeyCredential(agent.CognitiveSearchKey).Key,
                RoleInformation = agent.SystemRoleDescription
            },
            new JsonSerializerOptions() { PropertyNamingPolicy = JsonNamingPolicy.CamelCase })
        }
    }
}

It then serializes with the needed roleInformation parameter.
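
To see why the naming policy matters here, the following stand-alone snippet (System.Text.Json only; the sample role text is just a placeholder) shows that serializing an anonymous parameter object with a camelCase policy yields the `roleInformation` key the service expects, which is the same conversion `BinaryData.FromObjectAsJson` performs:

```csharp
// Stand-alone check: a CamelCase naming policy turns the PascalCase
// property names into the camelCase keys the extensions API expects.
using System.Text.Json;

var parameters = new
{
    QueryType = "semantic",
    RoleInformation = "You are a helpful assistant." // placeholder role text
};

var options = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };
string json = JsonSerializer.Serialize(parameters, options);

Console.WriteLine(json);
// → {"queryType":"semantic","roleInformation":"You are a helpful assistant."}
```

Without the naming policy, the payload would carry `RoleInformation` in PascalCase, which the service would not recognize.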


sjwaight commented 1 year ago

> *(quoted @jmemax's workaround above: a custom AzureChatExtensionConfiguration whose serialized parameters include RoleInformation)*

Interesting. I did try this and it didn't seem to make much difference to the outcome, and it still limits you to 400 tokens for the system message. We ended up going down the route of a direct Semantic Search and then feeding the Captions to the model to summarise. Given more time we'd revisit.

tbajkacz commented 1 year ago

@sjwaight You can also try using the extensions REST API directly - that's the solution I've stuck with until the SDK is fixed.
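
For anyone taking that route, a rough HttpClient sketch of the direct REST call might look like this. It is only an illustration: resource names, deployment, keys, and field values in braces are placeholders, the payload shape mirrors the extensions API, and `PostAsJsonAsync` comes from the System.Net.Http.Json package.

```csharp
// Sketch: POST to the extensions chat completions endpoint directly,
// bypassing the SDK's serialization. All {BRACED} values are placeholders.
using var http = new HttpClient();
http.DefaultRequestHeaders.Add("api-key", "{KEY}"); // standard Azure OpenAI auth header

var payload = new
{
    messages = new object[]
    {
        new { role = "system", content = "{ROLE-INFORMATION}" },
        new { role = "user", content = "who are you?" }
    },
    dataSources = new object[]
    {
        new
        {
            type = "AzureCognitiveSearch",
            parameters = new
            {
                endpoint = "https://{COG-SEARCH}.search.windows.net",
                key = "{SEARCH-KEY}",
                indexName = "{INDEX-NAME}",
                roleInformation = "{ROLE-INFORMATION}"
            }
        }
    }
};

var response = await http.PostAsJsonAsync(
    "https://{RESOURCE}.openai.azure.com/openai/deployments/{DEPLOYMENT}" +
    "/extensions/chat/completions?api-version=2023-12-01-preview",
    payload);
string body = await response.Content.ReadAsStringAsync();
```

Inspecting the raw request/response this way makes it easy to confirm exactly which keys (e.g. roleInformation) actually reach the service.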

jmemax commented 12 months ago

Ok - found more on this. Thanks to @tbajkacz -

There is a 100-token limit on roleInformation when using Cognitive Search, according to the API documentation:

https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#completions-extensions

roleInformation Gives the model instructions about how it should behave and the context it should reference when generating a response. Corresponds to the "System Message" in Azure OpenAI Studio. See Using your data for more information. There’s a 100 token limit, which counts towards the overall token limit.

@sjwaight - So sure enough, once our system message expands past 100 tokens for other agents we were also running into the same issue...

Just to confirm - I pulled down the master branch on azure sdk - and what is getting serialized and sent off matches the full payload - so what is being set in BinaryData.FromObjectAsJson appears to be correct. Once over the token limit, it fails and reverts to the default Azure AI system role.

A call to the REST API directly (suggestion from @tbajkacz) also gives the same results, where it ignores the longer RoleInformation. (for 2023-06-01-preview, 2023-07-01-preview, and 2023-08-01-preview). Shorter roles do work.

As for the Azure chatbot sample in Python working: I confirmed the value stored for AZURE_OPENAI_SYSTEM_MESSAGE matches what I am sending for a longer 100+ token role. I re-created the call to match the Python one, but still ran into the same issue... My guess is that there is a silent limit on the Application Settings environment variable when retrieved, or in os.environ.get itself, which may be why the Python script appears to be working correctly but really isn't.

It would be nice if that token limit on the API could be increased.

jmemax commented 10 months ago

Reporting back that the API appears to be functioning as expected now, so they must have put in a fix?

We're no longer losing Role Information on tokens over 400, and Chat History is being accepted.

The Beta 12 NuGet no longer needs the custom parameters either; AzureCognitiveSearchChatExtensionConfiguration is sending over the correct values.

guillaumejay commented 10 months ago

@jmemax I'm using the Beta 12 NuGet, and still getting the same issue :(

jmemax commented 10 months ago

Hmm, darn - sorry to hear that! I just re-ran my Postman tests and they are still working. I have a suite of tests I wrote that checks token length (varying sizes, 100 -> 1600), message history, etc. I've been running it with each new release, and it failed previously. Now they pass... Each one asks "Who are you?", and if it responds with the base OpenAI message, it fails.

Maybe try calling the API directly - and see if you get the same results.

https://{######}.openai.azure.com/openai/deployments/{######}/extensions/chat/completions?api-version=2023-12-01-preview

{
    "model": "{MODEL-NAME}",
    "messages": [
        {
             "role": "system",
             "content": "{ROLE-NAME}"
        },
        {
            "role": "user",
            "content": "who are you?"
        }
    ],
    "temperature": 0,
    "top_p": 0.5,
    "presence_penalty": 0,
    "frequency_penalty": 1,
    "stream": false,
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": "https://{COG-SEARCH-URL}.search.windows.net",
                "key": "{KEY}",
                "indexName": "{INDEX-NAME}",
                "fieldsMapping": {
                    "contentFields": ["content"],
                    "titleField": "",
                    "urlField": "",
                    "filepathField": "",
                    "vectorFields": []
                },
                "inScope": true,
                "topNDocuments": 5,
                "queryType": "semantic",
                "semanticConfiguration": "default",
                "roleInformation": "{ROLE-NAME}",
                "strictness": 3
            }
        }
    ]
}

On my message history check, I've got about 15 questions and answers - user/assistant - and I ask - "What was the first question I asked about?"