Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.

[BUG] Missing search scores on citations when using Azure Search with ChatCompletions and AzureCognitiveSearchChatExtensionConfiguration #42135

Open grankko opened 5 months ago

grankko commented 5 months ago

Library name and version

Azure.AI.OpenAI 1.0.0-beta.13

Describe the bug

We're working on a chat bot that uses a GPT model in an Azure OpenAI service with Retrieval Augmented Generation to access internal domain data residing in an Azure Search index. We're using vector search with the ada-002 embedding model.

We use the AzureCognitiveSearchChatExtensionConfiguration object on ChatCompletionsOptions to enable the LLM to access our Azure Search instance.

When we get a chat completion response, it lacks search scores for the individual citations. Your documented example shows that search scores are included when using the REST endpoint.

In our scenario, even when we get Citations in the response, the metadata object is missing and we can't see any search scores. We're trying to understand which citation is most relevant to improve our UI.

Expected behavior

Citation metadata object populated with search scores, as in the Microsoft documentation: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#example-response-3

Actual behavior

No metadata returned with the citations.

Reproduction Steps

  1. Set up an Azure Search index and populate its text and vector fields (using ada-002 embeddings).
  2. Set up an Azure OpenAI chat solution using AzureCognitiveSearchChatExtensionConfiguration to search the Azure Search index.
  3. Trigger a chat completion that gets a citation and see that the metadata is not populated in the response from the Azure OpenAI service (missing search scores for citations).

Environment

.NET 8.0, Windows 11

jsquire commented 5 months ago

Thank you for your feedback. Tagging and routing to the team best able to assist.

//cc: @trrwilson

grankko commented 4 months ago

Looks like this was not addressed in the beta.14 version of the package, where the Citations are now a typed object instead of a JSON blob.

@jsquire - any indication if the search score will be included in that object or not would be appreciated.

jsquire commented 4 months ago

@grankko: Aside from triaging to the correct owners, I have no insight into the request nor its current state. You'd want to ask @trrwilson and team.

trrwilson commented 4 months ago

Thank you for reporting this! The most recent Azure OpenAI service versions introduced structured representation of On Your Data citation information (vs. an opaque JSON document in "content"), but it appears that metadata have been missed in the strongly-typed schematization here: https://github.com/Azure/azure-rest-api-specs/blob/d4168119adedb41c10321a12cad0d0ba37a77cfe/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-02-15-preview/inference.json#L4593

This SDK reflects that potential omission. I'm following up with the service team to check on this; assuming it's missing from the spec, we'll correct that and then propagate the update to all impacted libraries (including this one).

trrwilson commented 4 months ago

I talked with the On Your Data feature team and was informed that this 'metadata' field was previously an unintentional inclusion in the response payloads -- one that's since been removed on contemporary deployments and thus can't have any representation (structured or otherwise) in API/SDK surfaces.

It's understood that these scores can be important and there's active discussion about more formally (and with clearer detail) exposing the scores in the structure of the citation data. No ETA is yet available for that, but it's scheduled for an internal review next week.

grankko commented 4 months ago

Thank you for looking into this, we really appreciate it.

When we started our initiative, we first experimented with building the search feature as a hand-rolled ChatFunction. This gave us more control (like being able to access metadata such as search scores). We've since moved over to the OnYourData approach, which is more of a black box, but it's also nice not having to build and maintain that part ourselves.
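
For anyone who needs the scores in the meantime, here is a minimal sketch of the hand-rolled idea: query the search index directly and read the `@search.score` value that the Azure Search REST API returns with each hit. It's written in Python purely for illustration and does not use Azure.AI.OpenAI. The sample payload, the `id` and `title` fields, and the `rank_by_search_score` helper are all made up for this example, not taken from our actual index or code.

```python
import json

# Hypothetical sample shaped like an Azure Search REST query response.
# The "id" and "title" fields are illustrative; real index schemas differ.
SAMPLE_RESPONSE = """
{
  "value": [
    {"@search.score": 0.81, "id": "doc-2", "title": "Onboarding guide"},
    {"@search.score": 0.93, "id": "doc-7", "title": "Travel policy"},
    {"@search.score": 0.42, "id": "doc-5", "title": "Expense FAQ"}
  ]
}
"""


def rank_by_search_score(response_text):
    """Return (score, document) pairs sorted with the highest score first."""
    hits = json.loads(response_text).get("value", [])
    # Treat a missing score as 0.0 so such hits sort last instead of raising.
    scored = [(hit.get("@search.score", 0.0), hit) for hit in hits]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranked = rank_by_search_score(SAMPLE_RESPONSE)
for score, doc in ranked:
    print(f"{score:.2f}  {doc['id']}  {doc['title']}")
```

With something like this in a custom function, the UI can order or highlight citations by score, at the cost of owning the retrieval step yourself.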