Closed cgjedrem closed 8 months ago
@dmytrostruk, @awharrison-28 started working on a PR for this on the Python side. I'll share with you on the side so you can reference it.
Can you share how you fixed this problem? I ran into the same problem.
Hi @matthewbolanos
I'm trying to use Azure OpenAI Text Embedding Generations with Azure AI Search Memory Store as described in the example here: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/KernelSyntaxExamples/Example14_SemanticMemory.cs
I'm using version 1.0.1 of SemanticKernel and version 1.0.1-alpha of Microsoft.SemanticKernel.Connectors.AzureAISearch.
Since I use Azure AI Text Embedding, I changed line 35 from .WithOpenAITextEmbeddingGeneration("text-embedding-ada-002", TestConfiguration.OpenAI.ApiKey)
to .WithAzureOpenAITextEmbeddingGeneration(myDeployment, myEndpoint, myApiKey, myModelId)
The code runs fine until I loop over the results, which throws this exception: Azure.RequestFailedException: "Unknown field 'Embedding' in vector field list."
I tried digging into the code by examining connectors.UnitTests.Memory and found that no unit tests exist for AzureAI.
I really want to use Azure AI Text Embedding and I would also like to contribute to the codebase, but how do I get started with debugging the functionality? I am not sure how to add a test for AzureAISearch to get me going.
I have the same issue, with exception
Azure.RequestFailedException Unknown field 'Embedding' in vector field list. Status: 400 (Bad Request) ErrorCode: InvalidRequestParameter
Content: {"error":{"code":"InvalidRequestParameter","message":"Unknown field 'Embedding' in vector field list.","details":[{"code":"UnknownField","message":"Unknown field 'Embedding' in vector field list."}]}}
Headers: Cache-Control: no-cache,no-store Pragma: no-cache
private async Task SearchMemoryAsync(ISemanticTextMemory memory, string query)
{
    Console.WriteLine("\nQuery: " + query + "\n");
    var memoryResults = memory.SearchAsync("resume-index-ai", query, limit: 2, minRelevanceScore: 0.5);
    int i = 0;
    await foreach (MemoryQueryResult memoryResult in memoryResults)
    {
        //Console.WriteLine($"Result {++i}:");
        //Console.WriteLine(" URL: : " + memoryResult.Metadata.Id);
        //Console.WriteLine(" Title : " + memoryResult.Metadata.Description);
        //Console.WriteLine(" Relevance: " + memoryResult.Relevance);
        Console.WriteLine();
    }
    Console.WriteLine("----------------------");
}
Within my index I have a field named contentVector (type SingleCollection) instead of 'Embedding'.
How do I specify this field?
Bump
I'm also having this problem.
Thanks for reporting this issue, I will work on it immediately and will let you know as soon as it's fixed.
Hi all! The error "Unknown field 'Embedding' in vector field list." occurs because the Azure AI Search connector is implemented with a predefined schema. It works when you use this connector to ingest the data first (it creates an index with the SK predefined schema) and then read the data using the same schema.
However, it does not cover the case where the index was created some other way (e.g. from the Azure portal), because the schema may be different. After further investigation and team reviews, it turned out to be a complex problem that needs to be fixed not only for Azure AI Search but for other connectors as well, and it should be fixed at the abstraction level.
We are going to fix this as part of a major refactoring of the memory connectors. Meanwhile, we prepared an example of how you can use Azure AI Search with SK today by importing Azure AI Search functionality as a plugin: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/KernelSyntaxExamples/Example84_AzureAISearchPlugin.cs
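To illustrate why the field name matters: a vector query names the index field it searches, so a query that names 'Embedding' fails against an index whose vector field is contentVector. The sketch below is not the plugin from the linked example and not Semantic Kernel code. It is a hypothetical helper (VectorQueryBody is an invented name) that builds a search request body with a configurable field name, assuming the "vectorQueries" shape of the Azure AI Search 2023-11-01 REST API; check the API version you target before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of Semantic Kernel or the Azure SDK.
public static class VectorQueryBody
{
    // Builds a JSON body for POST {endpoint}/indexes/{index}/docs/search,
    // assuming the 2023-11-01 REST API request shape ("vectorQueries").
    public static string Build(IReadOnlyList<float> embedding, string vectorField, int k)
    {
        var body = new Dictionary<string, object>
        {
            ["vectorQueries"] = new object[]
            {
                new Dictionary<string, object>
                {
                    ["kind"] = "vector",
                    ["vector"] = embedding,
                    // The field must exist in the index schema,
                    // e.g. "contentVector" rather than "Embedding".
                    ["fields"] = vectorField,
                    ["k"] = k,
                },
            },
        };
        return JsonSerializer.Serialize(body);
    }
}
```

Calling VectorQueryBody.Build(embedding, "contentVector", 2) would target the portal-created field, while the connector's predefined schema corresponds to passing "Embedding".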
I am also running into the same problem and hope we will get the fix soon.
This is a viable workaround to get past the issue. Thanks @dmytrostruk
Hi all! Can you please elaborate how to use this plugin as a solution? Should I import this plugin to the app or must I adapt the example you showed to the code? Complete noob here so any hint would be much appreciated.
Should I import this plugin to the app or must I adapt the example you showed to the code?
@nunomsr If you configured your index in Azure AI Search and have a similar problem with the predefined schema and the Embedding field, it's better to use the code provided in the example above; it lets you use a custom schema and bypass the current limitations. Let me know if any further assistance is needed. Thanks!
@dmytrostruk the example link that you shared above (https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/KernelSyntaxExamples/Example84_AzureAISearchPlugin.cs) is not functional anymore. Can you please provide a reference or an example for this issue? Thanks
Here is a new link: https://github.com/microsoft/semantic-kernel/blob/c545c7d774176d11964c81e173776232a2ae2f20/dotnet/samples/Concepts/Search/MyAzureAISearchPlugin.cs
@dmytrostruk any update on the feature enhancement needed to allow specifying a matching schema to what Azure AI Search natively uses when ingesting and embedding data?
I just ran into this issue and had to debug to come to the same conclusion as previous posters: SK creates an Embedding field, while AI Search creates contentVector. Based on this I assume that your preferred use of SK is to do the file loading, chunking and embedding from SK itself. Is this method comparable in quality to using AI Search chunking / embedding?
any update on the feature enhancement needed to allow specifying a matching schema to what Azure AI Search natively uses when ingesting and embedding data?
@snympi We are working on a new design for vector abstractions that will allow you to use any schema, including the one that Azure AI Search natively uses when ingesting and embedding data. You can find the new Azure AI Search implementation in our feature branch by following the link.
Based on this I assume that your preferred use of SK is to do the file loading, chunking and embedding from SK itself.
With the new design it will be possible to do file loading/chunking/embedding from SK, or to use an already existing Azure AI Search index to query data only.
Is this method comparable in quality to using AI Search chunking / embedding?
Integrated chunking in Azure AI Search allows you to chunk your documents by specific rules (e.g. pages or sentences), and the same applies to the SK version of TextChunker. I'm not sure there are big differences in terms of quality. As for embeddings, you can choose the AI model you would like to use both in Azure AI Search and from code using Semantic Kernel, and both approaches should produce the same outcome.
I think that doing chunking and embedding from code should give you more flexibility and control. In code you can always provide your own chunking logic based on the nature of your documents, or use a local/custom AI model for embedding generation if needed.
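As a rough illustration of what custom chunking logic in code can look like, here is a minimal sketch that packs blank-line-separated paragraphs into chunks under a character limit. ParagraphChunker is a hypothetical name; this is not Semantic Kernel's TextChunker, and a real implementation would more likely count tokens than characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; this is not Semantic Kernel's TextChunker.
public static class ParagraphChunker
{
    // Packs paragraphs (separated by a blank line, "\n" line endings assumed)
    // into chunks of at most maxChars characters. A single paragraph longer
    // than maxChars is kept whole as its own chunk.
    public static List<string> Chunk(string text, int maxChars)
    {
        var chunks = new List<string>();
        var current = new StringBuilder();
        var paragraphs = text.Split(new[] { "\n\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var raw in paragraphs)
        {
            var paragraph = raw.Trim();
            if (paragraph.Length == 0) continue;

            // Flush the current chunk if adding this paragraph would exceed the limit.
            if (current.Length > 0 && current.Length + 1 + paragraph.Length > maxChars)
            {
                chunks.Add(current.ToString());
                current.Clear();
            }

            if (current.Length > 0) current.Append('\n');
            current.Append(paragraph);
        }

        if (current.Length > 0) chunks.Add(current.ToString());
        return chunks;
    }
}
```

Each resulting chunk could then be passed to whichever embedding model you choose before being stored in the index.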
Describe the bug
When connecting to Azure Cognitive Search and searching a collection, the name of the vector embedding field cannot be configured. In Azure Cognitive Search the embedding field name is "contentVector", while in Semantic Kernel the embedding vector name is set to
public const string EmbeddingField = "Embedding";
from namespace Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch; see AzureCognitiveSearchMemoryStore, line 167, GetNearestMatches.
My code
which on all returns an error
Expected behavior
I expect to be able to set the vector field name in the configuration of the memory store so that the search works as expected.
Screenshots
Platform