microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: Bug: AzureCosmosDBNoSQLMemoryStore failed to UpsertItemAsync #8348

Open nikkla opened 2 weeks ago

nikkla commented 2 weeks ago

Describe the bug

AzureCosmosDBNoSQLMemoryStore cannot save information to Azure Cosmos DB for NoSQL because the Cosmos DB method UpsertItemAsync() fails with the following message:

"One of the specified inputs is invalid"
Request URI: /apps/fd29713b-7ba0-4581-97e1-fcfd27fff740/services/25f80866-fd95-4d2d-83e1-6f1105dd2cbb/partitions/03101c5a-976d-4bcf-a84e-95c0c7d8d279/replicas/133689900035225751p/, RequestStats: Microsoft.Azure.Cosmos.Tracing.TraceData.ClientSideRequestStatisticsTraceDatum, SDK: Windows/10.0.22631 cosmos-netstandard-sdk/3.34.4

    at Microsoft.Azure.Documents.StoreResult.ToResponse(RequestChargeTracker requestChargeTracker)
    at Microsoft.Azure.Documents.ConsistencyWriter.WritePrivateAsync(DocumentServiceRequest request, TimeoutHelper timeout, Boolean forceRefresh)
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync[TParam,TPolicy](Func`1 callbackMethod, Func`3 callbackMethodWithParam, Func`2 callbackMethodWithPolicy, TParam param, IRetryPolicy retryPolicy, IRetryPolicy`1 retryPolicyWithArg, Func`1 inBackoffAlternateCallbackMethod, Func`2 inBackoffAlternateCallbackMethodWithPolicy, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
    at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync[TParam,TPolicy](Func`1 callbackMethod, Func`3 callbackMethodWithParam, Func`2 callbackMethodWithPolicy, TParam param, IRetryPolicy retryPolicy, IRetryPolicy`1 retryPolicyWithArg, Func`1 inBackoffAlternateCallbackMethod, Func`2 inBackoffAlternateCallbackMethodWithPolicy, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync[TParam,TPolicy](Func`1 callbackMethod, Func`3 callbackMethodWithParam, Func`2 callbackMethodWithPolicy, TParam param, IRetryPolicy retryPolicy, IRetryPolicy`1 retryPolicyWithArg, Func`1 inBackoffAlternateCallbackMethod, Func`2 inBackoffAlternateCallbackMethodWithPolicy, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
    at Microsoft.Azure.Documents.ConsistencyWriter.WriteAsync(DocumentServiceRequest entity, TimeoutHelper timeout, Boolean forceRefresh, CancellationToken cancellationToken)
    at Microsoft.Azure.Documents.ReplicatedResourceClient.<>c__DisplayClass32_0.<<InvokeAsync>b__0>d.MoveNext()
    --- End of stack trace from previous location ---
    at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)
    at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
    at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)
    at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)
    at Microsoft.Azure.Documents.StoreClient.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken, IRetryPolicy retryPolicy)
    at Microsoft.Azure.Cosmos.Handlers.TransportHandler.ProcessMessageAsync(RequestMessage request, CancellationToken cancellationToken)
    at Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)

To Reproduce

Steps to reproduce the behavior:

  1. Set up the MemoryStore:

    // Cosmos DB client built from the connection string in the environment.
    var cosmosClient = new CosmosClient(Environment.GetEnvironmentVariable("DocumentEndpointConnectionString"));

    // Vector embedding definition for the "/embedding" path.
    var embedding = new Embedding
    {
        DataType = VectorDataType.Float32,
        DistanceFunction = DistanceFunction.Cosine,
        Path = "/embedding",
        Dimensions = 1536,
    };

    // Indexing policy with a quantized-flat vector index on the same path.
    var indexPolicy = new IndexingPolicy { VectorIndexes = new() { new() { Path = embedding.Path, Type = VectorIndexType.QuantizedFlat } } };

    new MemoryBuilder()
        .WithOpenAITextEmbeddingGeneration("text-embedding-3-small", Environment.GetEnvironmentVariable("openAiKey")!)
        .WithMemoryStore(new AzureCosmosDBNoSQLMemoryStore(cosmosClient, "memoryDb", new VectorEmbeddingPolicy([embedding]), indexPolicy))
        .Build();


  2. Store a new item in memory:

```csharp
private readonly ISemanticTextMemory semanticTextMemory;

// ...
var content = "Some awesome content";
var id = await semanticTextMemory.SaveInformationAsync(CollectionName, content, Guid.NewGuid().ToString());
```

Expected behavior

The Semantic Text Memory stores the information in Azure Cosmos DB for NoSQL correctly, without errors.

Screenshots

[Image: CosmosVectorIndex]

Platform

Additional context

Semantic Kernel correctly creates the container and also adds the vector index. The error only occurs when storing the information.

dmytrostruk commented 2 weeks ago

Hi @nikkla! While we investigate this issue, I would recommend trying the newly updated NoSQL connector, which lets you work with your own custom schema. You can try it now by downloading SK v1.17.2: https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.AzureCosmosDBNoSQL/1.17.2-alpha

The class for working with collections is AzureCosmosDBNoSQLVectorStoreRecordCollection: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Connectors/Connectors.Memory.AzureCosmosDBNoSQL/AzureCosmosDBNoSQLVectorStoreRecordCollection.cs

It would be great to receive your feedback, thanks!

nikkla commented 2 weeks ago

Hi @dmytrostruk, thank you for your answer. As I mentioned in the bug description, I'm currently using the newest version of the AzureCosmosDBNoSQL connector, which is 1.18.0-alpha according to NuGet: https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.AzureCosmosDBNoSQL

I'm not using a custom schema; the collection was created by Semantic Kernel. I only added the vector index through Semantic Kernel, based on the documentation I could find. Judging by the call stack, the error does not seem to be in the vector index, because it is the upsert that failed. Maybe Semantic Kernel is not passing the correct arguments.