microsoft / kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
https://microsoft.github.io/kernel-memory
MIT License

[Bug] "contentType" is not populated by WriteFileAsync in MongoDbAtlasStorage #471

Closed. pradeepr-roboticist closed this issue 4 months ago

pradeepr-roboticist commented 4 months ago

Context / Scenario

I am trying to use MongoDB to store chunks and files. I am using a MongoDB container.

What happened?

For testing purposes, I was able to upload a file using the following curl command.

./upload-file.sh -s http://127.0.0.1:9001 -f ../examples/001-dotnet-WebClient/file2-Wikipedia-Moon.txt -p me -t "type:notes" -t "type:test" -i "bash test"

That resulted in a document being written to MongoDB.

[Screenshot from 2024-05-09 17-04-17]

As you can see, the contentType field is not included.

Though the upload-file script completed successfully, the kernel-memory container had the following error.

warn: Microsoft.KernelMemory.Orchestration.RabbitMQ.RabbitMQPipeline[0]
      Message '(null)' processing failed with exception, putting message back in the queue
      System.Collections.Generic.KeyNotFoundException: Element 'contentType' not found.
         at Microsoft.KernelMemory.MongoDbAtlas.MongoDbAtlasStorage.ReadFileAsync(String index, String documentId, String fileName, Boolean logErrIfNotFound, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.Pipeline.BaseOrchestrator.ReadPipelineStatusAsync(String index, String documentId, CancellationToken cancellationToken) in /src/service/Core/Pipeline/BaseOrchestrator.cs:line 156
         at Microsoft.KernelMemory.Pipeline.DistributedPipelineOrchestrator.<>c__DisplayClass5_0.<<AddHandlerAsync>b__0>d.MoveNext() in /src/service/Core/Pipeline/DistributedPipelineOrchestrator.cs:line 108
      --- End of stack trace from previous location ---
         at Microsoft.KernelMemory.Orchestration.RabbitMQ.RabbitMQPipeline.<>c__DisplayClass8_0.<<OnDequeue>b__0>d.MoveNext() in /src/extensions/RabbitMQ/RabbitMQPipeline.cs:line 115

I was expecting the kernel-memory container to go ahead and make embedding API requests, but it stalled on the above error.
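
To make the failure mode concrete, here is a minimal sketch of a write path that omits a field which the read path then requires. It is not the Kernel Memory or MongoDB driver code: the in-memory `store` dict, all function names, the `status.json` file name, and the default content types are hypothetical stand-ins chosen for illustration. It only shows why a missing `contentType` key makes a strict read throw, and two possible ways to avoid that.

```python
# Hypothetical in-memory stand-in for a document collection.
# This is NOT the MongoDB driver; it only mimics key lookup behaviour.
store = {}


def write_file_buggy(doc_id, file_name, content):
    """Mirror the reported symptom: the stored document has no 'contentType'."""
    store[(doc_id, file_name)] = {"content": content}


def write_file_fixed(doc_id, file_name, content, content_type="text/plain"):
    """Always store 'contentType' next to the content (assumed default)."""
    store[(doc_id, file_name)] = {
        "content": content,
        "contentType": content_type,
    }


def read_file_strict(doc_id, file_name):
    """Index the field directly; raises KeyError when it is missing."""
    doc = store[(doc_id, file_name)]
    return doc["content"], doc["contentType"]


def read_file_tolerant(doc_id, file_name,
                       default_type="application/octet-stream"):
    """Fall back to an assumed default when the field is missing."""
    doc = store[(doc_id, file_name)]
    return doc["content"], doc.get("contentType", default_type)
```

In this sketch, either `write_file_fixed` or `read_file_tolerant` alone avoids the exception. Populating the field on write matches the behavior the issue title asks for, while a tolerant read would also cover documents that were already stored without it.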

Importance

I cannot use Kernel Memory

Platform, Language, Versions

No programming languages. I was just using the Docker container and hitting it with curl.

Kernel Memory tag: packages-0.50.240504.7

Platform: Ubuntu 22.04

Relevant log output

No response

dluc commented 4 months ago

Thanks for submitting the fix! 👍