microsoft / kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
https://microsoft.github.io/kernel-memory
MIT License
1.52k stars, 293 forks

[Question] Is there any way to get the list of documents? #478

Closed: cthlo closed this issue 4 months ago

cthlo commented 4 months ago

Context / Scenario

I have a database of documents that I would like to "rsync" with KernelMemory on a schedule. But in order to delete a document that is no longer in the source database, I'd love to be able to check the list of imported documents in the content storage.

Question

Is there a plan to support listing documents?

dluc commented 4 months ago

There's currently no plan. We're trying to limit the scope of storage operations to avoid turning KM into a file manager or a DB abstraction. Any new feature added to one storage backend (e.g. Azure Blob) would have to be implemented in all the storage extensions for consistency, which is quite expensive.

That said, the suggestion is to use your storage provider's own SDK to access the storage directly and run custom operations. For instance, if you're using KM with Azure Blobs, you can use the Azure Storage SDK to browse the storage account and list folders and files.
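
As a rough illustration of that suggestion, here is a minimal Python sketch that lists blobs with the `azure-storage-blob` package and derives document IDs from the blob paths. The path layout `<index>/<document_id>/<file>` is an assumption made for this example, not something confirmed in this thread, so inspect your own container before relying on it. The function names, the `index` parameter, and the connection details are likewise hypothetical.

```python
from typing import Iterable, Set


def document_ids_from_blob_names(blob_names: Iterable[str], index: str) -> Set[str]:
    """Collect document IDs from blob paths shaped like '<index>/<document_id>/<file>'.

    The path layout is an assumption for illustration; verify it in your container.
    """
    prefix = index.rstrip("/") + "/"
    ids = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        parts = name[len(prefix):].split("/")
        # Require a non-empty document folder with at least one entry beneath it.
        if len(parts) >= 2 and parts[0]:
            ids.add(parts[0])
    return ids


def list_document_ids(connection_string: str, container: str, index: str) -> Set[str]:
    """List document IDs by enumerating blobs under the index prefix."""
    # Imported lazily so the pure helper above works without the Azure SDK installed.
    from azure.storage.blob import ContainerClient

    client = ContainerClient.from_connection_string(
        connection_string, container_name=container
    )
    blobs = client.list_blobs(name_starts_with=index.rstrip("/") + "/")
    return document_ids_from_blob_names((blob.name for blob in blobs), index)
```

Keeping the path parsing separate from the SDK call means only one small function needs to change if the storage layout turns out to be different.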

cthlo commented 4 months ago

Thanks, @dluc. I assume that approach depends on how KM lays out the content storage, but it should work for me for now. Thank you!
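
For the scheduled "rsync" scenario, the reconciliation step could look roughly like the sketch below. It assumes you already have the set of document IDs in the source database and the set of IDs found in the content storage. `remove_stale_documents` and the `delete_document` callback are hypothetical names; the callback stands in for whichever delete operation your KM client provides.

```python
from typing import Callable, Iterable, List


def remove_stale_documents(
    source_ids: Iterable[str],
    imported_ids: Iterable[str],
    delete_document: Callable[[str], None],
) -> List[str]:
    """Delete imported documents that no longer exist in the source database.

    Returns the IDs that were passed to delete_document, in sorted order.
    """
    # Anything imported but absent from the source is considered stale.
    stale = sorted(set(imported_ids) - set(source_ids))
    for doc_id in stale:
        delete_document(doc_id)
    return stale
```

Because the listing relies on the storage layout, it may be worth logging the stale IDs and reviewing them before wiring in a real delete call.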