DeabLabs / cannoli

Cannoli allows you to build and run no-code LLM scripts using the Obsidian Canvas editor.
MIT License
326 stars · 22 forks

[FR] RAG and Ollama embedding model #30

Open wwjCMP opened 7 months ago

wwjCMP commented 7 months ago

Using an Ollama embedding model to implement RAG in Obsidian plugins has become quite common. I wonder if this plugin will be extended in that direction next.

cephalization commented 7 months ago

I am not familiar with this, do you have any references?

wwjCMP commented 6 months ago

I hope the following content is helpful:

- https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings
- https://github.com/brianpetro/obsidian-smart-connections/issues/559#issuecomment-2088514981
- https://github.com/logancyang/obsidian-copilot/blob/master/local_copilot.md
- https://github.com/your-papa/obsidian-Smart2Brain
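For reference, the Ollama embeddings endpoint linked above takes a model name and a prompt and returns a single vector. A minimal sketch in Python (the plugin itself is TypeScript, so this is purely illustrative; the default host is Ollama's standard `localhost:11434`, and `nomic-embed-text` is just one example of an embedding model you might have pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama host; adjust as needed


def build_embedding_request(text: str, model: str = "nomic-embed-text"):
    """Build the URL and JSON payload for Ollama's embeddings endpoint."""
    url = f"{OLLAMA_URL}/api/embeddings"
    payload = {"model": model, "prompt": text}
    return url, payload


def get_embedding(text: str, model: str = "nomic-embed-text"):
    """POST a note's text to a locally running Ollama server, return its vector."""
    url, payload = build_embedding_request(text, model)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Response body is JSON of the form {"embedding": [0.1, -0.2, ...]}
        return json.loads(resp.read())["embedding"]
```

Each note (or chunk of a note) would be run through `get_embedding` once and the resulting vectors stored for similarity search.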

blindmansion commented 6 months ago

Yep, embeddings are something we've been thinking about implementing soon as well. We may even piggyback off of the Smart Connections embeddings, since that plugin lets other plugins use the embeddings it creates for a vault.

Still thinking about how to implement them in a way that makes sense for cannoli.

oyajiru commented 1 month ago

I'm not a developer, but I love using Cannoli for AI-powered workflows in Obsidian, and I am very interested in this to enhance drafting research papers and fiction.

Would it be possible to integrate a lightweight vector database like Milvus Lite into Cannoli to create on-the-fly Retrieval-Augmented Generation (RAG) databases for each workflow? This could allow users to bypass token limits and use larger language models on consumer hardware by limiting their token contexts to fit within system memory. I believe this could also help limit hallucinations.

The idea is:

  1. On-the-fly RAG databases using Milvus Lite: as described above, each workflow would get its own lightweight vector database, so only the retrieved context (rather than the full source material) has to fit in the model's token window, even on consumer hardware with smaller contexts.

  2. Advanced data management features:
     a. Selective data removal: implement a feature to remove specific obsolete information, possibly by "reversing the arrows" from the knowledge input node to the node representing the database.
     b. Information editing: allow users to edit existing information by adding it to the reference node, updating the vector representations whenever the content of a knowledge node changes.
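To make the retrieval step in point 1 concrete, here is a minimal in-memory stand-in for a per-workflow vector store (an actual implementation would use Milvus Lite via the `pymilvus` client; the class and method names here are hypothetical, and cosine similarity is one common choice of ranking metric):

```python
import math


class TinyVectorStore:
    """In-memory stand-in for a per-workflow vector DB such as Milvus Lite."""

    def __init__(self):
        self.rows = {}  # id -> (vector, text)

    def insert(self, row_id, vector, text):
        self.rows[row_id] = (vector, text)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query_vec, limit=3):
        """Return the `limit` most similar stored texts, best match first."""
        scored = sorted(
            self.rows.items(),
            key=lambda kv: self._cosine(query_vec, kv[1][0]),
            reverse=True,
        )
        return [(rid, text) for rid, (vec, text) in scored[:limit]]
```

In a workflow, each knowledge node's text would be embedded and inserted, and at generation time only the top-`limit` matches for the query embedding would be pasted into the prompt, which is what keeps the context small:

```python
store = TinyVectorStore()
store.insert("n1", [1.0, 0.0], "Chapter 1 draft")
store.insert("n2", [0.0, 1.0], "Bibliography")
store.search([0.9, 0.1], limit=1)  # -> [("n1", "Chapter 1 draft")]
```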

Implementation considerations:

  1. I believe Milvus Lite supports complex delete expressions, which could be leveraged for selective data removal.
  2. The Python client for Milvus allows for data insertion, updating, and deletion, which could be used to implement the editing feature.
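The two operations above map onto simple store primitives. A self-contained sketch of the semantics (the dict-based store and helper names are hypothetical; they mirror the delete-by-expression and upsert verbs that the Milvus Python client exposes):

```python
# Hypothetical per-workflow store: id -> {"vector": ..., "text": ..., "source": ...}
store = {
    "a": {"vector": [0.1, 0.9], "text": "old paragraph", "source": "draft.md"},
    "b": {"vector": [0.8, 0.2], "text": "kept paragraph", "source": "notes.md"},
}


def delete_where(store, predicate):
    """Selective removal: drop every row matching the predicate
    (analogous to a Milvus delete with a filter expression)."""
    doomed = [rid for rid, row in store.items() if predicate(row)]
    for rid in doomed:
        del store[rid]
    return doomed


def upsert(store, row_id, vector, text, source):
    """Information editing: re-embed and overwrite a row when the
    content of its knowledge node changes."""
    store[row_id] = {"vector": vector, "text": text, "source": source}


# "Reversing the arrows": purge everything that came from draft.md
delete_where(store, lambda row: row["source"] == "draft.md")

# Editing: knowledge node "b" changed, so re-embed its new text and upsert
upsert(store, "b", [0.7, 0.3], "edited paragraph", "notes.md")
```

The key design point is that every row keeps a reference back to its source node, so a change on the canvas can be translated into a targeted delete or upsert instead of rebuilding the whole database.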

What are your thoughts on this? Would this be feasible to implement, and do you see any potential challenges or alternative approaches?

Thank you for considering these suggestions and for your fantastic work on Cannoli!