s-kostyaev / ellama

Ellama is a tool for interacting with large language models from Emacs.
GNU General Public License v3.0

Question: How to RAG org-roam? #91

Closed by AlexRoosWork 3 months ago

AlexRoosWork commented 3 months ago

Hey there, stumbled upon this repo through the official ollama github and really love it!

One of my use cases would be to have a directory (and all its subdirectories) as a given context for my local LLM. In other words: RAG the ORG! Using org-roam and all the good stuff with Zettelkasten, one of my biggest issues is actually finding information in the many notes I so diligently create.

Has anybody looked into how this can be done? I've only done research so far and have not tried setting up LangChain or whatever you need for that. So maybe someone here knows something and would like to share a short guide?

Basic use-cases would be:

  1. Ask ellama about something in my org-roam files. Like "Remind me, what did I do on this day last year?" (silly example, but something like this...)
  2. Have ellama suggest changes to the org-roam files. Like adding missing back links, trimming down a node or removing stuff from a node that does not belong there (I have a daily node, where I also write down insights, would be nice to get an insight about Linux automatically refiled to the Linux.org file - or at least get notified about these and then go through them myself).

Anyway, I can't be the only one thinking about these amazing use-cases, so if you know something, say something!

(I saw the ellama-context-add-info-node in the docs, but have not tried it so far, maybe this is already it?)

s-kostyaev commented 3 months ago

Hi @AlexRoosWork, thanks for the kind words.

This is planned for https://github.com/s-kostyaev/elisa (not all of the points you mentioned, but RAG for a user knowledge base). You can look at the elisa code if you want to try implementing it yourself.

AlexRoosWork commented 3 months ago

@s-kostyaev alright! thanks for the link! :smile: I'll put it on my list to check out when I have the time for it. That is really what I love about ellama, that you can get started in like 10 minutes.

oatmealm commented 3 months ago

I was also looking into that, in fact... I understand that it's not so trivial when dealing with longer pieces of text, at least.

For example, this solution: https://github.com/embedchain/embedchain

"It efficiently segments data into manageable chunks, generates relevant embeddings, and stores them in a vector database for optimized retrieval. "

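To make the quoted pipeline concrete, here is a minimal sketch of the chunk, embed, store, and retrieve steps. It is not embedchain's API: the function names are hypothetical, the "embedding" is a toy bag-of-words vector standing in for a real embedding model, and the "vector database" is a plain in-memory list.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model (assumption): word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """In-memory stand-in for a vector database: (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical notes; in practice these would be chunks of org-roam files.
notes = [
    "Linux: use systemctl to restart services.",
    "Cooking: simmer the sauce for ten minutes.",
    "Emacs: org-roam stores notes as plain org files.",
]
index = build_index(notes)
top = retrieve(index, "how do I restart a service on linux", k=1)
```

The retrieved chunks would then be placed in the prompt as context for the LLM. Fixed-size chunking like `chunk_text` is the simple option; it can cut a thought in half, which is why longer texts are harder to handle.
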
s-kostyaev commented 3 months ago

@oatmealm see the open PR in elisa, https://github.com/s-kostyaev/elisa/pull/12, which implements semantic splitting. The splitting itself is already implemented; usage will be added later.
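
For readers unfamiliar with the term, the general idea of semantic splitting can be sketched as follows. This is not the code from the elisa PR (elisa is written in Emacs Lisp); it is an illustrative Python sketch that assumes a toy bag-of-words embedding and a hypothetical similarity threshold. A new chunk starts wherever two adjacent sentences are less similar than the threshold.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_split(text, threshold=0.1):
    """Group adjacent sentences into chunks, breaking where similarity drops."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])    # topic shift: start a new chunk
        else:
            chunks[-1].append(cur)  # same topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]


text = (
    "Linux uses systemd. Systemd manages Linux services. "
    "Pasta needs salt. Salt improves pasta."
)
chunks = semantic_split(text)
# chunks has two entries: the two Linux sentences, then the two pasta sentences.
```

With a real embedding model the threshold would need tuning, since dense embeddings rarely give a similarity of exactly zero between unrelated sentences.
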