OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0

Feature Request: Memory via Embeddings #149

Closed NeauraNightsong closed 11 months ago

NeauraNightsong commented 1 year ago

Would it be possible to add memory using embeddings, so that .txt files or documents could be loaded, converted to .json embeddings, and then referenced by the AI before responding to prompts? I would be willing to pay for this feature, as I don't have the skill/knowledge to implement it myself. <3
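For illustration, a minimal sketch of that flow, assuming the sentence-transformers package; the model name, chunking, and JSON layout are placeholders, not anything open-interpreter ships:

```python
import json
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

def ingest(txt_path: str, json_path: str, chunk_size: int = 500) -> None:
    """Split a .txt file into chunks, embed them, and save as .json embeddings."""
    text = Path(txt_path).read_text(encoding="utf-8")
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    vectors = model.encode(chunks).tolist()
    Path(json_path).write_text(json.dumps({"chunks": chunks, "vectors": vectors}))

def recall(json_path: str, query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query, to prepend to the prompt."""
    store = json.loads(Path(json_path).read_text())
    vecs = np.array(store["vectors"])
    q = model.encode([query])[0]
    sims = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
    return [store["chunks"][i] for i in np.argsort(sims)[::-1][:k]]
```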

CodeAKrome commented 1 year ago

LocalGPT does this, if I understand you correctly.

If not, I might be able to write a custom version or integrate it here. I've already reviewed the code in this repo and it looks doable, and I've already worked with LocalGPT, which uses Chroma.

More ambitiously, I've been looking at using Flair embeddings and Zep. I've been heavily into NER/NERD of late for my own nefarious purposes with RSS news feeds, and doing a front end like this. I was thinking entity-based hybrid vector search, and have looked at using Flair embeddings with Deep Lake. Context windows could be compressed using PEGASUS for summarization (a sketch follows below). I have proof-of-concept code for all this jazz. No integration yet.
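For the summarization piece, a proof-of-concept along those lines, assuming the Hugging Face transformers package (the model choice is illustrative):

```python
from transformers import pipeline

# PEGASUS fine-tuned for abstractive summarization; model choice is illustrative.
summarizer = pipeline("summarization", model="google/pegasus-xsum")

def compress(context: str, max_length: int = 64) -> str:
    """Summarize an older context chunk to reclaim window space."""
    return summarizer(context, max_length=max_length, truncation=True)[0]["summary_text"]
```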

NeauraNightsong commented 1 year ago

Zep seems really cool. I've seen a bunch of projects that add memory in different ways, and Zep seems really robust. I think some kind of memory system like this would be amazing for your project: a system this capable that can both code on the PC and have long-term memory? I haven't seen that done yet, only one or the other. Even if it were only able to tokenize a text document, convert it to JSON, and use that as memory / a supporting document, it would be insanely useful. I'm quite a noob/beginner though, so I'm sorry if anything I say doesn't make sense or is vague! 💕

rkeshwani commented 1 year ago

You could integrate this with LangChain and allow for some form of configuration to let the user decide? You have Chroma, Weaviate, Pinecone, Zep, pg_vector, FAISS, etc. for different vector stores, or even JSON files. Hugging Face also has models that can run locally to vectorize files/queries. The tricky part is that your stored vectors must match your loaded embedding model, which is separate from your interfacing model (like OpenAI's models or Code Llama), so you have to keep track of that as well, which makes implementing this complicated (one way to track the pairing is sketched below).

A potential solution is to build something that could load configurations from something like https://flowiseai.com/, then pass a Flowise configuration as a parameter.
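A sketch of one way to keep that vector/model pairing straight, assuming a recent chromadb; the "embedding_model" metadata key is a convention invented here, not a chromadb feature:

```python
import chromadb

EMBEDDING_MODEL = "all-MiniLM-L6-v2"  # whatever model built the store

client = chromadb.PersistentClient(path="./memory")
collection = client.get_or_create_collection(
    name="docs", metadata={"embedding_model": EMBEDDING_MODEL}
)

# Refuse to query a store whose vectors came from a different embedding model.
stored = (collection.metadata or {}).get("embedding_model")
if stored != EMBEDDING_MODEL:
    raise RuntimeError(f"Store was built with {stored!r}, not {EMBEDDING_MODEL!r}")
```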

rkeshwani commented 1 year ago

Apparently https://github.com/Helicone/helicone implemented an API around this.

CodeAKrome commented 1 year ago

Langflow looks similar to Flowise. Checking it out too.

CodeAKrome commented 1 year ago

Langflow is a no-go, for now:

Python exception: ModuleNotFoundError: No module named 'pandas.core.arrays.arrow.dtype'

The pandas docs (https://github.com/pandas-dev/pandas/blob/v2.1.0/pandas/core/dtypes/dtypes.py#L2000-L2298) define:

class pandas.ArrowDtype(pyarrow_dtype)
    An ExtensionDtype for PyArrow data types.
    Warning: ArrowDtype is considered experimental. The implementation and parts of the API may change without warning.

pip install pyarrow
Requirement already satisfied: pyarrow in ./.venv/lib/python3.10/site-packages (12.0.1)
Requirement already satisfied: numpy>=1.16.6 in ./.venv/lib/python3.10/site-packages (from pyarrow) (1.25.2)

Looks like the module path may have changed.

CodeAKrome commented 1 year ago

Flowise has authentication issues when I tried it with Chroma, following the special "you're running Flowise and Chroma in Docker on the same host" procedure. If they get rid of the issues, it looks great.

CodeAKrome commented 1 year ago

So, to get domain-specific context:

  1. Insert a hook in the chat loop looking for /^slurp:(.*)/, which triggers ingestion into the vector store (see the sketch at the end of this comment).
  2. Add middleware into the loop which looks for file-like strings and tries to eat them.
  3. Run a scanner which loads them beforehand.
  4. Profit!

I'm thinking scanner, since you presumably know the files in which you possess interest, and can just hit the vector store.

Which means:

  1. Do it all the time
  2. Have a flag to enable such a thing

Moving this to the Discord feature-ideas-forum. In the meantime I'll see about porting from LocalGPT on my fork.
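A minimal sketch of the slurp hook from step 1, with a hypothetical ingest helper standing in for the vector store code; this is not open-interpreter's actual chat loop:

```python
import re

SLURP = re.compile(r"^slurp:(.*)")

def handle_message(user_input: str, vector_store) -> str | None:
    """Intercept slurp commands before the input reaches the model."""
    match = SLURP.match(user_input)
    if match is None:
        return None  # fall through to the normal chat loop
    path = match.group(1).strip()
    vector_store.ingest(path)  # hypothetical ingestion helper
    return f"Ingested {path} into the vector store."
```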

CodeAKrome commented 1 year ago

  • Just adding some basic examples is simple enough. I can port the code from LocalGPT; I want it anyway, and I've already forked the repo. I've read through LocalGPT plenty and issued a minor PR or two there.
  • Getting the system to ingest on demand would require looking at how the tool runs. It's far simpler to explicitly load the datastore, then query it by default.
  • Flowise looks cool. I have been studying this stuff for months now and can do it by hand, but faster is better. I'll try to make it go and report back. This is a huge subject; my Obsidian files multiply.
  • Exactly my thoughts. I was going to ask if anyone would object if I used LangChain. I was also thinking of using a lightweight prompt-templating system that matches prompt style to model. I need that myself to use different prompts for different tasks.

grexzen commented 1 year ago

LangChain is sometimes a pain in the butt to work with. I used to use it for backend stuff but, honestly, it is easier to just have functions per tool or action and an API.

ed1g1tal commented 1 year ago

Now if, along with this, we could have a list of prior sessions/chats to select from and hop back into (resume), that would be pretty nifty. I could integrate that into the Streamlit frontend web UI and list past chats in the sidebar to get a ChatGPT-like experience.
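A rough sketch of that sidebar, assuming past chats are saved as JSON message lists under ./sessions (the file layout is hypothetical):

```python
import json
from pathlib import Path

import streamlit as st

# List saved sessions in the sidebar; picking one loads its messages.
sessions = sorted(Path("sessions").glob("*.json"))
choice = st.sidebar.selectbox("Past chats", sessions, format_func=lambda p: p.stem)

if choice and "messages" not in st.session_state:
    st.session_state.messages = json.loads(choice.read_text())

# Replay the selected chat, ChatGPT-style.
for msg in st.session_state.get("messages", []):
    with st.chat_message(msg["role"]):
        st.write(msg["content"])
```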

NeauraNightsong commented 1 year ago

I really like how this project does it, where documents can be converted, saved, loaded, etc: https://github.com/LagPixelLOL/ChatGPTCLIBot

jarulraj commented 1 year ago

Hey @NeauraNightsong @CodeAKrome @rkeshwani @grexzen -- we would love to do an integration with EvaDB to store the embeddings in a local FAISS index. EvaDB supports multiple backend vector indexes, prompt caching, text summarization, etc.

https://github.com/georgia-tech-db/evadb https://github.com/georgia-tech-db/evadb/blob/staging/apps/privategpt/privateGPT.py

Your thoughts?

vinodvarma24 commented 1 year ago

Custom memory with embeddings is very important to have if we want to use a couple of PDF files with custom knowledge.

We can divide this into two steps:

  1. First, create embeddings for whatever docs we need and store them in a vector store.
  2. Then add the vector store connection details to the Open Interpreter config file, with some description of the knowledge base, so that OI can query it based on the user's question (similar to how LangChain does it). A sketch follows below.
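A sketch of what step 2 could look like, with a hypothetical knowledge_bases section in the config and chromadb as the store; all key names here are invented for illustration:

```python
from pathlib import Path

import chromadb
import yaml

# Hypothetical config.yaml entry:
# knowledge_bases:
#   - name: product-docs
#     description: internal manuals, query for product questions
#     path: ./stores/product-docs
config = yaml.safe_load(Path("config.yaml").read_text())

for kb in config["knowledge_bases"]:
    client = chromadb.PersistentClient(path=kb["path"])
    collection = client.get_or_create_collection(kb["name"])
    # The description tells the model when this store is worth querying.
    hits = collection.query(query_texts=["<user question>"], n_results=3)
    print(kb["description"], hits["documents"])
```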

@KillianLucas Any thoughts on this?

ericrallen commented 11 months ago

Hey there, folks!

I'm going to close this one as a duplicate so we can consolidate all of the vector / memory / RAG / storage discussions into the first issue that brought them up (#144) and reduce the overall noise in the project's Issues.