argen666 opened 1 year ago
Hi @argen666, welcome! Please let me know how you're thinking of integrating embeddings; they could be used at a per-chat, per-message, or per-chunk level, and enable many use cases: search, memory, context injection. First I'd like to hear from you: what would be the use case, how would you use embeddings, and where would they show up in the user interface?
Hi @enricoros, I guess the basic use case is to build a more complete research assistant trained on multiple custom documents.
The basic step-by-step guide to using embeddings:
In our case, I think we need to add support for the vector databases listed above, plus configuration for connecting to them in the application settings. This way, the user will be able to connect their own knowledge base. So we only need to implement step 6 of the guide above. Please share your thoughts on this matter. Thanks
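Whatever store ends up being used, the retrieval step reduces to ranking stored chunks against a query embedding and injecting the best matches into the prompt. A minimal sketch, assuming embeddings are plain number arrays (the `Chunk` shape and `retrieveContext` name are illustrative, not existing big-AGI code):

```typescript
// Illustrative retrieval step: given a query embedding, rank stored chunks
// by cosine similarity and build a context block for the prompt.
// These names are assumptions for the sketch, not big-AGI APIs.

interface Chunk {
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveContext(queryEmbedding: number[], chunks: Chunk[], topK = 3): string {
  return chunks
    .map((c) => ({ c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map(({ c }) => c.text)
    .join("\n---\n");
}
```

The returned context block could then be injected into the system prompt or prepended to the user's message, per the discussion above.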
@argen666 @enricoros I have made a PR for this here; it's a decent start functionality-wise, as a proof of concept.
I know it could be better integrated into the current codebase and could certainly use a better UI.
@michaelcreatesstuff @enricoros Great work! I also implemented this functionality in parallel with you. I'm not creating a PR yet because I'm waiting for langchainJS to add support for Redis and other vector databases. For the moment, I also have to use Pinecone because of these limitations.
@argen666 thanks. Agreed, langchainJS seems a bit behind langchain python. I'm going to try python + FastAPI
Have you tried this? It was on my list of concepts to explore https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/memory but I will try python for a bit first
@michaelcreatesstuff Thanks. I haven't tried that since I decided to focus on external vector stores to have an independent knowledge base
@enricoros @michaelcreatesstuff Hi Team, I have made a pull request for this feature https://github.com/enricoros/big-agi/pull/122 I would appreciate any feedback. Thank you!
I believe Big-AGI could benefit greatly from embeddings as this could allow for exploration of new use cases and extended functionalities for the code assistant and textual contexts.
Here is an attempt to provide a proper request description using the repo template to help continue the discussion. This of course was generated with some help from Big-AGI running GPT4(preview) and vetted by me:
**Why**
Integrating textual embeddings into Big-AGI will transform the way users interact with uploaded text files by providing a more efficient and semantically rich processing method. Instead of inserting the text directly into the context window, the new feature will create embeddings that capture the essence of the text. This will enable users to perform complex language tasks on larger documents without being constrained by the context window size, leading to more accurate and context-aware responses from Big-AGI.
**Description**
This enhancement to Big-AGI will involve a transparent shift in how uploaded text files are handled. Upon upload, instead of placing the text into the context window, the system will generate text embeddings using a selected embedding service. These embeddings will then be used within the current conversation to maintain the flow and context. The system will be designed to support a variety of embedding services and vector databases, ensuring flexibility and extensibility. The initial implementation will focus on an in-browser vector database to provide immediate, client-side functionality without the need for server-side processing.
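The upload flow described here can be sketched as a chunk-and-embed pipeline. This is a sketch under assumptions: `embedText` stands in for whichever embedding service the user configures, and none of these names exist in big-AGI today:

```typescript
// Illustrative upload pipeline: split an uploaded file into overlapping
// chunks, embed each chunk, and keep the (text, embedding) pairs for
// later retrieval. `embedText` is a placeholder for a configurable
// embedding service, not an existing big-AGI function.

type EmbedFn = (text: string) => Promise<number[]>;

interface StoredChunk {
  text: string;
  embedding: number[];
}

function splitIntoChunks(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

async function embedUpload(text: string, embedText: EmbedFn): Promise<StoredChunk[]> {
  const chunks = splitIntoChunks(text);
  return Promise.all(
    chunks.map(async (c) => ({ text: c, embedding: await embedText(c) })),
  );
}
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side; the chunk size would need tuning per embedding model's token limit.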
**Requirements**
(Generated with big-AGI using GPT4(1106) and vetted by the author of this post)
Thanks for the description, clearly made by GPT-4 because it sounds good, but it's low on details.
I read when to generate and where to store. But how are the embeddings being used? Just storing them is not enough.
Is the objective a RAG use case? Embeddings can be used for many purposes, and I'd be curious about the top ways to use them (RAG, MemGPT-like memory, etc.)
I can share my use cases here:
Semantic search of relevant data over a collection of documents -- I would like to be able to have multiple collections of documents and select which collection to chat with (for example, all my financial documents in one collection or "workspace", my health-related documents in another, and so on)
Semantic search over previous chat conversations, or summarization of memory, so a bot can "learn" as we interact. -- A Personas-generated agent could be the "narrator" responsible for summarizing chat conversations, which would then be stored as vector embeddings and used in future conversations.
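Both use cases come down to tagging each stored vector with metadata (a "workspace" name, or a chat-memory marker) and filtering on it before ranking. A minimal sketch, assuming embeddings are pre-normalized so a dot product works as the similarity score (the `MemoryRecord` shape and collection names are illustrative):

```typescript
// Sketch of metadata-scoped retrieval: each stored vector carries a
// collection tag ("finance", "health", "chat-memory", ...), and a query
// is restricted to one collection before ranking by similarity.
// Assumes unit-length embeddings, so dot product == cosine similarity.

interface MemoryRecord {
  text: string;
  embedding: number[];
  collection: string;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

function searchCollection(
  records: MemoryRecord[],
  collection: string,
  queryEmbedding: number[],
  topK = 3,
): MemoryRecord[] {
  return records
    .filter((r) => r.collection === collection)
    .map((r) => ({ r, score: dot(queryEmbedding, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map(({ r }) => r);
}
```

Hosted vector DBs (Pinecone, Chroma, etc.) expose the same idea as namespaces or metadata filters, so this shape maps onto whichever backend gets integrated.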
As to where to store the vectors: ... well, that is a difficult one, because there are only a couple of vector DBs for the browser (This is the only one I know about) -- but IMHO vectors should be stored in a more "persistent", user-governed database, which could probably be enabled by adding integrations ... the fun part is that the list of vector-DB services is quite large (maybe leverage langchain-js?). -- (Maybe we can start with some open-source integrations like Chroma, and we can ask @lunamidori5 to add that to their installer 😉)
I hope this adds to the conversation. I would love to lend a hand to make this land on big-AGI.
@bbaaxx its next on my list! Just need to get WSL working
I would like to request the integration of OpenAI embeddings into the project. As OpenAI offers powerful language models, incorporating their embeddings could significantly improve the performance and capabilities of our project. Please let me know if there are any concerns or additional requirements for implementing this feature. I am more than happy to contribute to the development and testing process.
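For reference, OpenAI exposes embeddings through a single endpoint (`POST /v1/embeddings`). A hedged sketch using `text-embedding-ada-002`, the embedding model available at the time of this discussion; the helper names are illustrative, and the request is built separately so it can be inspected without touching the network:

```typescript
// Sketch of calling the OpenAI embeddings endpoint. Endpoint, headers, and
// body shape follow OpenAI's API reference; `buildEmbeddingRequest` and
// `embed` are illustrative helpers, not big-AGI code.

function buildEmbeddingRequest(apiKey: string, input: string) {
  return {
    url: "https://api.openai.com/v1/embeddings",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: "text-embedding-ada-002", input }),
    },
  };
}

async function embed(apiKey: string, input: string): Promise<number[]> {
  const { url, options } = buildEmbeddingRequest(apiKey, input);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`embeddings request failed: ${res.status}`);
  const json = await res.json();
  return json.data[0].embedding; // response shape per OpenAI's API reference
}
```

Keeping the API key in the existing big-AGI settings (alongside the chat-model key) would fit the configuration approach discussed earlier in this thread.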