enricoros / big-AGI

Generative AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, and much more. Deploy on-prem or in the cloud.
https://big-agi.com
MIT License
5.28k stars · 1.2k forks

Feature Request: Integration of OpenAI Embeddings #101

Open argen666 opened 1 year ago

argen666 commented 1 year ago

I would like to request the integration of OpenAI embeddings into the project. As OpenAI offers powerful language models, incorporating their embeddings could significantly improve the performance and capabilities of our project. Please let me know if there are any concerns or additional requirements for implementing this feature. I am more than happy to contribute to the development and testing process.

enricoros commented 1 year ago

Hi @argen666, welcome! Please let me know in which ways you are thinking of integrating embeddings; they could be used at a per-chat, per-message, or per-chunk level, and they enable many use cases: search, memory, context injection. First I'd like to hear from you: what would be the use case - how would you use embeddings, and where would they show up in the user interface?

argen666 commented 1 year ago

Hi @enricoros, I guess the basic use case is to build a more complete research assistant trained on multiple custom documents.

The basic step-by-step guide using embeddings:

  1. Retrieve the custom papers related to the subjects you're interested in (for example, academic papers).
  2. Compute the embeddings for each of the papers. The LangChain framework can be used for this.
  3. Choose a platform to store the embeddings, for example Pinecone or any of the vector databases recommended by OpenAI: https://platform.openai.com/docs/guides/embeddings/how-can-i-retrieve-k-nearest-embedding-vectors-quickly
  4. Create an index for the embeddings. An index is a data structure used to organize and search the embeddings.
  5. Upload the embeddings to the database.
  6. Once the embeddings are uploaded, we can start asking questions about the topics covered in the papers.

In our case, I think we need to add support for the vector databases listed above and add configuration for connecting to them in the application settings. This way, the user will be able to connect their own knowledge base. So we effectively only need to implement step 6 of the guide above (see the sketch below). Please share your thoughts on this matter. Thanks
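For concreteness, here is a rough TypeScript sketch of what step 6 could look like: embed the user's question with OpenAI's embeddings endpoint, pull the nearest chunks from whichever vector store is configured, and prepend them to the prompt. The `VectorStore` interface, `searchTopK`, and `answerWithKnowledgeBase` are hypothetical placeholders for the backend connection (Pinecone, Redis, etc.), not existing big-AGI code:

```ts
// Hypothetical retrieval path for "step 6": embed the question, find the
// nearest stored chunks, and build an augmented prompt.

interface RetrievedChunk {
  text: string;
  score: number; // similarity, higher is closer
}

interface VectorStore {
  // Placeholder for whichever vector database the user configured in settings.
  searchTopK(queryEmbedding: number[], k: number): Promise<RetrievedChunk[]>;
}

// Embed a piece of text with OpenAI's embeddings endpoint
// (text-embedding-ada-002 at the time of this issue; swap the model as needed).
async function embedText(text: string, apiKey: string): Promise<number[]> {
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ model: 'text-embedding-ada-002', input: text }),
  });
  if (!res.ok) throw new Error(`Embeddings request failed: ${res.status}`);
  const json = await res.json();
  return json.data[0].embedding as number[];
}

// Build the prompt: nearest chunks first, then the question.
async function answerWithKnowledgeBase(question: string, store: VectorStore, apiKey: string): Promise<string> {
  const queryEmbedding = await embedText(question, apiKey);
  const chunks = await store.searchTopK(queryEmbedding, 4);
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join('\n\n');
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

The augmented prompt would then be sent through the normal chat completion path, so the rest of the app stays unchanged.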

michaelcreatesstuff commented 1 year ago

@argen666 @enricoros I have made a PR for this here; it is a decent start functionality-wise as a proof of concept.

I know it could be better integrated into the current codebase and have a better UI for sure

argen666 commented 1 year ago

@michaelcreatesstuff @enricoros Great work! I also implemented this functionality in parallel with you. I'm not creating a PR yet because I'm waiting for langchainJS to add support for Redis and other vector databases. At the moment, I also have to use Pinecone because of these limitations.

michaelcreatesstuff commented 1 year ago

@argen666 thanks. Agreed, langchainJS seems a bit behind langchain python. I'm going to try python + FastAPI

Have you tried this? It was on my list of concepts to explore: https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/memory, but I will try Python for a bit first.

argen666 commented 1 year ago

@michaelcreatesstuff Thanks. I haven't tried that since I decided to focus on external vector stores to have an independent knowledge base

argen666 commented 1 year ago

@enricoros @michaelcreatesstuff Hi Team, I have made a pull request for this feature: https://github.com/enricoros/big-agi/pull/122. I would appreciate any feedback. Thank you!

bbaaxx commented 8 months ago

I believe Big-AGI could benefit greatly from embeddings, as they would allow for the exploration of new use cases and extended functionality for the code assistant and textual contexts.

Here is an attempt to provide a proper request description using the repo template to help continue the discussion. This of course was generated with some help from Big-AGI running GPT4(preview) and vetted by me:

Why
Integrating textual embeddings into Big AGI will transform the way users interact with uploaded text files by providing a more efficient and semantically rich processing method. Instead of directly inserting text into the context window, the new feature will create embeddings that capture the essence of the text. This will enable users to perform complex language tasks on larger documents without being constrained by the context window size, leading to more accurate and context-aware responses from Big AGI.

Description
This enhancement to Big-AGI will involve a transparent shift in handling uploaded text files. Upon upload, instead of placing the text into the context window, the system will generate text embeddings using a selected embedding service. These embeddings will then be used within the current conversation to maintain the flow and context. The system will be designed to support a variety of embedding services and vector databases, ensuring flexibility and extensibility. The initial implementation will focus on an in-browser vector database to provide immediate, client-side functionality without the need for server-side processing.
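As a rough illustration of the "in-browser vector database" idea, here is a minimal client-side store sketch that keeps chunk embeddings in memory and searches them by cosine similarity; all names are illustrative and not part of the existing big-AGI codebase:

```ts
// Minimal sketch of an in-browser vector store: chunk embeddings are kept in
// memory (they could later be persisted to IndexedDB) and queried by cosine
// similarity. Illustrative only, under the assumptions stated above.

interface StoredChunk {
  id: string;
  text: string;
  embedding: Float32Array;
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class InBrowserVectorStore {
  private chunks: StoredChunk[] = [];

  // Called once per uploaded file chunk, after the selected embedding
  // service has produced its vector.
  add(id: string, text: string, embedding: number[]): void {
    this.chunks.push({ id, text, embedding: Float32Array.from(embedding) });
  }

  // Return the k chunks most similar to the query embedding.
  search(queryEmbedding: number[], k = 4): StoredChunk[] {
    const q = Float32Array.from(queryEmbedding);
    return [...this.chunks]
      .sort((a, b) => cosineSimilarity(b.embedding, q) - cosineSimilarity(a.embedding, q))
      .slice(0, k);
  }
}
```

A brute-force scan like this is fine for a handful of documents; a larger knowledge base would warrant one of the external vector stores discussed earlier in the thread.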

Requirements

(Generated with big-AGI using GPT4(1106) and vetted by the author of this post)

enricoros commented 8 months ago

Thanks for the description, clearly made by GPT-4 because it sounds good, but it's low on details.

I read when to generate and where to store. But how are the embeddings being used? Just storing them is not enough.

Is the objective to have a RAG use case? Embeddings can be used for many purposes, and I'd be curious about the top ways to use them (RAG, MemGPT-like, etc.).

bbaaxx commented 4 months ago

I can share my use cases here:

I hope this adds to the conversation. I would love to lend a hand to make this land on big-AGI.

lunamidori5 commented 4 months ago

@bbaaxx it's next on my list! Just need to get WSL working.