Kav-K / GPTDiscord

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!
MIT License
1.81k stars, 305 forks

Semantic Search in conversation history using pinecone db #17

Status: Closed (Kav-K closed this issue 1 year ago)

Kav-K commented 1 year ago

Alongside generating summarizations, we want to embed them and save the embeddings in Pinecone. Then, when a user sends a prompt to the bot within a conversation, we embed that prompt and search Pinecone for the stored vectors most similar to it. We then append the retrieved context to the prompt before sending it to GPT-3. This effectively simulates long-term, permanent memory.
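To make the flow above concrete, here is a minimal sketch of the store, search, and append steps. It makes loud assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `ToyVectorStore` is an in-memory stand-in for a Pinecone index. Neither is the project's actual code, and all names here are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector (NOT a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector index such as Pinecone (hypothetical)."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[Counter, str]] = {}

    def upsert(self, item_id: str, text: str) -> None:
        # Store the embedding together with the original summary text.
        self._items[item_id] = (toy_embed(text), text)

    def query(self, text: str, top_k: int = 2) -> List[str]:
        # Rank stored summaries by similarity to the query embedding.
        q = toy_embed(text)
        ranked = sorted(
            self._items.values(),
            key=lambda item: cosine(q, item[0]),
            reverse=True,
        )
        return [stored_text for _, stored_text in ranked[:top_k]]


def build_prompt(store: ToyVectorStore, user_prompt: str, top_k: int = 2) -> str:
    """Attach the most similar stored summaries to the user's prompt."""
    context = store.query(user_prompt, top_k=top_k)
    return (
        "Relevant earlier context:\n"
        + "\n".join(context)
        + "\n\nUser: "
        + user_prompt
    )


store = ToyVectorStore()
store.upsert("s1", "The user has a dog named Biscuit who likes long walks.")
store.upsert("s2", "The user is learning Rust and asked about lifetimes.")
store.upsert("s3", "The user prefers answers in metric units.")

prompt = build_prompt(store, "Remind me what my dog Biscuit likes?", top_k=1)
print(prompt)
```

In a real deployment the two stand-ins would be replaced by calls to the embedding model and to the vector database; the shape of `build_prompt` would stay roughly the same.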

Of course, there are tons of things to think about, such as the "forget" conditions (the conditions under which embeddings should be removed once they are deemed irrelevant, just as in human brains) and the "save" conditions (when, and under what policy, we store embeddings as permanent data; like the human brain, we need to choose a time to consolidate information and a policy for filtering it).
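One possible shape for such "forget" and "save" conditions is sketched below. This is purely illustrative: the thresholds, the `MemoryEntry` fields, and the `consolidate` function are hypothetical choices and not a policy taken from the project. Here an entry is saved permanently once it has been retrieved often enough, and forgotten if it is not permanent and has not been used recently.

```python
from dataclasses import dataclass
from typing import List

DAY = 24 * 3600

# Hypothetical thresholds, chosen only for illustration.
PROMOTE_AFTER_USES = 3          # "save": retrieved this often -> permanent
FORGET_AFTER_SECONDS = 7 * DAY  # "forget": unused for this long -> removed


@dataclass
class MemoryEntry:
    text: str
    last_used_at: float   # timestamp (seconds) of the last retrieval
    use_count: int = 0    # how many times it was retrieved as context
    permanent: bool = False


def consolidate(entries: List[MemoryEntry], now: float) -> List[MemoryEntry]:
    """Apply the save policy, then the forget policy; return what survives."""
    kept = []
    for entry in entries:
        # Save condition: frequently used entries become permanent.
        if entry.use_count >= PROMOTE_AFTER_USES:
            entry.permanent = True
        # Forget condition: stale entries are dropped unless permanent.
        stale = now - entry.last_used_at > FORGET_AFTER_SECONDS
        if entry.permanent or not stale:
            kept.append(entry)
    return kept


now = 30 * DAY
entries = [
    MemoryEntry("often used, but old", last_used_at=10 * DAY, use_count=5),
    MemoryEntry("never used, and old", last_used_at=10 * DAY, use_count=0),
    MemoryEntry("rarely used, but recent", last_used_at=29 * DAY, use_count=1),
]
kept = consolidate(entries, now)
print([e.text for e in kept])
```

Running the consolidation pass on a schedule, rather than on every message, would match the idea of choosing a time to consolidate information.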

Kav-K commented 1 year ago

https://www.pinecone.io/semantic-search

Kav-K commented 1 year ago

https://docs.pinecone.io/docs/extractive-question-answering

This notebook demonstrates how Pinecone helps you build an extractive question-answering application. To build an extractive question-answering system, we need three main components:

- A vector index to store and run semantic search
- A retriever model for embedding context passages
- A reader model to extract answers

Kav-K commented 1 year ago

Note that we want to use OpenAI ADA-002 embeddings (the text-embedding-ada-002 model), NOT sentence-transformers.
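As a rough illustration of what requesting such an embedding can look like, here is a small sketch. It assumes the `openai` Python package at version 1.0 or later, where a client object exposes `embeddings.create`; the project at the time of this issue may well have used an older interface. The helper name `embed_text` is hypothetical, and the client is passed in so the function itself needs no network access to be defined.

```python
from typing import Any, List

# The OpenAI embedding model referred to as "ADA-002" above.
EMBEDDING_MODEL = "text-embedding-ada-002"


def embed_text(client: Any, text: str) -> List[float]:
    """Return the embedding vector for `text`.

    `client` is assumed to behave like `openai.OpenAI()` (openai>=1.0).
    """
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=[text])
    # One input string was sent, so the first result is its embedding.
    return list(response.data[0].embedding)


# Usage (assumption: requires the `openai` package and an API key):
#   from openai import OpenAI
#   vector = embed_text(OpenAI(), "summary of the conversation so far")
```

The resulting vector is what would be stored in, or used to query, the vector index.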

Kav-K commented 1 year ago

Sorry for the delays on this! This will be fully done and ready for alpha by January 3rd.

Kav-K commented 1 year ago

An alpha version of this is done in v4.0!