Look at the implementation of OP-RAG (order-preserve retrieval-augmented generation): https://arxiv.org/html/2409.01666v1
Steps to Implement OP-RAG:
Document Preprocessing:
Divide the long document into fixed-size chunks of 128 tokens each.
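A minimal chunking sketch, assuming the Hugging Face `transformers` tokenizer for the same BGE model used in the next step; the function name `chunk_document` is just for illustration:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")

def chunk_document(text: str, chunk_size: int = 128) -> list[str]:
    """Split `text` into consecutive, non-overlapping chunks of `chunk_size` tokens."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    return [
        tokenizer.decode(ids[i : i + chunk_size])
        for i in range(0, len(ids), chunk_size)
    ]
```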
Embedding Generation:
Use a pre-trained model like BGE-large-en-v1.5 to generate embeddings for both the query and the text chunks.
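A sketch using `sentence-transformers`, continuing from the chunking step above; `BAAI/bge-large-en-v1.5` is the Hub id for the model the paper names, and `long_document_text` is a placeholder:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# BGE v1.5's model card recommends this instruction prefix on short
# queries for retrieval tasks.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

query = "What caused the blackout?"          # example query
chunks = chunk_document(long_document_text)  # from the chunking step above

# normalize_embeddings=True L2-normalizes the vectors, so a plain dot
# product later equals cosine similarity.
query_emb = model.encode(QUERY_PREFIX + query, normalize_embeddings=True)
chunk_embs = model.encode(chunks, normalize_embeddings=True)
```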
Similarity Calculation:
Calculate the cosine similarity between the query and each chunk's embedding to determine relevance scores.
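Because the embeddings above were normalized, cosine similarity reduces to a dot product:

```python
import numpy as np

# One relevance score per chunk, shape (num_chunks,)
scores = np.asarray(chunk_embs) @ np.asarray(query_emb)
```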
Order Preservation:
Retrieve the top-k chunks by similarity score, but preserve the chunks' original order of appearance in the document.
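This is the step that distinguishes OP-RAG from vanilla RAG: after selecting the top-k chunks by score, re-sort them by their position in the source document rather than by relevance. A sketch:

```python
import numpy as np

def op_retrieve(scores: np.ndarray, chunks: list[str], k: int) -> list[str]:
    """Select the k highest-scoring chunks, then restore document order."""
    top_idx = np.argsort(scores)[-k:]  # indices of the k best scores
    top_idx = np.sort(top_idx)         # re-sort by original chunk position
    return [chunks[int(i)] for i in top_idx]
```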
Token Management:
Limit the retrieved context to a manageable token budget (e.g., 16K or 48K tokens), depending on the model's capacity.
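One way to realize this step (an assumption on my part, not spelled out above) is to greedily take the highest-scoring chunks until the token budget is exhausted, then hand them off in document order:

```python
import numpy as np

def select_within_budget(scores, chunks, tokenizer, budget=16_000):
    """Greedily pick top-scoring chunks until `budget` tokens are used,
    then return the picked chunks in their original document order."""
    picked, used = [], 0
    for i in np.argsort(scores)[::-1]:  # best-scoring chunks first
        n_tokens = len(tokenizer.encode(chunks[i], add_special_tokens=False))
        if used + n_tokens > budget:
            break
        picked.append(int(i))
        used += n_tokens
    return [chunks[i] for i in sorted(picked)]  # preserve document order
```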
Feed into Generator:
Input the ordered, relevant chunks into the language model (e.g., Llama3.1-70B) for answer generation.
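A sketch assuming Llama3.1-70B is served behind an OpenAI-compatible endpoint (e.g., via vLLM); the base URL, model id, and prompt template here are placeholders, not something the paper prescribes:

```python
from openai import OpenAI

# Point the client at a locally served OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

context = "\n\n".join(ordered_chunks)  # output of the order-preserving step

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-70B-Instruct",
    messages=[
        {"role": "system", "content": "Answer using only the given context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(response.choices[0].message.content)
```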
Evaluation:
Evaluate the quality of the generated answers using metrics such as F1 score or accuracy.
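For QA-style answers, token-level F1 in the SQuAD style is a common choice and can be computed directly; a minimal version:

```python
from collections import Counter

def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```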
By following these steps, you can implement the OP-RAG method and potentially replicate the results presented in the paper. You can also experiment with different chunk sizes, retrieval methods, and context-length budgets to optimize for your specific application.
General RAG research: https://github.com/Ancientshi/ERM4 and https://github.com/QingFei1/LongRAG
This issue tracks efforts toward implementing and improving our RAG implementation.