ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
Is your feature request related to a problem? Please describe.
Most large language models can only provide information based on the corpus of data that they’ve been trained on.
These models might hallucinate if they don't have the required data or context.
This is where Retrieval-Augmented Generation (RAG) helps.
By incorporating a retriever, RAG pulls relevant information from external knowledge sources, such as databases or documents, to enrich the generated output with up-to-date, contextually accurate information. This approach helps to mitigate the limitations of static model training data, enabling real-time responses that adapt to the specific needs of each query.
Describe the solution you'd like
Create a RAG-based chatbot using the following technologies:
sentence-transformers/all-MiniLM-L6-v2 as the sentence embedding model
RecursiveCharacterTextSplitter for chunking text
Llama 3.1 as the LLM
FAISS as the vector store
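To make the chunking step concrete, here is a minimal pure-Python sketch of the idea behind recursive character splitting: try coarse separators first, and fall back to finer ones only for pieces that are still too long. It is a simplified stand-in and not the RecursiveCharacterTextSplitter implementation; the function name split_text, the separator order, and the default chunk_size are assumptions chosen for illustration, and chunk overlap is left out.

```python
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Recursively split text into chunks of at most chunk_size characters.

    Hypothetical helper for illustration; separators must end with "".
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    sep, rest = separators[0], separators[1:]
    # An empty separator means "fall back to fixed-width slices".
    if sep == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring pieces while they still fit.
            current = candidate
            continue
        if current.strip():
            chunks.append(current)
        current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with a finer separator.
            chunks.extend(split_text(piece, chunk_size, rest))
    if current.strip():
        chunks.append(current)
    return chunks


sample = "RAG retrieves context.\n\nFAISS stores vectors.\n\n" + "word " * 30
for chunk in split_text(sample, chunk_size=40):
    print(repr(chunk))
```

In the actual project the library splitter would replace this helper; the sketch only shows why paragraph and line boundaries tend to survive inside a chunk.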
Describe alternatives you've considered
An inefficient alternative would be to search through the text manually for answers, or to pass the entire content to an LLM and then query it.
Approach to be followed (optional)
Build a RAG pipeline that first retrieves the most relevant content from the external document.
Retrieval will be done using similarity search between the query embedding and the chunk embeddings.
The retrieved content will then be passed, along with the user query, to an LLM, which generates the final answer.
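The retrieve-then-generate flow described above can be sketched as follows. This is a toy version under stated assumptions: embed uses plain word counts as a stand-in for the sentence embedding model, the sorted call stands in for a FAISS index lookup, and the LLM call itself is omitted, so the sketch stops at the prompt that would be sent to the model. All function names, the sample chunks, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Combine the retrieved chunks and the user query into one prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


chunks = [
    "FAISS builds an index over embedding vectors for similarity search.",
    "Llama 3.1 is the language model that writes the final answer.",
    "Text is split into chunks before it is embedded.",
]
query = "Which library performs similarity search over vectors?"
print(build_prompt(query, retrieve(query, chunks, k=1)))
```

Swapping in dense embeddings and a vector index changes how similarity is computed, but the shape of the pipeline (embed, rank, take the top k, build the prompt) stays the same.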
Additional context
Such RAG systems are useful for research and for educational institutions and courses, among other settings.
@Neilblaze @SaiNivedh26 could you please assign this issue to me?
Thanks for creating the issue in ML-Nexus!🎉
Before you start working on your PR, please make sure to:
⭐ Star the repository if you haven't already.
Pull the latest changes to avoid any merge conflicts.
Attach before & after screenshots in your PR for clarity.
Include the issue number in your PR description for better tracking.
Don't forget to follow @UppuluriKalyani – Project Admin – for more updates!
Tag @Neilblaze, @SaiNivedh26 to have the issue assigned to you.
Happy open-source contributing!☺️