Government Document Chatbot using LangChain

Hi @amit-s19, I am a second-year BTech student, and here is my solution to the problem:

I am using LangChain and gemini-pro to achieve the following result:

Result

[Screenshot of the chatbot result]

The steps are as follows:

Using PyPDF2 to read the PDF document.
Using langchain.text_splitter to split the text extracted from the PDF.
Using GoogleGenerativeAIEmbeddings to create text embeddings.
Using FAISS as the vector store, which can easily be replaced with ChromaDB or Pinecone.
Using a basic prompt template and a question_answering chain to query the PDF data.
Using Streamlit to create a simple user interface.
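
To make the flow of the steps above concrete, here is a minimal sketch of the split, embed, retrieve, and prompt stages. It is not the code from this PR: it uses only the Python standard library, with a word-count vector standing in for GoogleGenerativeAIEmbeddings and a plain sorted list standing in for FAISS. All function names (`split_text`, `embed`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (stand-in for a text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (assumption, not a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (stand-in for a vector store)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical prompt wording; the PR's actual template is not shown here.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question, chunks, k=2):
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n---\n".join(retrieve(question, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

In the real pipeline, the string returned by `build_prompt` corresponds to what the question-answering chain sends to the model, and swapping FAISS for another vector store only changes the `retrieve` step.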

I am using gemini-pro because it supports multiple languages. Later, we can replace it with our own model fine-tuned on Hindi or any other language.

We can also use gemini-pro-vision to read the images inside the PDF and convert them into text, which can then be used for question answering.

