run-llama / rags

Build ChatGPT over your data, all with natural language
MIT License

Stop response generation in langchain framework #65

Open Aniketparab1999 opened 5 months ago

Aniketparab1999 commented 5 months ago

Python code:

```python
from langchain.chains import RetrievalQA

# turbo_llm and compression_retriever are defined earlier in my setup
qa_chain = RetrievalQA.from_chain_type(
    llm=turbo_llm,
    chain_type="stuff",
    retriever=compression_retriever,
    return_source_documents=True,
)
response = qa_chain("What is Langchain?")
```

This is the Python code I am using to query a PDF with the RAG approach. My requirement is that if generating the response takes more than 1 minute, generation should be stopped on the backend. How can I do that? Is there an existing Python code architecture for this?
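One generic way to enforce a wall-clock limit, assuming `qa_chain` behaves like an ordinary blocking callable, is to run the call in a worker thread and stop waiting once a deadline passes. The `run_with_timeout` helper below is a hypothetical name for this sketch, built only on the standard-library `concurrent.futures` module; it is not a LangChain API.

```python
import concurrent.futures


def run_with_timeout(fn, *args, timeout=60, **kwargs):
    """Call fn(*args, **kwargs) in a worker thread; raise TimeoutError
    if no result arrives within `timeout` seconds.

    Caveat: this stops *waiting* for the call, but Python cannot forcibly
    kill the worker thread, so the underlying LLM request may still run
    to completion in the background.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    finally:
        # Do not block waiting for the worker on shutdown.
        executor.shutdown(wait=False)


# Usage with the chain from the question (qa_chain as defined above):
# response = run_with_timeout(qa_chain, "What is Langchain?", timeout=60)
```

If the model behind `turbo_llm` is an OpenAI chat model, it may also be worth checking whether your LangChain version supports a request-level timeout parameter on the LLM itself (e.g. `request_timeout` on the OpenAI wrappers), which cancels the HTTP request rather than just abandoning the wait.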