mmz-001 / doc-qa-tutorial

A simple document QA app built with Langchain and Streamlit

Often got errors when loading a PDF or asking questions #2

Open leo-usa opened 1 year ago

leo-usa commented 1 year ago

Hi Sasmitha,

First, good work! I am very impressed. I came up with a similar idea and then found your code. Later I found chatpdf.com, which is turning the idea into a polished product. I have tried both your code and ChatPDF. Theirs is quite stable. They also have some new features; for example, after a PDF is loaded, GPT suggests three questions.

The problem I had with your code is that it often produces errors: sometimes when loading the PDF, sometimes when asking questions. I tried the same file in ChatPDF and had no problem. Here is an example: I have attached the PDF file, and you will see the following error message when you load it. I hope you can figure out what the problem is and fix it.

Thanks!

Leo


ChatDoc - The AI Bot Answering Your Questions based on a Document

Upload a PDF file, then you can ask questions, our ChatGPT will answer questions based on the document

Uploaded file: Chris_Mack_PhD_Thesis.pdf (0.7MB)

openai.error.RateLimitError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).

Traceback:
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "/app/chatdoc/app.py", line 18, in <module>
    index = embed_text(parse_pdf(uploaded_file))
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/legacy_caching/caching.py", line 627, in wrapped_func
    return get_or_create_cached_value()
File "/home/appuser/venv/lib/python3.9/site-packages/streamlit/runtime/legacy_caching/caching.py", line 611, in get_or_create_cached_value
    return_value = non_optional_func(*args, **kwargs)
File "/app/chatdoc/utils.py", line 32, in embed_text
    index = FAISS.from_texts(texts, embeddings)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 193, in from_texts
    embeddings = embedding.embed_documents(texts)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 87, in embed_documents
    responses = [
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 88, in <listcomp>
    self._embedding_func(text, engine=self.document_model_name)
File "/home/appuser/venv/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 76, in _embedding_func
    return self.client.create(input=[text], engine=engine)["data"][0]["embedding"]
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
File "/home/appuser/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
    raise self.handle_error_response(

Chris_Mack_PhD_Thesis.pdf

mmz-001 commented 1 year ago

Hey, thanks for reporting this issue. My guess is that the rate limit on the free OpenAI API key is causing the problem. You can either use a paid API key or implement some sort of retry mechanism to fix this. Take a look at the source code for KnowledgeGPT (https://github.com/mmz-001/knowledge_gpt), a more advanced version of doc-qa, to see how you can implement it.
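For readers who want to try the retry route, here is a minimal sketch of retrying with exponential backoff. It is not taken from KnowledgeGPT or from this repository. The helper name `retry_with_backoff`, its parameters, and the local `RateLimitError` class are assumptions made for illustration; in the real app you would catch `openai.error.RateLimitError` (the exception named in the traceback above) instead of the stand-in.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for openai.error.RateLimitError,
    so this sketch runs without the openai package installed."""


def retry_with_backoff(func, *, retry_on=(RateLimitError,), max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(); on a retryable error, wait and try again.

    The wait doubles after each failed attempt (capped at max_delay),
    with a little random jitter added on top.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 10))


# Demo: a fake embedding call that is rate limited twice, then succeeds.
calls = {"count": 0}


def flaky_embed():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("rate limit reached")
    return [0.1, 0.2, 0.3]


result = retry_with_backoff(flaky_embed, base_delay=0.01)
```

In the app, the wrapped call could be something like `lambda: FAISS.from_texts(texts, embeddings)`, the line in `utils.py` where the traceback originates. Note that retrying at that level repeats every embedding request for the document, so wrapping the individual embedding calls may be the better choice for large PDFs.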

leo-usa commented 1 year ago

Sasmitha,

Thanks for the quick response! I'll upgrade to the paid API key to see if this problem goes away.

I actually tried your KnowledgeGPT first. I couldn't make it work either locally or through Streamlit Community Cloud. I posted my problem in that repository. Did you see it? If you could provide more detailed instructions on how to clone it in Streamlit Community Cloud, that would be great!

Thanks!

Leo

leo-usa commented 1 year ago

BTW, have you tried the PDF file that I attached? I just tried it again and got the same error message. Then I tried another PDF and asked a few questions, and it worked. That's why I suspect that something specific to that PDF is causing problems in your code.

Leo
