RazeBerry closed this issue 1 year ago
Hi, @RazeBerry! I'm Dosu, and I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding, you are experiencing an issue with the Pinecone Index where it is unable to read the full content of JSTOR articles. You mentioned that you were seeking a workaround and asked for others to test it on their machines. However, there hasn't been any further activity or comments on the issue.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the gpt4-pdf-chatbot-langchain project!
Problem: Pinecone's index seems unable to read JSTOR articles. For example, Sources 1 to 4 all get stuck on the text below: only the page header is read and the rest of the article is not, even though the entire PDF is OCRed and formatted correctly. Is there any workaround for this?

" conglomeration of small nation-states threatened to undermine the This content downloaded from IP ADDRESS on Wed, 03 May 2023 18:59:57 +00:00 All use subject to https://about.jstor.org/terms"

I am not sure whether this replicates across machines, but it would be great if someone else could check whether it works for them!
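One possible workaround, offered only as a sketch: strip the JSTOR download notice from the extracted PDF text before it is split into chunks and embedded, so the notice cannot dominate the retrieved sources. This assumes the notice always has the shape quoted in the report ("This content downloaded from ... All use subject to https://about.jstor.org/terms"). The function name `strip_jstor_footer` and the sample continuation text are hypothetical, and the snippet is plain Python for illustration rather than code from this repository, so the same regex would need to be ported into the ingest step.

```python
import re

# Assumed shape of the JSTOR per-page notice, taken from the sample quoted in
# this issue. Real PDFs may vary, so treat this pattern as a starting point.
JSTOR_FOOTER = re.compile(
    r"This content downloaded from\s.*?"
    r"All use subject to\s+https?://about\.jstor\.org/terms",
    re.DOTALL,
)


def strip_jstor_footer(text: str) -> str:
    """Remove JSTOR download notices, then collapse runs of whitespace."""
    cleaned = JSTOR_FOOTER.sub(" ", text)
    # Note: this also flattens paragraph breaks into single spaces.
    return re.sub(r"\s+", " ", cleaned).strip()


# The first part is the text from the report; "next page text" is a made-up
# placeholder standing in for whatever follows the notice in the article.
sample = (
    "conglomeration of small nation-states threatened to undermine the "
    "This content downloaded from \ufffd\ufffd IP ADDRESS on "
    "Wed, 03 May 2023 18:59:57 +00:00\ufffd\ufffd "
    "All use subject to https://about.jstor.org/terms "
    "next page text"
)

print(strip_jstor_footer(sample))
```

If the notice is the only thing coming back in Sources 1 to 4, it may also be worth printing a few raw chunks before embedding to confirm that the article body is present in the extracted text at all.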