Closed: codergautam closed this issue 10 months ago
Hi, @codergautam! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you are facing an issue with the OpenAI embeddings API where the request is being sent with more data than expected, resulting in an error message indicating a content length of 6,000,558 characters. There hasn't been any activity or comments on the issue since you posted it.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!
I have a list of long PDFs (20+) which I want to index in a Pinecone DB. I have used some code to convert them into .txt files.
Now here is the code that is supposed to split them into chunks and feed them into the vector database.
`data` is just an array with the text of each PDF. I want `docs` to be the chunked text which should be sent to embeddings.
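For context, here is a minimal sketch of this kind of chunking pipeline in plain Python (the names `chunk_text` and `batched` are invented for illustration; this is not the actual code from this issue). It shows why the total request size can balloon even when every individual chunk stays under 2000 characters: embedding all chunks in one request sums their lengths.

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Naive character-based splitter with a small overlap between chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def batched(items, batch_size):
    """Yield successive slices so each embeddings request stays small."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Simulate 20 long PDFs of ~300k characters each.
data = ["x" * 300_000 for _ in range(20)]
docs = [chunk for text in data for chunk in chunk_text(text)]

total_chars = sum(len(d) for d in docs)
print(len(docs), total_chars)  # → 3340 6664000

# Every single chunk is small...
assert all(len(d) <= 2000 for d in docs)
# ...but one request containing all of them carries millions of characters,
# which is the same order of magnitude as the 6,000,558 in the error above.
# Sending batches (e.g. 100 chunks per call) keeps each request bounded:
for batch in batched(docs, 100):
    pass  # embed(batch) would go here
```

The point is that each `doc` being under 2000 characters does not bound the request size; the number of chunks times the chunk size does.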
But I am getting an error on the call to the OpenAI embeddings API. It seems like the request is being sent with far more text than expected. I have logged the `docs` variable, and each doc or chunk is less than 2000 characters. So why is this much data being sent?

Error: