mayooear / gpt4-pdf-chatbot-langchain

GPT4 & LangChain Chatbot for large PDF docs
https://www.youtube.com/watch?v=ih9PBGVVOO4

"Failed to ingest your data" #418

Closed: Tilie closed this issue 4 months ago

Tilie commented 8 months ago

I am trying to use the local vector database Chroma, but ingestion fails with the error below. Can anyone give me some advice?

creating vector store...
error RequiredError: Required parameter collectionName was null or undefined when calling getCollection.
    at Object.getCollection (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/chromadb/dist/module/generated/api.js:240:23)
    at Object.getCollection (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/chromadb/dist/module/generated/api.js:723:78)
    at ApiApi.getCollection (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/chromadb/dist/module/generated/api.js:1045:45)
    at ChromaClient.getCollection (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/chromadb/dist/module/ChromaClient.js:176:14)
    at Chroma.addVectors (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/langchain/dist/vectorstores/chroma.js:76:45)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Chroma.addDocuments (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/langchain/dist/vectorstores/chroma.js:39:9)
    at Function.fromDocuments (/Users/jinyi/gpt4-pdf-chatbot-langchain/node_modules/langchain/dist/vectorstores/chroma.js:120:9)
    at run (/Users/jinyi/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:42:7)
    at (/Users/jinyi/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:56:3) {
  field: 'collectionName'
}
/Users/jinyi/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:51
    throw new Error('Failed to ingest your data');
          ^
Error: Failed to ingest your data
    at run (/Users/jinyi/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:51:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at (/Users/jinyi/gpt4-pdf-chatbot-langchain/scripts/ingest-data.ts:56:3)
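The trace reports that collectionName was null or undefined by the time the Chroma client called getCollection, which suggests the value was never passed into the vector store configuration. A minimal sketch of a fail-fast guard follows; the helper `requireSetting` and the `settings` object are hypothetical names for illustration and are not part of the repository or of LangChain.

```typescript
// Hypothetical helper (not from the repository): fail fast with a clear
// message when a required setting is missing, before any vector store call.
function requireSetting(name: string, value: string | undefined | null): string {
  if (value === undefined || value === null || value.trim() === '') {
    throw new Error(`Missing required setting: ${name}`);
  }
  return value;
}

// Hypothetical config object; in a real script the value might come from
// an environment variable or a config file.
const settings: { collectionName?: string } = {
  collectionName: 'my-pdf-collection',
};

// Throws immediately if the name is absent, instead of failing deep
// inside the client library.
const collectionName = requireSetting('collectionName', settings.collectionName);
console.log(`Using collection: ${collectionName}`);
```

The validated name would then be handed to the vector store's configuration when the store is created, so a missing value surfaces in the ingest script itself.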

icebearlabs commented 8 months ago

I'm running into the same issue. My Pinecone env variables are set, and I had to set PINECONE_NAME_SPACE="" to get to this point.

I'm also seeing this error:

error [PineconeError: PineconeClient: Error calling upsert: PineconeError: metadata size is 68702 bytes, which exceeds the limit of 40960 bytes per vector]
/Users/icebearlabs/projects/private/abschlussarbeit-research-bot/scripts/ingest-data.ts:46
    throw new Error('Failed to ingest your data');
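The Pinecone message says one vector carried 68702 bytes of metadata against a limit of 40960 bytes. One way to catch this before the upsert is to measure each record's metadata and drop oversized fields. The sketch below is illustrative only: it assumes the UTF-8 length of the JSON-serialized metadata approximates what Pinecone counts, and the names `metadataSizeInBytes` and `trimMetadata` are hypothetical.

```typescript
// Limit quoted in the error message above.
const PINECONE_METADATA_LIMIT_BYTES = 40960;

type Metadata = Record<string, unknown>;

// Assumption: the UTF-8 byte length of the JSON-serialized metadata is a
// reasonable approximation of the size Pinecone enforces.
function metadataSizeInBytes(metadata: Metadata): number {
  return new TextEncoder().encode(JSON.stringify(metadata)).length;
}

// Hypothetical helper: repeatedly drop the largest field until the
// metadata fits under the limit. Returns a copy; the input is untouched.
function trimMetadata(
  metadata: Metadata,
  limit: number = PINECONE_METADATA_LIMIT_BYTES,
): Metadata {
  const trimmed: Metadata = { ...metadata };
  while (metadataSizeInBytes(trimmed) > limit) {
    let largestKey = '';
    let largestSize = -1;
    for (const key of Object.keys(trimmed)) {
      const size = metadataSizeInBytes({ [key]: trimmed[key] });
      if (size > largestSize) {
        largestSize = size;
        largestKey = key;
      }
    }
    if (largestSize < 0) break; // nothing left to drop
    delete trimmed[largestKey];
  }
  return trimmed;
}
```

Dropping the largest field first is a blunt choice. If the oversized field is the chunk text itself, splitting documents into smaller chunks is probably the better fix.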

Tilie commented 8 months ago

> I'm running into the same issue. My Pinecone env variables are set, and I had to set PINECONE_NAME_SPACE="" to get to this point.
>
> I'm also seeing this error:
>
> error [PineconeError: PineconeClient: Error calling upsert: PineconeError: metadata size is 68702 bytes, which exceeds the limit of 40960 bytes per vector]
> /Users/icebearlabs/projects/private/abschlussarbeit-research-bot/scripts/ingest-data.ts:46
>     throw new Error('Failed to ingest your data');

Indeed, the script reports "Failed to ingest your data" for any error raised during ingestion. Judging from your log, I believe our errors have different root causes.
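Because the same generic message covers unrelated failures, it can help to carry the underlying error along when rethrowing. The following is a minimal sketch of that idea, not the repository's actual code; `IngestError`, `toIngestError`, and `runIngest` are hypothetical names.

```typescript
// Hypothetical error type that keeps the original failure attached.
class IngestError extends Error {
  readonly rootCause: unknown;

  constructor(message: string, rootCause: unknown) {
    super(message);
    this.name = 'IngestError';
    this.rootCause = rootCause;
  }
}

// Turn any thrown value into a short, readable description.
function describeCause(error: unknown): string {
  return error instanceof Error
    ? `${error.name}: ${error.message}`
    : String(error);
}

// Wrap the original error so the generic message still names the cause.
function toIngestError(error: unknown): IngestError {
  return new IngestError(
    `Failed to ingest your data (${describeCause(error)})`,
    error,
  );
}

// Usage sketch: `step` stands in for the real ingestion work.
async function runIngest(step: () => Promise<void>): Promise<void> {
  try {
    await step();
  } catch (error) {
    throw toIngestError(error);
  }
}
```

With a wrapper like this, a missing Chroma collection name and an oversized Pinecone metadata payload would produce visibly different final messages.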

icebearlabs commented 8 months ago

For me the solution was to get a paid Pinecone account. With the free Starter plan, it just doesn't work.

dosubot[bot] commented 5 months ago

Hi, @Tilie,

I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog and am marking this issue as stale. It seems like you encountered an error message "Failed to ingest your data" while trying to use the local vector database chroma, and there was a discussion with another user, icebearlabs, who reported a similar issue. It was suggested that getting a paid Pinecone account resolved the error for icebearlabs.

Could you please confirm if this issue is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository? If it is, please let the gpt4-pdf-chatbot-langchain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you!