Closed: simonff closed this issue 11 months ago
Hello @simonff! I'm Dosu, a friendly bot here to assist you with your LlamaIndex issues, answer any questions, and even guide you in becoming a contributor. While we wait for a human maintainer to chime in, I'm here to help you get started.
The "Document not found: 404 Documents does not exist" error typically occurs when the document you're trying to index doesn't exist in the specified location or the document is not properly loaded into the system.
In the provided context, you are loading documents from the directory "./data/paul_graham/" using the SimpleDirectoryReader:
reader = SimpleDirectoryReader("./data/paul_graham/")
documents = reader.load_data()
Then, you are parsing these documents into nodes and adding them to the docstore:
nodes = SentenceSplitter().get_nodes_from_documents(documents)
storage_context.docstore.add_documents(nodes)
If you're getting a "Document not found: 404 Documents does not exist" error, it could be due to one of the following reasons:
The document doesn't exist in the "./data/paul_graham/" directory. You can verify this by checking the contents of the directory.
The document is not being properly loaded by the SimpleDirectoryReader. You can verify this by checking the contents of the 'documents' variable after calling reader.load_data().
The nodes are not being properly added to the docstore. You can verify this by checking the contents of the docstore after calling storage_context.docstore.add_documents(nodes).
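The three checks above can be sketched with the standard library alone. This is a minimal sketch, not part of LlamaIndex: `list_input_files` and `describe_items` are hypothetical helper names introduced here for illustration.

```python
from pathlib import Path
from typing import List, Sequence


def list_input_files(directory: str) -> List[Path]:
    """Return the regular files directly inside `directory`, sorted by path.

    Hypothetical helper: raises FileNotFoundError if the directory is missing,
    which covers the first check (does the input exist at all?).
    """
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Input directory does not exist: {root}")
    return sorted(p for p in root.iterdir() if p.is_file())


def describe_items(label: str, items: Sequence[object]) -> str:
    """Build a one-line summary of how many items a loading step produced.

    Hypothetical helper: works on any sequence, so it can be applied to the
    loaded documents and to the parsed nodes alike.
    """
    return f"{label}: {len(items)} item(s)"
```

For example, `list_input_files("./data/paul_graham/")` shows what the reader will see, and `describe_items("documents", documents)` or `describe_items("nodes", nodes)` shows whether loading and splitting produced anything. An empty result at any step points at that step rather than at the remote service.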
Please check these points and ensure that your documents are being correctly loaded and added to the docstore. You can refer to the DynamoDBDocstoreDemo.ipynb for more details on how to properly load and add documents to the docstore.
Also, please ensure that the documents you are trying to load are in a format supported by the SimpleDirectoryReader. The supported formats include HWP, PDF, DOCX, PPTX, PPT, PPTM, JPEG, PNG, JPG, MP3, MP4, CSV, EPUB, MD, MBOX, and IPYNB files. If a file's extension is not in the DEFAULT_FILE_READER_CLS dictionary, the SimpleDirectoryReader will attempt to read the file as a standard text file. You can refer to llama_index/readers/file/base.py for more details on the supported formats.
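To see which path a given file would take, the extension rule described above can be sketched as follows. This is only an illustration: `LISTED_EXTENSIONS` is copied by hand from the format list in this comment rather than imported from the library, `reader_kind` is a hypothetical helper, and lowercasing the suffix before the lookup is an assumption.

```python
from pathlib import Path

# Assumption: transcribed from the format list in the comment above. The real
# DEFAULT_FILE_READER_CLS dictionary may differ between library versions.
LISTED_EXTENSIONS = {
    ".hwp", ".pdf", ".docx", ".pptx", ".ppt", ".pptm",
    ".jpeg", ".png", ".jpg", ".mp3", ".mp4",
    ".csv", ".epub", ".md", ".mbox", ".ipynb",
}


def reader_kind(filename: str) -> str:
    """Classify a file as handled by a dedicated reader or by the text fallback.

    Hypothetical helper: the suffix is lowercased before the lookup, which is
    an assumption about how extensions are matched.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in LISTED_EXTENSIONS:
        return "dedicated reader"
    return "plain text fallback"
```

Under these assumptions a `.txt` essay falls into the plain text fallback, so an unsupported format is unlikely to explain a 404 for a plain text file.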
I hope this helps! If you have any other questions or need further clarification, feel free to ask.
-Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Eventually this worked - I think my service account was too new and not visible at first
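If the cause really is a newly created service account that has not become visible yet, one way to cope is to retry the failing call with increasing delays. The sketch below is generic and makes no claim about the Google API: `retry_with_backoff` is a hypothetical helper, and the attempt count and delays are arbitrary defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` calls have failed.

    The delay doubles after each failure. The last failure is re-raised
    unchanged so the original error stays visible.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
    # Only reached when the loop body never ran.
    raise ValueError("attempts must be at least 1")
```

The indexing call would be wrapped in a zero-argument function or lambda and passed as `operation`. In practice it is better to catch only the specific "not found" exception than every `Exception`.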
Bug Description
Hi,
I'm following https://github.com/run-llama/llama_index/blob/main/docs/examples/managed/GoogleDemo.ipynb
I configured service account auth, enabled the Google API, downloaded the Paul Graham doc and tried to index it, but got a "Document not found: 404 Documents does not exist" error.
The documents variable seems to be set correctly
Version
0.9.23
Steps to Reproduce
Run the example notebook
Relevant Logs/Tracebacks
No response