vecorro closed 8 months ago
Hey @vecorro, nice to see you diving into the depths of LlamaIndex again! What's brewing this time?
The ValueError: doc_id not found error you're encountering when using the AutoMergingRetriever in LlamaIndex version 0.10.20.post2 is likely not directly related to the code snippet you've provided for the VectorIndexAutoRetriever class. This error typically indicates an issue with accessing or retrieving a document by its ID from the underlying data store or index.
Given the context and the code snippet, here are a few potential reasons and areas to investigate:
Document ID Misalignment: Ensure that the document IDs referenced in your queries exist in the vector store index. It's possible that the document ID being queried does not match any document in the index, leading to this error.
Index Synchronization Issues: If documents were recently added or removed from the index, ensure that the index is properly synchronized and updated. There might be a lag or issue in the synchronization process, causing discrepancies between the expected and actual document IDs in the index.
Incorrect Query Specification: Review the logic in the _parse_generated_spec and _build_retriever_from_spec methods to ensure that the document ID is being correctly parsed and used in queries. Any issues in how the query spec is generated or interpreted could lead to attempts to access non-existent document IDs.
Error in VectorIndexRetriever or Underlying Components: While the provided code snippet focuses on the VectorIndexAutoRetriever, the actual retrieval and document ID resolution happen deeper in the call stack, possibly within the VectorIndexRetriever or even further down. Review the implementation of these components to ensure they correctly handle document IDs and queries.
To resolve the issue, you should:
Without more specific details on how the AutoMergingRetriever is implemented or how it interacts with the VectorIndexAutoRetriever and the underlying index, these are the most likely areas to investigate based on the provided context.
I hope this helps! If you have any more questions or need further clarification, feel free to ask.
The docstore is not used when using a vector DB integration, so the parent nodes are never stored anywhere the retriever can look them up. I suggest populating one manually:
docstore.add_documents(nodes)
docstore.add_documents(leaf_nodes)
and passing it in
AutoMergingRetriever(..., StorageContext.from_defaults(docstore=docstore))
There is a Postgres docstore as well
pip install llama-index-storage-docstore-postgres
from llama_index.storage.docstore.postgres import PostgresDocumentStore
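To make the docstore point above concrete, here is a toy sketch in plain Python of why the lookup fails and why populating a docstore fixes it. Nothing in it is LlamaIndex code: Node, ToyDocstore and merge_into_parents are hypothetical stand-ins invented for illustration, and the merge step is simplified (it always swaps a leaf for its parent, whereas the real retriever only merges under certain conditions). The assumption it illustrates is that the vector DB holds only the leaf nodes, while parents have to be fetched by ID from a docstore.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical node: leaves point at their parent by ID."""
    node_id: str
    text: str
    parent_id: Optional[str] = None


class ToyDocstore:
    """Hypothetical in-memory docstore (not the LlamaIndex class)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Node] = {}

    def add_documents(self, nodes: List[Node]) -> None:
        for node in nodes:
            self._nodes[node.node_id] = node

    def get_document(self, doc_id: str) -> Node:
        # Mirrors the shape of the error from the bug report.
        if doc_id not in self._nodes:
            raise ValueError(f"doc_id {doc_id} not found.")
        return self._nodes[doc_id]


def merge_into_parents(retrieved: List[Node], docstore: ToyDocstore) -> List[Node]:
    """Simplified merge: replace every retrieved leaf with its parent."""
    merged: Dict[str, Node] = {}
    for leaf in retrieved:
        if leaf.parent_id is None:
            merged[leaf.node_id] = leaf
            continue
        # The parent is not in the vector DB, so it must come from the docstore.
        parent = docstore.get_document(leaf.parent_id)
        merged[parent.node_id] = parent
    return list(merged.values())


parent = Node("parent-1", "full section text")
leaves = [
    Node("leaf-1", "first chunk", parent_id="parent-1"),
    Node("leaf-2", "second chunk", parent_id="parent-1"),
]

# Case 1: leaves were retrieved from the vector DB, but the docstore is empty.
empty_docstore = ToyDocstore()
try:
    merge_into_parents(leaves, empty_docstore)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Case 2: the docstore was populated manually with all nodes (parents and leaves).
docstore = ToyDocstore()
docstore.add_documents([parent] + leaves)
merged_nodes = merge_into_parents(leaves, docstore)
```

In the first case the parent lookup raises the same kind of ValueError as in the report; in the second, both leaves resolve to the single parent node. With the real library, the equivalent step is adding all hierarchical nodes to the docstore that is passed in through the StorageContext.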
Bug Description
I'm trying to implement an AutoMergingRetriever but when submitting a query I'm getting a
ValueError: doc_id 03ea05ed-3d9b-4edb-b4b8-43326224cf69 not found.
Version
0.10.20.post2
Steps to Reproduce
Relevant Logs/Tracebacks