chirantandatta closed this issue 2 months ago
To resolve the `ValueError: Query id node-0 not found in either retriever_dict or query_engine_dict` error when using embeddings a second time in ChromaDB, ensure that the query id is correctly mapped and available in the `retriever_dict` or `query_engine_dict`. This error typically occurs when the query id is not properly registered or has been removed.
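To make the failure mode concrete, here is a minimal sketch of an id lookup that checks two dictionaries in turn and raises when the id is in neither. It is a simplified stand-in written for illustration, not the library's actual code; `resolve_query_id` and the placeholder dictionary values are hypothetical.

```python
from typing import Any, Dict


def resolve_query_id(
    query_id: str,
    retriever_dict: Dict[str, Any],
    query_engine_dict: Dict[str, Any],
) -> Any:
    """Return the object registered for query_id, checking retrievers first."""
    if query_id in retriever_dict:
        return retriever_dict[query_id]
    if query_id in query_engine_dict:
        return query_engine_dict[query_id]
    raise ValueError(
        f"Query id {query_id} not found in either "
        "retriever_dict or query_engine_dict"
    )


# Hypothetical registrations: only "vector" is registered, so "node-0" fails.
retriever_dict: Dict[str, Any] = {"vector": "placeholder-retriever"}
query_engine_dict: Dict[str, Any] = {}

print(resolve_query_id("vector", retriever_dict, query_engine_dict))
try:
    resolve_query_id("node-0", retriever_dict, query_engine_dict)
except ValueError as err:
    print(err)
```

In this sketch the fix is simply to register an entry under the missing key; in a real setup, the equivalent is making sure every node id that can come back from a query has a matching entry in one of the two dictionaries.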
Here are some steps to troubleshoot and resolve the issue:

1. Ensure Query ID Mapping: Verify that the `query_id` (e.g., `node-0`) is correctly mapped in the `retriever_dict` or `query_engine_dict`.
2. Check Node IDs and Document Store: Ensure that the `node_ids` and `doc_ids` are correctly set and that the `docstore` contains the necessary nodes.
3. Initialization and Usage of ChromaDB: Check that the client, collection, and vector store are set up as in the ChromaDB example below.
Here is a relevant code snippet that handles the retrieval and node mapping:

```python
def _build_node_list_from_query_result(
    self, query_result: VectorStoreQueryResult
) -> List[NodeWithScore]:
    if query_result.nodes is None:
        if query_result.ids is None:
            raise ValueError(
                "Vector store query result should return at "
                "least one of nodes or ids."
            )
        assert isinstance(self._index.index_struct, IndexDict)
        node_ids = [
            self._index.index_struct.nodes_dict[idx] for idx in query_result.ids
        ]
        nodes = self._docstore.get_nodes(node_ids)
        query_result.nodes = nodes
    else:
        for i in range(len(query_result.nodes)):
            source_node = query_result.nodes[i].source_node
            if (not self._vector_store.stores_text) or (
                source_node is not None and source_node.node_type != ObjectType.TEXT
            ):
                node_id = query_result.nodes[i].node_id
                if self._docstore.document_exists(node_id):
                    query_result.nodes[i] = self._docstore.get_node(
                        node_id
                    )  # type: ignore[index]

    log_vector_store_query_result(query_result)

    node_with_scores: List[NodeWithScore] = []
    for ind, node in enumerate(query_result.nodes):
        score: Optional[float] = None
        if query_result.similarities is not None:
            score = query_result.similarities[ind]
        node_with_scores.append(NodeWithScore(node=node, score=score))

    return node_with_scores
```
And here is an example of initializing ChromaDB and querying it through `ChromaVectorStore`:

```python
from chromadb import HttpClient
from llama_index.legacy.vector_stores.chroma import ChromaVectorStore
from llama_index.legacy.vector_stores.types import VectorStoreQuery

# Initialize ChromaDB client and collection
client = HttpClient(host='localhost', port='8000', ssl=False)
collection = client.get_or_create_collection(name='my_collection')

# Initialize ChromaVectorStore
vector_store = ChromaVectorStore(chroma_collection=collection)

# Add nodes to the collection
nodes = [...]  # List of BaseNode instances with embeddings
vector_store.add(nodes)

# Query the collection
query = VectorStoreQuery(query_embedding=[...], similarity_top_k=10)
result = vector_store.query(query)
```
- Verify `query_result.ids`: Ensure that `query_result.ids` is not `None` and contains valid ids.
- Check `docstore`: Ensure that the `docstore` has the nodes corresponding to these ids.
- Check `index_struct.nodes_dict`: Ensure that `index_struct.nodes_dict` correctly maps the ids to node ids.

By following these steps, you should be able to resolve the error and ensure that the query ids are correctly registered and available in the `retriever_dict` or `query_engine_dict` [1][2].
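The three checks above can be sketched as one small helper. This is an illustration under stated assumptions rather than part of the library: `diagnose_ids` is a hypothetical function, and it works on plain Python containers standing in for `query_result.ids`, `index_struct.nodes_dict`, and the set of node ids held by the docstore.

```python
from typing import Dict, List, Optional, Set


def diagnose_ids(
    result_ids: Optional[List[str]],
    nodes_dict: Dict[str, str],
    docstore_node_ids: Set[str],
) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    # Check 1: the query result must carry at least one id.
    if not result_ids:
        return ["query result ids are None or empty"]

    problems: List[str] = []
    for idx in result_ids:
        node_id = nodes_dict.get(idx)
        if node_id is None:
            # Check 3: the index struct must map this id to a node id.
            problems.append(f"{idx}: missing from nodes_dict")
        elif node_id not in docstore_node_ids:
            # Check 2: the docstore must hold the mapped node.
            problems.append(f"{idx}: maps to {node_id}, which is not in the docstore")
    return problems


# Hypothetical data: "a" maps to a node the docstore lacks, "c" is unmapped.
problems = diagnose_ids(
    result_ids=["a", "b", "c"],
    nodes_dict={"a": "node-0", "b": "node-1"},
    docstore_node_ids={"node-1"},
)
for problem in problems:
    print(problem)
```

Running a check like this right after loading the existing collection, before the first query, shows which of the three places has lost track of an id such as `node-0`.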
Question Validation
Question
We created embeddings for some documents inside ChromaDB using DenseX, but the second time we use the embeddings we get an error like this: `ValueError: Query id node-0 not found in either retriever_dict or query_engine_dict`. base.py from llama_pack was downloaded manually and is used here...
self._proposition_llm = llm
embed_model = embed_model