Open jbtelice opened 3 weeks ago
I found a similar closed issue that might be relevant to your concern:
index.vector_store.get(node.node_id)
to get the associated embedding [1].

VectorStoreIndex does not preserve embeddings within each node, contrary to what the diagrams in the LlamaIndex documentation suggest. Instead, it creates vector embeddings of the text of every node, and those embeddings are stored separately in the vector store. This discrepancy affects querying because the embeddings used for semantic search live in the vector store, not in the nodes themselves [2].
To address the misleading documentation, the diagrams should be updated to reflect that embeddings are stored in the vector store and not within each node. This will help clarify the actual data management and querying process. Your proposed patch seems like a step in the right direction to ensure consistency with the documentation.
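For illustration, the storage split described above can be mimicked with a toy model. This is a minimal sketch in plain Python, not LlamaIndex code: `ToyNode`, `ToyVectorStore`, and `ToyIndex` are hypothetical stand-ins, and the only idea borrowed from the thread is looking up an embedding by node ID through the vector store.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a node: an ID, some text, an optional embedding."""
    node_id: str
    text: str
    embedding: Optional[List[float]] = None


class ToyVectorStore:
    """Hypothetical store that keeps only node_id -> embedding."""

    def __init__(self) -> None:
        self._embeddings: Dict[str, List[float]] = {}

    def add(self, node: ToyNode) -> None:
        if node.embedding is None:
            raise ValueError(f"node {node.node_id} has no embedding")
        self._embeddings[node.node_id] = list(node.embedding)

    def get(self, node_id: str) -> List[float]:
        return self._embeddings[node_id]


class ToyIndex:
    """Assumed behaviour: embeddings go to the vector store, and the
    docstore keeps a copy of each node with the embedding removed."""

    def __init__(self, nodes: List[ToyNode]) -> None:
        self.vector_store = ToyVectorStore()
        self.docstore: Dict[str, ToyNode] = {}
        for node in nodes:
            self.vector_store.add(node)
            self.docstore[node.node_id] = replace(node, embedding=None)


node = ToyNode(node_id="n1", text="hello", embedding=[0.1, 0.2, 0.3])
index = ToyIndex([node])

stored = index.docstore["n1"]             # node as kept by the index
recovered = index.vector_store.get("n1")  # embedding looked up separately
```

In this toy model, `stored.embedding` is `None` even though the input node carried an embedding, while the vector store still returns the original vector for the same ID.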
To continue talking to Dosu, mention @dosu.
Hi @dosu, this is the issue I've just mentioned :-(
This is a tough nut to crack, could really use your help on this, @logan-markewich.
@jbtelice story time
@logan-markewich, yep, that's what I thought.
Documentation Issue Description
There are misleading diagrams in the docs, in the VectorStoreIndex section:
In this diagram, embeddings are part of the node. But what happens under the hood when you create a VectorStoreIndex directly from nodes that each already carry a pre-computed embedding?
You would expect each node to preserve its embedding, right? That's not the case. Here is some proof:
This is what really happens under the hood:
WHY IS THIS MISLEADING?
Look at the next diagram:
When you query the index, it assumes the vector_store contains all the node information. As you can check in the example provided, that's not the case:
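As a rough stand-in (not the original example from the issue), the query path can be sketched in plain Python. It assumes the vector store only maps node IDs to embeddings and that the node text has to come from a separate docstore. All names and data below are hypothetical, and this is not LlamaIndex's implementation.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_ids(
    embeddings: Dict[str, List[float]], query: List[float], k: int
) -> List[Tuple[str, float]]:
    """Return the k (node_id, score) pairs most similar to the query."""
    scored = [(node_id, cosine(vec, query)) for node_id, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical vector store contents: IDs and embeddings only, no text.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
# Hypothetical docstore contents: the text lives here, keyed by the same IDs.
docstore_text = {"a": "about cats", "b": "about dogs", "c": "about pets"}

# Step 1: the vector store can only answer with IDs and scores.
hits = top_k_ids(embeddings, query=[1.0, 0.1], k=2)
# Step 2: the node content has to be fetched from the docstore.
results = [(docstore_text[node_id], score) for node_id, score in hits]
```

The point of the sketch is the two-step lookup: under these assumptions, the similarity search alone never sees the node text, so a second store is needed to turn IDs back into nodes.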
Related info:
Fix Proposal (In terms of consistency with the docs):
Patch
EDIT: The patch is provisional. To handle this scenario properly, it should probably avoid the duplication in some other way:
Documentation Link
https://docs.llamaindex.ai/en/stable/module_guides/indexing/index_guide/#vector-store-index