run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
34.76k stars 4.9k forks

[Documentation]: Example shown in the Documentation throwing error. (FaissVectorStore) #13219

Closed AbiVignesh closed 2 weeks ago

AbiVignesh commented 3 months ago

Documentation Issue Description

The demo example at https://docs.llamaindex.ai/en/stable/examples/vector_stores/FaissIndexDemo/ for the Faiss index throws an error when adding documents to the VectorStoreIndex.

My code, as copied from the documentation:

documents = SimpleDirectoryReader("./random_internet_data/paul_graham/").load_data()
d = 1536
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Error:

File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/faiss/class_wrappers.py:228, in handle_Index.<locals>.replacement_add(self, x)
    215 """Adds vectors to the index.
    216 The index must be trained before vectors can be added to it.
    217 The vectors are implicitly numbered in sequence. When n vectors are
    (...)
    224 `dtype` must be float32.
    225 """
    227 n, d = x.shape
--> 228 assert d == self.d
    229 x = np.ascontiguousarray(x, dtype='float32')
    230 self.add_c(n, swig_ptr(x))

AssertionError

Am I missing anything? Kindly help.

Documentation Link

https://docs.llamaindex.ai/en/stable/examples/vector_stores/FaissIndexDemo/

dosubot[bot] commented 3 months ago

The error you're encountering is due to a mismatch in the dimensionality of the embeddings being added to the Faiss index (d) and the dimensionality expected by the Faiss index itself. The Faiss index you've initialized expects embeddings of dimension 1536, as specified in your code:

d = 1536  # dimension
faiss_index = faiss.IndexFlatL2(d)

The AssertionError occurs because the dimensionality (d) of the vectors you're trying to add does not match the dimensionality the Faiss index was initialized with (self.d). Faiss requires that all vectors added to an index have the same dimensionality as the index.

To fix this issue, ensure that the dimensionality of the embeddings you're adding matches the Faiss index's expected dimensionality. If the embeddings are generated from a model, verify that the model's output dimension matches what the Faiss index expects. You might need to implement a check for the embeddings' dimensionality before adding them to the index, as there doesn't seem to be a built-in functionality in LlamaIndex for this purpose.
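One way to implement that check (a sketch only; `check_embedding_dim` is a hypothetical helper, not a LlamaIndex API) is to probe a single embedding before building the index and fail fast with a clear message instead of Faiss's bare `AssertionError`:

```python
def check_embedding_dim(embedding, index_d):
    """Raise a readable error if the embedding size does not match the index."""
    if len(embedding) != index_d:
        raise ValueError(
            f"Embedding model produced a {len(embedding)}-dimensional vector, "
            f"but the Faiss index was created with d={index_d}."
        )

# Usage sketch: probe the configured embedding model once before indexing, e.g.
#   probe = Settings.embed_model.get_text_embedding("dimension probe")
#   check_embedding_dim(probe, faiss_index.d)
check_embedding_dim([0.0] * 384, 384)     # matching dimensions: no error
# check_embedding_dim([0.0] * 384, 1536)  # would raise ValueError
```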

For further details, you can refer to the source code of the FaissVectorStore class in the LlamaIndex repository.

#### Details

**Sources**

- [llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/llama_index/vector_stores/faiss/base.py](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/llama_index/vector_stores/faiss/base.py)
- [docs/docs/module_guides/loading/simpledirectoryreader.md](https://github.com/run-llama/llama_index/blob/main/docs/docs/module_guides/loading/simpledirectoryreader.md)
- [docs/docs/examples/data_connectors/simple_directory_reader.ipynb](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/data_connectors/simple_directory_reader.ipynb)
- [llama-index-core/llama_index/core/storage/storage_context.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/storage/storage_context.py)
- [llama-index-legacy/llama_index/legacy/storage/storage_context.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/storage/storage_context.py)
- [llama-index-legacy/llama_index/legacy/readers/file/base.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/readers/file/base.py)
- [docs/docs/module_guides/loading/documents_and_nodes/usage_documents.md](https://github.com/run-llama/llama_index/blob/main/docs/docs/module_guides/loading/documents_and_nodes/usage_documents.md)
- [llama-index-legacy/tests/indices/vector_store/test_faiss.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/tests/indices/vector_store/test_faiss.py)
- [llama-index-core/llama_index/core/indices/vector_store/base.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/indices/vector_store/base.py)
**About Dosu** This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a šŸ‘ if this is helpful and šŸ‘Ž if it is irrelevant.

To continue the conversation, mention @dosu.

AbiVignesh commented 3 months ago

Thanks. I copied the code and tried it locally. It looks like 384 is the dimension of my document embeddings.