Closed: aganiezgoda closed this issue 4 months ago
To resolve the TypeError when trying to persist the vector part of your property graph using ChromaVectorStore, you need to ensure that the persistence path is correctly specified. Here is an example of how to do it:
import chromadb
from llama_index.core import PropertyGraphIndex
from llama_index.core.graph_stores import SimplePropertyGraphStore
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

# Initialize the Chroma client with local persistence
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("my_graph_vector_db")

# Set up the ChromaVectorStore
vector_store = ChromaVectorStore(chroma_collection=collection)

# Create the PropertyGraphIndex ("documents" is your loaded document list)
index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    graph_store=SimplePropertyGraphStore(),
    vector_store=vector_store,
    show_progress=True,
)

# Persist the storage context
index.storage_context.persist(persist_dir="./storage")
Ensure that a local path is provided when initializing the PersistentClient and that persist_dir is passed to persist(). Using PersistentClient instead of HttpClient avoids the TypeError caused by the port parameter being None [1][2][3].
To continue talking to Dosu, mention @dosu.
@aganiezgoda you have a typo
vector_store = ChromaVectorStore(chroma_collection=collection)
Although, I think ChromaVectorStore needs to be updated to handle duplicate IDs 🤔 (I got some error about that)
(chroma_collection=collection)
Do you mean (collection=collection) should be (chroma_collection=collection)?
I've tried both and neither works.
@aganiezgoda worked fine for me here https://colab.research.google.com/drive/1tLBvXNYbX_yK_6pJNwwPb5xpFi9Z3Nsk?usp=sharing
I'm receiving:
storage_context.vector_store = vector_store
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: property 'vector_store' of 'StorageContext' object has no setter
The only difference vs. what you do is that I'm using the Azure versions of the embedding and LLM models. So it's actually:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

llm = AzureOpenAI(
    engine="xxx",
    model="gpt-35-turbo-16k",
    temperature=0.0,
    azure_endpoint="https://xx.openai.azure.com/",
    api_key="xxxx",
    api_version="2023-07-01-preview",
)

embeddings = AzureOpenAIEmbedding(
    engine="xxx",
    model="text-embedding-ada-002",
    azure_endpoint="https://xxx.openai.azure.com/",
    api_key="xxx",
    api_version="2023-12-01-preview",
)

index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=embeddings,
    llm=llm,
    graph_store=SimplePropertyGraphStore(),
    vector_store=ChromaVectorStore(chroma_collection=collection),
    show_progress=True,
)
It works when I don't persist the graph and vector store, so that should make no difference.
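The AttributeError quoted earlier is simply what Python raises when you assign to a property that defines no setter. A minimal, self-contained illustration with a hypothetical class (a stand-in, not the real StorageContext):

```python
# Hypothetical stand-in for StorageContext: vector_store is a read-only property.
class StorageContextLike:
    def __init__(self, vector_store):
        self._vector_store = vector_store

    @property
    def vector_store(self):
        # No @vector_store.setter is defined, so assignment raises AttributeError.
        return self._vector_store


ctx = StorageContextLike(vector_store="chroma")
try:
    ctx.vector_store = "other"  # fails: property has no setter
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Which is why the working examples in this thread pass vector_store at construction time (e.g. to PropertyGraphIndex.from_documents) instead of assigning it to the storage context afterwards.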
Update your library; this error was fixed:
pip install -U llama-index-core
The update solves the issue.
Bug Description
I've been trying to persist the vector part of my property graph.
client = chromadb.HttpClient(host=host, port=port, ssl=ssl, headers=headers)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xxx.venv\Lib\site-packages\chromadb\__init__.py", line 178, in HttpClient
    port = int(port)
           ^^^^^^^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
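The final TypeError in that traceback is just int() being handed None for port. A plain-Python reproduction of that failure mode (no chromadb required):

```python
# chromadb.HttpClient calls int(port); when port is None, int() raises TypeError.
port = None
try:
    int(port)
except TypeError as exc:
    error_message = str(exc)

# The message names 'NoneType', matching the traceback above.
print(error_message)
```

This is why switching from HttpClient (which expects a host/port) to PersistentClient (which takes a local path) makes the error go away.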
Version
0.10.40
Steps to Reproduce
See above.
Relevant Logs/Tracebacks
No response