Open frontier-repository opened 2 weeks ago
Hello @frontier-repository! I'm here to assist you with any bugs, questions, or contributions you may have.
I found a similar closed issue that might be relevant to your problem: [Question]: Can not load my index created locally. The suggested solution in that thread was to ensure that when using a vector database, all data is persisted in the vector store alone, and there is no need to call persist or load separately. To connect to an existing index created in a vector store like Chroma, you can use the following code:
vector_store = <points to existing ChromaVectorStore>
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
For your specific issue, the ValueError you're encountering indicates that the node type could not be inferred from the persisted data. Here are the requirements for the load_index_from_storage function to successfully infer the node type from the persisted data:
Storage Context: Ensure the StorageContext is correctly initialized from the persisted directory.
storage_context = StorageContext.from_defaults(persist_dir=output_folder)
Object Node Mapping: If an object_node_mapping is not provided, the function attempts to load it from the persisted directory, assuming SimpleObjectNodeMapping for simplicity.
object_node_mapping = SimpleObjectNodeMapping.from_persist_dir(persist_dir=output_folder)
Single Index Constraint: The storage context should contain only a single index. If multiple indices are present, the function will raise a ValueError.
Service Context: Provide the service_context so that the loaded index can operate correctly.
new_index = load_index_from_storage(storage_context=storage_context, service_context=mock_service_context)
Persistence of Object Node Mapping: The object_node_mapping must be persisted correctly. If persistence fails, a warning is issued and the user must reconstruct the same object node mapping to rebuild the ObjectIndex.
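Before calling load_index_from_storage, the single-index constraint can be checked directly on disk. The sketch below is a plain-Python diagnostic, not part of LlamaIndex; it assumes the default SimpleIndexStore layout in which persist() writes an index_store.json file whose "index_store/data" mapping holds one entry per index (both the file name and that key are assumptions about the default persistence format).

```python
# Diagnostic sketch: count how many indices a persisted storage directory
# holds, so you can anticipate the single-index ValueError before loading.
import json
from pathlib import Path

def count_persisted_indices(persist_dir: str) -> int:
    """Count the index structs recorded in index_store.json (assumed layout)."""
    index_store_path = Path(persist_dir) / "index_store.json"
    data = json.loads(index_store_path.read_text())
    # Each key under "index_store/data" is the id of one persisted index.
    return len(data.get("index_store/data", {}))
```

If this reports more than one index, you can pass an explicit index_id to load_index_from_storage instead of relying on the single-index default.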
Given these requirements, please ensure that your persisted data meets these conditions. If the issue persists, check the structure of your node data in the persisted file. The expected structure of the node data for the PropertyGraph feature is defined by the props_schema parameter of the get_store function:
props_schema: str = "`file_path` STRING, `file_name` STRING, `file_type` STRING, `file_size` INT, `creation_date` STRING, `last_modified_date` STRING, `_node_content` STRING, `_node_type` STRING, `document_id` STRING, `doc_id` STRING, `ref_doc_id` STRING, `triplet_source_id` STRING"
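To compare that schema against the keys actually present in your persisted nodes, the string can be split into (property, type) pairs. The parser below is a plain-Python sketch, not a LlamaIndex API:

```python
# Sketch: parse the props_schema string into a {property: type} mapping so
# it can be diffed against the keys present in your persisted node data.
props_schema = (
    "`file_path` STRING, `file_name` STRING, `file_type` STRING, "
    "`file_size` INT, `creation_date` STRING, `last_modified_date` STRING, "
    "`_node_content` STRING, `_node_type` STRING, `document_id` STRING, "
    "`doc_id` STRING, `ref_doc_id` STRING, `triplet_source_id` STRING"
)

def parse_props_schema(schema: str) -> dict:
    """Split 'name TYPE' entries on commas and strip the backticks."""
    pairs = {}
    for entry in schema.split(","):
        name, prop_type = entry.strip().split()
        pairs[name.strip("`")] = prop_type
    return pairs
```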
Ensure that each node in your persisted data has the following properties:
file_path
file_name
file_type
file_size
creation_date
last_modified_date
_node_content
_node_type
document_id
doc_id
ref_doc_id
triplet_source_id
Additionally, the persisted file may include other metadata and relationships for each node, such as:
embedding
doc_hash
extra_info
node_info
relationships
__type__
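As an illustration, a node's properties in the persisted file might look like the dictionary below; every value is hypothetical, and the check simply confirms that all required property names from the schema above are present:

```python
# Illustrative sketch (hypothetical values throughout): verify a node's
# property keys against the required names listed above.
REQUIRED_PROPS = {
    "file_path", "file_name", "file_type", "file_size",
    "creation_date", "last_modified_date", "_node_content", "_node_type",
    "document_id", "doc_id", "ref_doc_id", "triplet_source_id",
}

sample_node_props = {
    "file_path": "/data/report.pdf",   # hypothetical example values
    "file_name": "report.pdf",
    "file_type": "application/pdf",
    "file_size": 102400,
    "creation_date": "2024-09-01",
    "last_modified_date": "2024-09-02",
    "_node_content": "{}",
    "_node_type": "TextNode",
    "document_id": "doc-1",
    "doc_id": "doc-1",
    "ref_doc_id": "doc-1",
    "triplet_source_id": "doc-1",
}

# Any names left in `missing` would explain a failure to infer the node type.
missing = REQUIRED_PROPS - sample_node_props.keys()
```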
By ensuring these conditions and structures, you should be able to load the persisted PropertyGraph data without encountering the ValueError [1][2][3][4].
Bug Description
Description:
I encountered an issue while trying the PropertyGraph feature in LlamaIndex with Azure OpenAI gpt-4o mini. It failed to load persisted data using the load_index_from_storage function after successfully persisting it with the persist function in StorageContext.
Expected Behavior:
The persisted PropertyGraph data should be loaded without any errors.
Environment:
Example Code:
Additional Context:
The issue appears to be related to the structure of the node data in the persisted file. Any insights or potential fixes would be greatly appreciated.
Version
0.11.4
Steps to Reproduce
1. Persist data using the persist function in StorageContext.
2. Attempt to load the persisted data using the load_index_from_storage function with the following code:
3. Observe the error.
Relevant Logs/Tracebacks