Open nuocheng opened 1 month ago
To resolve the issue of the generated knowledge graph not having edge relationships, ensure that the _prune_invalid_triplets method is correctly validating and adding the triplets. Here is a snippet of the relevant code that handles this:
def _prune_invalid_triplets(self, kg_schema: Any) -> List[Triplet]:
    """Prune invalid triplets."""
    assert isinstance(kg_schema, self.kg_schema_cls)

    valid_triplets = []
    for triplet in kg_schema.triplets:
        subject = triplet.subject.name
        subject_type = triplet.subject.type

        relation = triplet.relation.type

        obj = triplet.object.name
        obj_type = triplet.object.type

        # Check if the triplet is valid based on the schema format
        if (
            isinstance(self.kg_validation_schema, dict)
            and "relationships" in self.kg_validation_schema
        ):
            # Schema is a dictionary with a 'relationships' key and triples as values
            if (subject_type, relation, obj_type) not in self.kg_validation_schema[
                "relationships"
            ]:
                continue
        else:
            # Schema is the backwards-compat format
            if relation not in self.kg_validation_schema.get(
                subject_type, [relation]
            ) and relation not in self.kg_validation_schema.get(
                obj_type, [relation]
            ):
                continue

        # Remove self-references
        if subject.lower() == obj.lower():
            continue

        subj_node = EntityNode(label=subject_type, name=subject)
        obj_node = EntityNode(label=obj_type, name=obj)
        rel_node = Relation(
            label=relation, source_id=subj_node.id, target_id=obj_node.id
        )
        valid_triplets.append((subj_node, rel_node, obj_node))

    return valid_triplets
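To see how this filtering can leave a graph with nodes but no edges, the same checks can be reproduced on plain tuples. This is only a sketch of the logic in the snippet above, not the library code: `is_allowed`, `prune`, and the flat five-field tuple layout are hypothetical names introduced for illustration, and only dictionary schemas are handled.

```python
from typing import Dict, List, Tuple

# (subject, subject_type, relation, object, object_type), all plain strings
FlatTriplet = Tuple[str, str, str, str, str]
Schema = Dict[str, list]


def is_allowed(schema: Schema, subject_type: str, relation: str, obj_type: str) -> bool:
    """Mirror the two schema branches of the pruning snippet."""
    if "relationships" in schema:
        # Triple format: the exact (type, relation, type) triple must be listed.
        return (subject_type, relation, obj_type) in schema["relationships"]
    # Backwards-compatible format: rejected only if both endpoints disallow it.
    return relation in schema.get(subject_type, [relation]) or relation in schema.get(
        obj_type, [relation]
    )


def prune(triplets: List[FlatTriplet], schema: Schema) -> List[FlatTriplet]:
    kept = []
    for subject, subject_type, relation, obj, obj_type in triplets:
        if not is_allowed(schema, subject_type, relation, obj_type):
            continue
        if subject.lower() == obj.lower():  # drop self-references
            continue
        kept.append((subject, subject_type, relation, obj, obj_type))
    return kept


schema = {"relationships": [("PERSON", "WORKED_AT", "ORGANIZATION")]}
extracted = [
    # Matches the schema exactly, so it is kept.
    ("Paul Graham", "PERSON", "WORKED_AT", "Viaweb", "ORGANIZATION"),
    # Same meaning, but different casing of the labels: pruned.
    ("Paul Graham", "Person", "worked_at", "Y Combinator", "Organization"),
    # Passes the schema check but is a self-reference: pruned.
    ("Viaweb", "PERSON", "WORKED_AT", "viaweb", "ORGANIZATION"),
]
kept = prune(extracted, schema)
print(len(extracted), "extracted,", len(kept), "kept")
```

If a model returns labels that do not match the schema character for character, every triplet is dropped this way, which would be consistent with a graph that shows nodes and zero relationships.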
Additionally, ensure that the SchemaLLMPathExtractor is being used to extract edges from node relationships. Here is the relevant code snippet:
from typing import Literal

from llama_index.llms.ollama import Ollama
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

validation_schema = [
    ("ORGANIZATION", "HAS", "PERSON"),
    ("PERSON", "WORKED_AT", "ORGANIZATION"),
    ("PERSON", "WORKED_WITH", "PERSON"),
    ("PERSON", "WORKED_ON", "ORGANIZATION"),
    ("PERSON", "PART_OF", "ORGANIZATION"),
    ("ORGANIZATION", "PART_OF", "ORGANIZATION"),
    ("PERSON", "WORKED_AT", "PLACE"),
]

kg_extractor = SchemaLLMPathExtractor(
    llm=Ollama(model="llama3", json_mode=True, request_timeout=3600),
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,
)
Ensure that:
The kg_validation_schema is correctly defined and includes the relationships you expect.
The _prune_invalid_triplets method is correctly filtering and adding valid triplets to the knowledge graph.
The SchemaLLMPathExtractor is being used to extract and add edges from node relationships.
If the issue persists, you may need to debug the extraction process to ensure that the entities and relations are being correctly identified and validated [1][2][3].
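One way to check the first point is to confirm that every triple in the validation schema only uses entity and relation labels that were actually declared. The helper below is a sketch written for this thread (`find_schema_problems` is a hypothetical name, not a LlamaIndex function); it relies only on `typing.get_args` to read the values out of the `Literal` types.

```python
from typing import List, Literal, Tuple, get_args

entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

validation_schema = [
    ("ORGANIZATION", "HAS", "PERSON"),
    ("PERSON", "WORKED_AT", "ORGANIZATION"),
    ("PERSON", "WORKED_WITH", "PERSON"),
    ("PERSON", "WORKED_ON", "ORGANIZATION"),
    ("PERSON", "PART_OF", "ORGANIZATION"),
    ("ORGANIZATION", "PART_OF", "ORGANIZATION"),
    ("PERSON", "WORKED_AT", "PLACE"),
]


def find_schema_problems(
    entity_types, relation_types, triples: List[Tuple[str, str, str]]
) -> List[str]:
    """Return one message per label that is used in a triple but never declared."""
    known_entities = set(get_args(entity_types))
    known_relations = set(get_args(relation_types))
    problems = []
    for triple in triples:
        subject_type, relation, obj_type = triple
        if subject_type not in known_entities:
            problems.append(f"unknown subject type {subject_type!r} in {triple}")
        if relation not in known_relations:
            problems.append(f"unknown relation {relation!r} in {triple}")
        if obj_type not in known_entities:
            problems.append(f"unknown object type {obj_type!r} in {triple}")
    return problems


problems = find_schema_problems(entities, relations, validation_schema)
print(problems or "schema only uses declared labels")
```

An empty result only shows that the schema is internally consistent; it does not show that the model's output matches it.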
To continue talking to Dosu, mention @dosu.
@dosu That's not the problem. I started an OpenAI-compatible interface using FastChat, and the graph was not built according to the entity and edge relationships I provided. Can the LLM be loaded through OpenAILike to use a local model?
Yes, LLM can use the OpenAILike method to load the local model. The LocalAI class is designed to be a self-hosted alternative to OpenAI and can be used with OpenAILike by providing the necessary parameters such as api_key and api_base.
Here is an example of how you might initialize the LocalAI class using OpenAILike:
from llama_index.legacy.llms.openai_like import OpenAILike
from llama_index.legacy.llms.localai import LocalAI

# Initialize LocalAI with default parameters
local_ai = LocalAI()

# Use OpenAILike to load the local model
openai_like_model = OpenAILike(
    api_key=local_ai.api_key,
    api_base=local_ai.api_base,
    # Add other necessary parameters here
)
To ensure that the knowledge graph is built according to the entity and edge relationships you provided, you need to use the appropriate methods and data structures within the LlamaIndex framework to define and construct the knowledge graph. Specifically, you should ensure that the SchemaLLMPathExtractor is correctly configured with your kg_validation_schema and that the relationships are being extracted and validated properly.
For example:
from typing import Literal

from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

validation_schema = {
    "PERSON": ["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"],
    "PLACE": ["HAS", "PART_OF", "WORKED_AT"],
    "ORGANIZATION": ["HAS", "PART_OF", "WORKED_WITH"],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=openai_like_model,
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,
)
Ensure that the kg_validation_schema is correctly defined and includes the relationships you expect. This will help in building the knowledge graph according to the entity and edge relationships you provided [1].
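Because the extractor depends on the model returning structured output, it can also help to check a raw response from the local endpoint before looking at the schema. The sketch below assumes a hypothetical JSON payload whose field names follow the attributes read in _prune_invalid_triplets (`subject.name`, `subject.type`, `relation.type`, `object.name`, `object.type`); the exact wire format used by the library may differ, and `parse_triplets` is an illustrative helper, not a LlamaIndex API.

```python
import json
from typing import List, Tuple

FlatTriplet = Tuple[str, str, str, str, str]


def parse_triplets(raw: str) -> Tuple[List[FlatTriplet], List[str]]:
    """Parse a raw model response into flat triplets plus a list of error messages."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [], [f"not valid JSON: {exc.msg}"]

    items = payload.get("triplets") if isinstance(payload, dict) else None
    if not isinstance(items, list):
        return [], ["missing 'triplets' list"]

    parsed, errors = [], []
    for i, item in enumerate(items):
        try:
            parsed.append(
                (
                    item["subject"]["name"],
                    item["subject"]["type"],
                    item["relation"]["type"],
                    item["object"]["name"],
                    item["object"]["type"],
                )
            )
        except (KeyError, TypeError):
            errors.append(f"triplet {i} is missing a required field")
    return parsed, errors


good = json.dumps(
    {
        "triplets": [
            {
                "subject": {"name": "Paul Graham", "type": "PERSON"},
                "relation": {"type": "WORKED_AT"},
                "object": {"name": "Viaweb", "type": "ORGANIZATION"},
            }
        ]
    }
)
print(parse_triplets(good))
print(parse_triplets("Sure! Here are the triplets you asked for:"))
```

A response that fails at the first step (free text instead of JSON) points at the model or its serving configuration rather than at the validation schema.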
@dosu The result of the constructed knowledge graph
Displaying 22 nodes, 0 relationships
To address the issue of the constructed knowledge graph displaying nodes but no relationships, ensure the following:
Validation Schema: Verify that your kg_validation_schema is correctly defined and includes the relationships you expect. The schema should match the entities and relationships extracted from the text.
Triplet Extraction: Ensure that the SchemaLLMPathExtractor is correctly extracting triplets and that the _prune_invalid_triplets method is validating and adding these triplets properly.
Logging and Debugging: Use logging and debugging tools to trace the relationship extraction process. Enable debug logging and use the LlamaDebugHandler to track events and identify where the relationships might be failing to be detected.
Here is an example of how to enable logging and use the LlamaDebugHandler:
import logging
import sys

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

# Enable debug logging (basicConfig already attaches a stdout handler,
# so adding a second StreamHandler would print every message twice)
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

# Set up the debug handler through the global callback manager.
# set_global_handler() expects a handler name string, not a handler instance.
debug_handler = LlamaDebugHandler()
Settings.callback_manager = CallbackManager([debug_handler])
Use the fb function to create edge objects with attributes and add them to the knowledge graph. Example of defining and adding edges:
// Assumption: lb is the minified library's object-merge helper, which is not
// shown in the thread. Object.assign stands in for it so the example runs.
var lb = Object.assign;

function fb(t, e, i, n, o) {
  var r = { from: e, to: i, type: n };
  return (
    t.edge && (r.attr = lb({}, t.edge)),
    (r.attr = lb(r.attr || {}, o)),
    null != o &&
      o.hasOwnProperty("arrows") &&
      null != o.arrows &&
      ((r.arrows = { to: { enabled: !0, type: o.arrows.type } }),
      (o.arrows = null)),
    r
  );
}

// Example usage:
var graph = {
  nodes: [],
  edges: []
};

// Define nodes
graph.nodes.push({ id: 1, label: 'Node 1' });
graph.nodes.push({ id: 2, label: 'Node 2' });

// Define edge attributes
var edgeAttributes = {
  arrows: { type: 'arrow' },
  color: 'blue'
};

// Add edge using the fb function
var edge = fb(graph, 1, 2, 'relationshipType', edgeAttributes);
graph.edges.push(edge);

console.log(graph);
Ensure that the SchemaLLMPathExtractor is being used to extract and add edges from node relationships, and that the relationships are being correctly identified and validated [1][2][3].
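To narrow down whether relationships are lost during extraction or later when the graph is stored, the extractor's output can be summarized before it reaches the graph store. The sketch below works on plain metadata dictionaries; it assumes the extracted entities and relations are stored under the keys "nodes" and "relations", which should be checked against the installed LlamaIndex version, and `summarize_extraction` is a hypothetical helper.

```python
from typing import Any, Dict, List


def summarize_extraction(
    chunk_metadata: List[Dict[str, Any]],
    nodes_key: str = "nodes",          # assumed metadata key for extracted entities
    relations_key: str = "relations",  # assumed metadata key for extracted relations
) -> Dict[str, int]:
    """Count entities and relations across the metadata of all processed chunks."""
    summary = {"chunks": 0, "entities": 0, "relations": 0, "chunks_without_relations": 0}
    for metadata in chunk_metadata:
        n_relations = len(metadata.get(relations_key, []))
        summary["chunks"] += 1
        summary["entities"] += len(metadata.get(nodes_key, []))
        summary["relations"] += n_relations
        if n_relations == 0:
            summary["chunks_without_relations"] += 1
    return summary


sample = [
    {"nodes": ["Paul Graham", "Viaweb"], "relations": [("Paul Graham", "WORKED_AT", "Viaweb")]},
    {"nodes": ["Y Combinator"], "relations": []},
    {},  # a chunk for which nothing was extracted
]
report = summarize_extraction(sample)
print(report)
```

With a real extractor, the input would be something like `[node.metadata for node in extracted_nodes]`. If the relation count is already zero at this point, the problem lies in the model output or the pruning step rather than in Neo4j.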
@dosu Can the Ollama class use local models, and how should it be used?
kg_extractor = SchemaLLMPathExtractor(
    llm=Ollama(model="llama3", json_mode=True, request_timeout=3600),
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,
)
Yes, the Ollama method can use local models. To configure it properly in your knowledge graph construction, follow these steps:
Install Ollama: Use the one-click installer for your operating system from the Ollama home page.
Download the Llama3 model: Use the following command:
ollama pull llama3
Switch to local agent: Install the Ollama integration with:
pip install llama-index-llms-ollama
Modify your dependencies: Import Ollama instead of OpenAI:
from llama_index.llms.ollama import Ollama
Initialize the model: Set up Llama3 as your LLM:
llm = Ollama(model="llama3", request_timeout=360.0)
Construct your knowledge graph: Use the SchemaLLMPathExtractor to construct your graph with predefined schemas:
from typing import Literal

from llama_index.llms.ollama import Ollama
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

validation_schema = [
    ("ORGANIZATION", "HAS", "PERSON"),
    ("PERSON", "WORKED_AT", "ORGANIZATION"),
    ("PERSON", "WORKED_WITH", "PERSON"),
    ("PERSON", "WORKED_ON", "ORGANIZATION"),
    ("PERSON", "PART_OF", "ORGANIZATION"),
    ("ORGANIZATION", "PART_OF", "ORGANIZATION"),
    ("PERSON", "WORKED_AT", "PLACE"),
]

kg_extractor = SchemaLLMPathExtractor(
    llm=Ollama(model="llama3", json_mode=True, request_timeout=3600),
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,
)
Load your data: For example, download and load sample data:
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
Create the property graph: Use the PropertyGraphIndex to create your graph:
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5"),
    show_progress=True,
)
By following these steps, you can configure and use local models with the Ollama method in your knowledge graph construction [1][2][3][4][5].
@dosu I have encountered a problem
Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.FeatureDeprecationWarning} {category: DEPRECATION} {title: This feature is deprecated and will be removed in future versions.} {description: The procedure has a deprecated field. ('config' used by 'apoc.meta.graphSample' is deprecated.)} {position: line: 1, column: 1, offset: 0} for query: "CALL apoc.meta.graphSample() YIELD nodes, relationships RETURN nodes, [rel in relationships | {name:apoc.any.property(rel, 'type'), count: apoc.any.property(rel, 'count')}] AS relationships"
To resolve the deprecation warning related to the 'config' field used by apoc.meta.graphSample in your Neo4j and LlamaIndex setup, you should update the query to remove the deprecated 'config' field. Here is the updated code snippet:
Replace:
schema_counts = self.structured_query(
    "CALL apoc.meta.graphSample() YIELD nodes, relationships "
    "RETURN nodes, [rel in relationships | {name:apoc.any.property"
    "(rel, 'type'), count: apoc.any.property(rel, 'count')}]"
    " AS relationships"
)
With:
schema_counts = self.structured_query(
    "CALL apoc.meta.graphSample() YIELD nodes, relationships "
    "RETURN nodes, [rel in relationships | {name: type(rel), count: rel.count}]"
    " AS relationships"
)
This change ensures compatibility with future versions of Neo4j by removing the deprecated 'config' field [1].
Hi @nuocheng, were you able to resolve the issue? I am also getting 22 nodes and no edges as you mentioned above. Thanks!
@KAI10 No solution has been found so far. During testing I noticed that GPU computing resources were not being used when running SchemaLLMPathExtractor. I am also looking forward to someone solving this problem.
@dosu Does SchemaLLMPathExtractor support OpenAI, and how should it be used?
Yes, the SchemaLLMPathExtractor method supports OpenAI and can be used to build a knowledge graph with both nodes and relationships. Here are the steps to ensure it is used correctly:
Initialize the SchemaLLMPathExtractor with the appropriate parameters:
llm: This should be an instance of the OpenAI language model.
extract_prompt: Use the default or a custom prompt for extraction.
kg_schema_cls: The knowledge graph schema class.
kg_validation_schema: The validation schema for the knowledge graph.
num_workers, max_triplets_per_chunk, and strict: Other optional parameters.
Call the SchemaLLMPathExtractor with a list of nodes:
The __call__ method will extract triplets from the nodes and return a list of nodes with the extracted triplets.
Here is an example code snippet demonstrating how to use the SchemaLLMPathExtractor:
from typing import Literal

from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.core.schema import TextNode
from llama_index.llms.openai import OpenAI

# Initialize the OpenAI language model
llm = OpenAI(api_key="your_openai_api_key")

# Entity and relation labels; the KG schema class is built from these
# when kg_schema_cls is not passed explicitly
entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]
validation_schema = {
    "PERSON": ["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"],
    "PLACE": ["HAS", "PART_OF", "WORKED_AT"],
    "ORGANIZATION": ["HAS", "PART_OF", "WORKED_WITH"],
}

# Initialize the SchemaLLMPathExtractor (extract_prompt is left at its default)
extractor = SchemaLLMPathExtractor(
    llm=llm,
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    num_workers=4,
    max_triplets_per_chunk=10,
    strict=True,
)

# List of nodes to extract triplets from
nodes = [TextNode(text="Your node text here")]

# Extract triplets; the returned nodes carry the extracted entities and relations
extracted_nodes = extractor(nodes, show_progress=True)
This shows SchemaLLMPathExtractor used with an OpenAI LLM to build a knowledge graph with both nodes and relationships [1].
@KAI10 With the OpenAI API, nodes and edge relationships are built normally, but with the locally launched model the relationships are not built.
Question Validation
Question
https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
import nest_asyncio
nest_asyncio.apply()

from typing import Literal
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.llms.openai_like import OpenAILike

# best practice to use upper-case
entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

# define which entities can have which relations
validation_schema = {
    "PERSON": ["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"],
    "PLACE": ["HAS", "PART_OF", "WORKED_AT"],
    "ORGANIZATION": ["HAS", "PART_OF", "WORKED_WITH"],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=OpenAILike(
        model="Qwen2-72B-Instruct",
        api_key="fake",
        api_base="http://ip:18090/v1",
        is_chat_model=True,
    ),
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,
)

from llama_index.graph_stores.neo4j import Neo4jPGStore
from llama_index.legacy.graph_stores.neo4j import Neo4jGraphStore

graph_store = Neo4jPGStore(
    username="neo4j",
    password="password",
    url="neo4j://ip:7687",
)

from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embedding_path = "~/embedding_model/multilingual-e5-large-instruct"
index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    embed_model=HuggingFaceEmbedding(model_name=embedding_path),
    property_graph_store=graph_store,
    show_progress=True,
)