Open 912100012 opened 1 week ago
To get nodes filtered by labels in a Neo4j graph store, you should use a Cypher query that matches nodes with the desired label. The graph_store.get method does not support filtering by labels directly. Instead, you can use the structured_query method to execute a Cypher query.
Here's how you can do it:
query = "MATCH (n:PERSON) RETURN n"
result = graph_store.structured_query(query)
This query matches all nodes with the label "PERSON" and returns them [1].
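As a minimal sketch of the approach above, the label can be validated before it is interpolated into the query string, since Cypher parameters are generally used for values rather than labels. The helper name build_label_query and its limit argument are hypothetical and not part of LlamaIndex; only structured_query comes from the answer above.

```python
import re
from typing import Optional

# Hypothetical helper (not a LlamaIndex API): builds a label-filtered Cypher query.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_label_query(label: str, limit: Optional[int] = None) -> str:
    """Return a Cypher query that matches every node carrying `label`."""
    # Reject anything that is not a plain identifier, because the label
    # is interpolated into the query text rather than passed as a parameter.
    if not _LABEL_RE.match(label):
        raise ValueError(f"unsafe label: {label!r}")
    query = f"MATCH (n:`{label}`) RETURN n"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


# Assumed usage with an existing graph store instance:
# result = graph_store.structured_query(build_label_query("PERSON"))
```

Because a node with labels such as PERSON, Entity, and Node matches on any one of them, the same helper can also be used with "Entity" or "Node".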
To continue talking to Dosu, mention @dosu.
I love you, thank you.
In the DynamicLLMPathExtractor class, I do not understand the allowed_relation_props property. What does it mean? @dosu
The allowed_relation_props property in the DynamicLLMPathExtractor class specifies which properties of relationships are permitted to be used during the path extraction process. This helps in filtering and focusing on relevant relationship properties when traversing the graph.
This property is an optional list that can either be a list of property names or a list of tuples containing the property name and its description:
allowed_relation_props: Optional[Union[List[str], List[Tuple[str, str]]]]
If provided, it helps in structuring the extracted information more accurately according to the predefined ontology. Setting allowed_relation_props to None lets the LLM generate any relation properties on the fly, which can be faster but less structured; an empty list skips relation property generation, matching the behavior of allowed_entity_props [1][2][3].
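To make the two accepted shapes of this argument concrete, here is a small sketch that normalizes either form into (name, description) pairs. The alias PropSpec and the function normalize_props are hypothetical names used for illustration; they are not taken from the DynamicLLMPathExtractor source.

```python
from typing import List, Optional, Tuple, Union

# Mirrors the annotation quoted above; the alias name itself is an assumption.
PropSpec = Optional[Union[List[str], List[Tuple[str, str]]]]


def normalize_props(props: PropSpec) -> Optional[List[Tuple[str, str]]]:
    """Turn either accepted form into a list of (name, description) pairs."""
    if props is None:
        # None is passed through unchanged so the caller can tell it
        # apart from an empty list.
        return None
    normalized: List[Tuple[str, str]] = []
    for item in props:
        if isinstance(item, str):
            # Bare property name: no description available.
            normalized.append((item, ""))
        else:
            name, description = item
            normalized.append((name, description))
    return normalized


names_only = normalize_props(["description"])
with_descriptions = normalize_props([("since", "Year the relation started")])
```

Both calls produce the same pair structure, which is why either form can be passed for allowed_relation_props.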
What is the difference between allowed_entity_types and allowed_entity_props in the DynamicLLMPathExtractor constructor? @dosu
The difference between allowed_entity_types and allowed_entity_props in the DynamicLLMPathExtractor constructor is as follows:
allowed_entity_types: This is a list of initial entity types for the ontology. It specifies the types of entities that the extractor should recognize and extract from the text. If set to None, the LLM will infer the entity types on the fly.
allowed_entity_props: This is an optional list of initial entity properties for the ontology. It can be either a list of property names or a list of tuples containing property names and their descriptions. This specifies the properties that the entities can have. If set to None, the LLM will generate any entity properties as needed; if an empty list is provided, property generation is skipped [1][2].
Example usage:
kg_extractor = DynamicLLMPathExtractor(
    llm=llm,
    max_triplets_per_chunk=20,
    num_workers=4,
    allowed_entity_types=["POLITICIAN", "POLITICAL_PARTY"],
    allowed_relation_types=["PRESIDENT_OF", "MEMBER_OF"],
    allowed_relation_props=["description"],
    allowed_entity_props=["description"],
)
Please give me a prompt that extracts only triplets containing a single entity of type "person" and filters out triplets that contain two or more persons. Please base it on the following prompt:
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. " "Each triplet should be in the form of (head, relation, tail) with their respective types and properties.\n" "---------------------\n" "INITIAL ONTOLOGY:\n" "Entity Types: {allowed_entity_types}\n" "Entity Properties: {allowed_entity_properties}\n" "Relation Types: {allowed_relation_types}\n" "Relation Properties: {allowed_relation_properties}\n" "\n" "Use these types as a starting point, but introduce new types if necessary based on the context.\n" "\n" "GUIDELINES:\n" "- Output in JSON format: [{{'head': '', 'head_type': '', 'head_props': {{...}}, 'relation': '', 'relation_props': {{...}}, 'tail': '', 'tail_type': '', 'tail_props': {{...}}}}]\n" "- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n" "- Keep entities concise (3-5 words max)\n" "- Break down complex phrases into multiple triplets\n" "- Ensure the knowledge graph is coherent and easily understandable\n" "---------------------\n" "EXAMPLE:\n" "Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. 
" "UC Berkeley researchers studied the benefits of apples.\n" "Output:\n" "[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'CEO_OF', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Inc.', 'tail_type': 'COMPANY', 'tail_props': {{'prop1': 'val', ...}}}},\n" " {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'PRODUCES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Watch', 'tail_type': 'PRODUCT', 'tail_props': {{'prop1': 'val', ...}}}},\n" " {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'MONITORS', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC', 'tail_props': {{'prop1': 'val', ...}}}},\n" " {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'STUDIES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC', 'tail_props': {{'prop1': 'val', ...}}}}]\n" "---------------------\n" "Text: {text}\n" "Output:\n"
@dosu
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType

# The filter instructions are placed before the input text so that the
# prompt still ends with "Output:".
DEFAULT_KG_TRIPLET_EXTRACT_TMPL = (
    "Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
    "Each triplet should be in the form of (head, relation, tail) with their respective types and properties.\n"
    "---------------------\n"
    "INITIAL ONTOLOGY:\n"
    "Entity Types: {allowed_entity_types}\n"
    "Entity Properties: {allowed_entity_properties}\n"
    "Relation Types: {allowed_relation_types}\n"
    "Relation Properties: {allowed_relation_properties}\n"
    "\n"
    "Use these types as a starting point, but introduce new types if necessary based on the context.\n"
    "\n"
    "GUIDELINES:\n"
    "- Output in JSON format: [{{'head': '', 'head_type': '', 'head_props': {{...}}, 'relation': '', 'relation_props': {{...}}, 'tail': '', 'tail_type': '', 'tail_props': {{...}}}}]\n"
    "- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
    "- Keep entities concise (3-5 words max)\n"
    "- Break down complex phrases into multiple triplets\n"
    "- Ensure the knowledge graph is coherent and easily understandable\n"
    "---------------------\n"
    "EXAMPLE:\n"
    "Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
    "UC Berkeley researchers studied the benefits of apples.\n"
    "Output:\n"
    "[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'CEO_OF', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Inc.', 'tail_type': 'COMPANY', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'PRODUCES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Watch', 'tail_type': 'PRODUCT', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'MONITORS', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'STUDIES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC', 'tail_props': {{'prop1': 'val', ...}}}}]\n"
    "---------------------\n"
    "FILTER:\n"
    "- Include only triplets that involve exactly one entity of type 'PERSON'.\n"
    "- Exclude triplets in which two or more entities are of type 'PERSON'.\n"
    "---------------------\n"
    "Text: {text}\n"
    "Output:\n"
)
DEFAULT_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
    DEFAULT_KG_TRIPLET_EXTRACT_TMPL,
    prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT,
)
This prompt template will help you extract triplets with a "person" type, ensuring that only triplets with a single person are included and those with two or more persons are excluded [1].
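Since an LLM may not follow filter instructions reliably, the same rule can also be enforced after extraction. The sketch below assumes the triplets have already been parsed into dictionaries with the head_type and tail_type keys used in the prompt's JSON format; the function names and the Alice/Bob record are hypothetical.

```python
from typing import Any, Dict, List

Triplet = Dict[str, Any]


def count_persons(triplet: Triplet, person_type: str = "PERSON") -> int:
    """Count how many of the two entities (head, tail) have the person type."""
    wanted = person_type.upper()
    return sum(
        1
        for key in ("head_type", "tail_type")
        if str(triplet.get(key, "")).upper() == wanted
    )


def keep_single_person(triplets: List[Triplet], person_type: str = "PERSON") -> List[Triplet]:
    """Keep only triplets that involve exactly one person entity."""
    return [t for t in triplets if count_persons(t, person_type) == 1]


sample = [
    {"head": "Tim Cook", "head_type": "PERSON", "relation": "CEO_OF",
     "tail": "Apple Inc.", "tail_type": "COMPANY"},
    {"head": "Apple Inc.", "head_type": "COMPANY", "relation": "PRODUCES",
     "tail": "Apple Watch", "tail_type": "PRODUCT"},
    # Hypothetical record, added only to show the two-person case.
    {"head": "Alice", "head_type": "PERSON", "relation": "KNOWS",
     "tail": "Bob", "tail_type": "PERSON"},
]
kept = keep_single_person(sample)
```

Here only the Tim Cook triplet survives: the second triplet contains no person, and the third contains two.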
Question Validation
Question
The Neo4j property graph store holds nodes with the following data structure:
{ "identity": 140, "labels": [ "PERSON", "Entity", "Node" ], "properties": { "triplet_source_id": "e0d4a5aa-8bd8-4d80-99ae-d1cddc461c60", "file_path": "..\data\绿毛水怪节选.txt", "file_name": "绿毛水怪节选.txt", "file_type": "text/plain", "name": "老陈", "last_modified_date": "2024-09-06", "id": "老陈", "embedding": [
My code is as follows:
kg_nodes = graph_store.get(properties={"label": "PERSON"})
How can I get the nodes whose labels contain "PERSON"? My code returns an empty list.
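For records that have already been fetched in the shape shown above, where "labels" is a list on each record, a client-side membership check is one possible fallback. This is only a sketch: nodes_with_label is a hypothetical helper, and the sample records keep just a few of the fields from the structure above.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def nodes_with_label(records: List[Record], label: str) -> List[Record]:
    """Return the records whose "labels" list contains `label`."""
    return [r for r in records if label in r.get("labels", [])]


# Trimmed-down records modeled on the structure in the question;
# the second record is hypothetical.
records = [
    {"identity": 140, "labels": ["PERSON", "Entity", "Node"],
     "properties": {"name": "老陈", "id": "老陈"}},
    {"identity": 141, "labels": ["Entity", "Node"],
     "properties": {"name": "example", "id": "example"}},
]
persons = nodes_with_label(records, "PERSON")
```

This checks list membership in "labels" rather than looking for a property named "label", which the node structure above does not contain.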