run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: How can I get the nodes filtered by "labels"? #15925

Open 912100012 opened 1 week ago

912100012 commented 1 week ago

Question

A node in my Neo4j property graph store has the following structure:

{
  "identity": 140,
  "labels": ["PERSON", "Entity", "Node"],
  "properties": {
    "triplet_source_id": "e0d4a5aa-8bd8-4d80-99ae-d1cddc461c60",
    "file_path": "..\data\绿毛水怪节选.txt",
    "file_name": "绿毛水怪节选.txt",
    "file_type": "text/plain",
    "name": "老陈",
    "last_modified_date": "2024-09-06",
    "id": "老陈",
    "embedding": [

My code is as follows:

kg_nodes = graph_store.get(properties={"label": "PERSON"})

How can I get the nodes whose labels contain "PERSON"? My code returns an empty list.

dosubot[bot] commented 1 week ago

To get nodes filtered by labels in a Neo4j graph store, you should use a Cypher query that matches nodes with the desired label. The graph_store.get method does not support filtering by labels directly. Instead, you can use the structured_query method to execute a Cypher query.

Here's how you can do it:

query = "MATCH (n:PERSON) RETURN n"
result = graph_store.structured_query(query)

This query matches all nodes with the label "PERSON" and returns them [1].

To continue talking to Dosu, mention @dosu.

912100012 commented 1 week ago

I love you, thank you.

912100012 commented 1 week ago

In the DynamicLLMPathExtractor class, I do not understand the allowed_relation_props property. What does it mean? @dosu

dosubot[bot] commented 1 week ago

The allowed_relation_props property in the DynamicLLMPathExtractor class specifies which properties the LLM may attach to the relations it extracts. This keeps the extracted relation properties limited to the ones that are relevant to your graph.

This property is an optional list that can either be a list of property names or a list of tuples containing the property name and its description:

allowed_relation_props: Optional[Union[List[str], List[Tuple[str, str]]]]

If provided, it helps in structuring the extracted information more accurately according to the predefined ontology. Setting allowed_relation_props to None or an empty list will let the LLM generate any relation properties on the fly, which can be faster but less structured [1][2][3].
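
To make the two accepted shapes concrete, here is a small sketch. The property names and descriptions are invented for illustration, and describe_props is a hypothetical helper, not part of LlamaIndex. It only shows how a list of names and a list of (name, description) tuples can be handled uniformly:

```python
from typing import List, Optional, Tuple, Union

# Same shape as the annotation quoted above.
PropSpec = Optional[Union[List[str], List[Tuple[str, str]]]]


def describe_props(props: PropSpec) -> List[str]:
    """Render either accepted shape as readable lines (hypothetical helper)."""
    if not props:
        # None and the empty list both mean "no predefined properties".
        return []
    lines = []
    for item in props:
        if isinstance(item, tuple):
            name, description = item
            lines.append(f"{name}: {description}")
        else:
            lines.append(item)
    return lines


# Shape 1: property names only.
names_only = ["since", "source"]

# Shape 2: (name, description) tuples that give the LLM more guidance.
with_descriptions = [
    ("since", "year the relation began"),
    ("source", "where the fact was stated"),
]
```

Either value could be passed as allowed_relation_props. The tuple form costs a few more prompt tokens but tells the LLM what each property is meant to hold.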

912100012 commented 1 week ago

What is the difference between allowed_entity_types and allowed_entity_props in the DynamicLLMPathExtractor constructor? @dosu

dosubot[bot] commented 1 week ago

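The difference between allowed_entity_types and allowed_entity_props in the DynamicLLMPathExtractor constructor method is as follows: allowed_entity_types lists the types (labels) that extracted entities may be given, such as "POLITICIAN", while allowed_entity_props lists the names of the properties that may be extracted for each entity, such as "description".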

Example usage:

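from llama_index.core.indices.property_graph import DynamicLLMPathExtractor

# `llm` is assumed to be an already configured LLM instance.
kg_extractor = DynamicLLMPathExtractor(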
    llm=llm,
    max_triplets_per_chunk=20,
    num_workers=4,
    allowed_entity_types=["POLITICIAN", "POLITICAL_PARTY"],
    allowed_relation_types=["PRESIDENT_OF", "MEMBER_OF"],
    allowed_relation_props=["description"],
    allowed_entity_props=["description"],
)
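
As a rough illustration of where each setting shows up, the sketch below checks one triplet in the JSON shape used by the extraction prompt quoted later in this thread (head_type, head_props, and so on). The triplet values are invented and check_entity is a hypothetical helper. The extractor itself may introduce types beyond the allowed list, so this is not a description of its internal behavior:

```python
allowed_entity_types = ["POLITICIAN", "POLITICAL_PARTY"]
allowed_entity_props = ["description"]

# Invented example triplet in the prompt's JSON output shape.
triplet = {
    "head": "Jane Doe",
    "head_type": "POLITICIAN",                       # governed by allowed_entity_types
    "head_props": {"description": "party member"},   # keys governed by allowed_entity_props
    "relation": "MEMBER_OF",
    "relation_props": {"description": "current membership"},
    "tail": "Example Party",
    "tail_type": "POLITICAL_PARTY",
    "tail_props": {"description": "a political party"},
}


def check_entity(entity_type: str, entity_props: dict) -> tuple:
    """Return (type_ok, props_ok) for one entity of a triplet."""
    type_ok = entity_type in allowed_entity_types
    props_ok = set(entity_props) <= set(allowed_entity_props)
    return type_ok, props_ok


head_check = check_entity(triplet["head_type"], triplet["head_props"])
tail_check = check_entity(triplet["tail_type"], triplet["tail_props"])
```

The type answers "what kind of thing is this entity", while the props answer "which attributes are recorded about it".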

912100012 commented 1 week ago

Please give me a prompt that extracts only triplets with a "person" type entity that refers to exactly one person, and that filters out "person" entities that refer to two or more people. Please base it on the following prompt:

"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types and properties.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Entity Properties: {allowed_entity_properties}\n"
"Relation Types: {allowed_relation_types}\n"
"Relation Properties: {allowed_relation_properties}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'head_props': {{...}}, 'relation': '', 'relation_props': {{...}}, 'tail': '', 'tail_type': '', 'tail_props': {{...}}}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'CEO_OF', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Inc.', 'tail_type': 'COMPANY', 'tail_props': {{'prop1': 'val', ...}}}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'PRODUCES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Watch', 'tail_type': 'PRODUCT', 'tail_props': {{'prop1': 'val', ...}}}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'MONITORS', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC', 'tail_props': {{'prop1': 'val', ...}}}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'STUDIES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC', 'tail_props': {{'prop1': 'val', ...}}}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"

@dosu

dosubot[bot] commented 1 week ago
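
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType

# Extraction template from the question, extended with a person filter.
DEFAULT_KG_TRIPLET_EXTRACT_TMPL = (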
    "Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
    "Each triplet should be in the form of (head, relation, tail) with their respective types and properties.\n"
    "---------------------\n"
    "INITIAL ONTOLOGY:\n"
    "Entity Types: {allowed_entity_types}\n"
    "Entity Properties: {allowed_entity_properties}\n"
    "Relation Types: {allowed_relation_types}\n"
    "Relation Properties: {allowed_relation_properties}\n"
    "\n"
    "Use these types as a starting point, but introduce new types if necessary based on the context.\n"
    "\n"
    "GUIDELINES:\n"
    "- Output in JSON format: [{{'head': '', 'head_type': '', 'head_props': {{...}}, 'relation': '', 'relation_props': {{...}}, 'tail': '', 'tail_type': '', 'tail_props': {{...}}}}]\n"
    "- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
    "- Keep entities concise (3-5 words max)\n"
    "- Break down complex phrases into multiple triplets\n"
    "- Ensure the knowledge graph is coherent and easily understandable\n"
    # The filter belongs in the guidelines: text placed after the final
    # "Output:" cue would read as the start of the answer, not as an instruction.
    "- Include only triplets in which an entity has type 'PERSON' and names exactly one person\n"
    "- Exclude triplets in which a 'PERSON' entity refers to two or more people\n"
    "---------------------\n"
    "EXAMPLE:\n"
    "Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
    "UC Berkeley researchers studied the benefits of apples.\n"
    "Output:\n"
    "[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'CEO_OF', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Inc.', 'tail_type': 'COMPANY', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'PRODUCES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'Apple Watch', 'tail_type': 'PRODUCT', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'MONITORS', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC', 'tail_props': {{'prop1': 'val', ...}}}},\n"
    " {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'head_props': {{'prop1': 'val', ...}}, 'relation': 'STUDIES', 'relation_props': {{'prop1': 'val', ...}}, 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC', 'tail_props': {{'prop1': 'val', ...}}}}]\n"
    "---------------------\n"
    "Text: {text}\n"
    "Output:\n"
)
DEFAULT_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
    DEFAULT_KG_TRIPLET_EXTRACT_TMPL,
    prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT,
)

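This prompt template asks the LLM to return only triplets with a "person" type entity that refers to a single person, and to exclude those in which the person entity refers to two or more people [1]. Because this is an instruction to the model rather than a hard constraint, the output may still contain triplets that do not satisfy it.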
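
If the filter needs to be enforced rather than requested, the parsed triplets can also be checked after extraction. The following is a minimal sketch, assuming the triplets have already been parsed into dictionaries in the JSON shape from the prompt. The separator list used to detect multiple people is a heuristic assumption, and is_single_person and keep_single_person_triplets are hypothetical helpers:

```python
import re
from typing import Dict, List

# Heuristic (an assumption): these separators suggest a name lists several people.
_MULTI_PERSON_RE = re.compile(r",|&|、|\band\b|和", re.IGNORECASE)


def is_single_person(name: str) -> bool:
    """Return True if `name` is non-empty and contains no list separator."""
    return bool(name.strip()) and not _MULTI_PERSON_RE.search(name)


def keep_single_person_triplets(triplets: List[Dict]) -> List[Dict]:
    """Keep triplets that have a PERSON entity and no multi-person PERSON entity."""
    kept = []
    for triplet in triplets:
        persons = [
            triplet[side]
            for side in ("head", "tail")
            if triplet.get(f"{side}_type", "").upper() == "PERSON"
        ]
        if persons and all(is_single_person(name) for name in persons):
            kept.append(triplet)
    return kept


sample = [
    {"head": "Tim Cook", "head_type": "PERSON", "relation": "CEO_OF",
     "tail": "Apple Inc.", "tail_type": "COMPANY"},
    {"head": "Alice and Bob", "head_type": "PERSON", "relation": "FOUNDED",
     "tail": "Example Corp", "tail_type": "COMPANY"},
    {"head": "Apple Inc.", "head_type": "COMPANY", "relation": "PRODUCES",
     "tail": "Apple Watch", "tail_type": "PRODUCT"},
]
filtered = keep_single_person_triplets(sample)
```

Only the Tim Cook triplet survives here. The separator heuristic can misfire on names that legitimately contain one of these characters, so it would need tuning for real data.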
