run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Am I able to add metadata to KnowledgeGraphIndex? #15762

Open DataNoob0723 opened 2 months ago

DataNoob0723 commented 2 months ago

Question

Am I able to add metadata to KnowledgeGraphIndex?

dosubot[bot] commented 2 months ago

Yes, you can add metadata to a KnowledgeGraphIndex. The serialized index_struct includes an extra_info field, which can store additional metadata.

Here is an example of a serialized KnowledgeGraphIndex with metadata stored in extra_info:

{
  "index_struct": {
    "text": null,
    "doc_id": "446680de-660e-421a-8fdc-ed244bd466ce",
    "embedding": null,
    "extra_info": {
      "author": "John Doe",
      "created_at": "2023-10-01",
      "description": "This is a sample knowledge graph index."
    },
    "table": {
      "I": [
        "26270178-301a-42b3-8d46-4d9de1672b0f",
        "0d3bba4a-2234-4c3b-ab35-855c8cf34071",
        "4a791691-ff02-46cb-969c-3a0006efe45c",
        "02f5447d-3bb1-467f-8bd1-0d568f4b3a17",
        "a76f6a79-63f8-4fe7-b233-850c5defaa39",
        "d3899c11-ae95-454c-9c2a-29c93443bea9",
        "1bad022a-173f-4a2e-a127-04f054111014",
        "26995b7c-1a11-4a3b-84f5-b7861a469efc",
        "ea0a96e1-0a22-4719-83ac-68b0dd8921f5",
        "4413adae-fc4f-4008-8752-c0c922337aa9"
      ],
      "short stories": ["d3899c11-ae95-454c-9c2a-29c93443bea9"],
      "programs": ["d3899c11-ae95-454c-9c2a-29c93443bea9"],
      "on IBM 1401": ["d3899c11-ae95-454c-9c2a-29c93443bea9"],
      "microcomputers": ["f4e7ecda-58da-465d-a891-98f26a22ace9"],
      "everything": ["f4e7ecda-58da-465d-a891-98f26a22ace9"],
      "Heathkit": ["f4e7ecda-58da-465d-a891-98f26a22ace9"],
      "AI": [
        "02f5447d-3bb1-467f-8bd1-0d568f4b3a17",
        "829e7c99-799f-47a0-b86b-3d29570bd024"
      ],
      "Terry Winograd": ["02f5447d-3bb1-467f-8bd1-0d568f4b3a17"],
      "SHRDLU": ["02f5447d-3bb1-467f-8bd1-0d568f4b3a17"],
      "Lisp": [
        "829e7c99-799f-47a0-b86b-3d29570bd024",
        "318c4bda-f6ea-4396-b9ee-67051c0db2a6",
        "690ec369-005c-4cc8-bbdc-40750f65bcd4",
        "ed39a830-a116-41b9-a551-bdd348dba61d",
        "88883f20-1a13-4634-b454-eae0842ffec8",
        "edb271a6-6861-4e98-9576-7b887b7ee309"
      ],
      "own sake": ["829e7c99-799f-47a0-b86b-3d29570bd024"],
      "interesting": ["88883f20-1a13-4634-b454-eae0842ffec8"],
      "On Lisp": ["88883f20-1a13-4634-b454-eae0842ffec8"],
      "1993": ["88883f20-1a13-4634-b454-eae0842ffec8"],
      "something": ["4413adae-fc4f-4008-8752-c0c922337aa9"],
      "the Carnegie Institute": ["4413adae-fc4f-4008-8752-c0c922337aa9"],
      "but": ["0d3bba4a-2234-4c3b-ab35-855c8cf34071"],
      "students": [
        "17f36266-eb54-48e9-a985-5eb22d329af9",
        "fa1cfbb9-782b-4352-b610-cdae080b8f4f"
      ],
      "and in return the faculty wouldn't require the students to learn anything": [
        "17f36266-eb54-48e9-a985-5eb22d329af9"
      ],
      "chatting": ["17f36266-eb54-48e9-a985-5eb22d329af9"],
      "still lives": ["26995b7c-1a11-4a3b-84f5-b7861a469efc"],
      "what I was seeing": ["26995b7c-1a11-4a3b-84f5-b7861a469efc"],
      "low-level processes": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "visual perception": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "brains": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "Interleaf": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "creating documents": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "scripting language": ["ed39a830-a116-41b9-a551-bdd348dba61d"],
      "Text": ["2cbefde0-fa0b-471e-9121-74f9c0e39f37"],
      "that it's better for technology companies to be run by product people than sales people": [
        "2cbefde0-fa0b-471e-9121-74f9c0e39f37"
      ],
      "that the low end eats the high end": [
        "2cbefde0-fa0b-471e-9121-74f9c0e39f37"
      ],
      "low end": ["06f6c175-2d00-4f5f-b95a-0c4b11e66daf"],
      "high end": ["06f6c175-2d00-4f5f-b95a-0c4b11e66daf"],
      "painting": ["fa1cfbb9-782b-4352-b610-cdae080b8f4f"],
      "cartoon": ["fa1cfbb9-782b-4352-b610-cdae080b8f4f"],
      "buyers": ["fa1cfbb9-782b-4352-b610-cdae080b8f4f"],
      "work": ["fa1cfbb9-782b-4352-b610-cdae080b8f4f"],
      "confused and demoralized": ["fa1cfbb9-782b-4352-b610-cdae080b8f4f"],
      "rich": ["a76f6a79-63f8-4fe7-b233-850c5defaa39"],
      "the web": ["a76f6a79-63f8-4fe7-b233-850c5defaa39"],
      "be a big deal": ["a76f6a79-63f8-4fe7-b233-850c5defaa39"],
      "internet": ["700e8186-0722-48e2-a84d-2ce6b0f9ac66"],
      "train": ["700e8186-0722-48e2-a84d-2ce6b0f9ac66"],
      "software": [
        "700e8186-0722-48e2-a84d-2ce6b0f9ac66",
        "b048373c-0cc1-4b99-94c4-145600acab2a"
      ],
      "web sites": ["700e8186-0722-48e2-a84d-2ce6b0f9ac66"],
      "web app": ["b048373c-0cc1-4b99-94c4-145600acab2a"],
      "possible": ["b048373c-0cc1-4b99-94c4-145600acab2a"],
      "Robert": ["ea0a96e1-0a22-4719-83ac-68b0dd8921f5"],
      "graduate student stipend": ["ea0a96e1-0a22-4719-83ac-68b0dd8921f5"],
      "seed funding": ["ea0a96e1-0a22-4719-83ac-68b0dd8921f5"],
      "We": ["ea0a96e1-0a22-4719-83ac-68b0dd8921f5"],
      "in September": ["ea0a96e1-0a22-4719-83ac-68b0dd8921f5"],
      "Trevor's": ["7ed58914-9acd-47f3-9b2e-308980b40627"],
      "Austrian Rococo": ["7ed58914-9acd-47f3-9b2e-308980b40627"],
      "We opened": ["7ed58914-9acd-47f3-9b2e-308980b40627"],
      "6 stores": ["7ed58914-9acd-47f3-9b2e-308980b40627"],
      "growth rate": ["76655c19-248a-411e-a493-91d346e29b40"],
      "startup": ["76655c19-248a-411e-a493-91d346e29b40"],
      "absolute number": ["76655c19-248a-411e-a493-91d346e29b40"],
      "Y Combinator": [
        "bb8ba457-e7b4-40eb-8d99-1071ca9f0ef5",
        "b3f6d3c3-c71f-486d-b5ca-0e970ebcf200",
        "9538c89d-eed1-4a4c-8468-51436b7ed7ea"
      ],
      "Stop being so stressed out": ["9538c89d-eed1-4a4c-8468-51436b7ed7ea"],
      "Yahoo": ["26270178-301a-42b3-8d46-4d9de1672b0f"],
      "to paint": ["26270178-301a-42b3-8d46-4d9de1672b0f"],
      "old life": ["1bad022a-173f-4a2e-a127-04f054111014"],
      "new kind of still life": ["1bad022a-173f-4a2e-a127-04f054111014"],
      "there wasn't one": ["1bad022a-173f-4a2e-a127-04f054111014"],
      "Viaweb": ["17dc8e20-9544-467f-9de8-b9d4c81c3ef3"],
      "application service provider": ["17dc8e20-9544-467f-9de8-b9d4c81c3ef3"],
      "software as a service": ["17dc8e20-9544-467f-9de8-b9d4c81c3ef3"],
      "Aspra": ["17dc8e20-9544-467f-9de8-b9d4c81c3ef3"],
      "": ["318c4bda-f6ea-4396-b9ee-67051c0db2a6"],
      "because": ["318c4bda-f6ea-4396-b9ee-67051c0db2a6"],
      "because one of the distinctive features of the language is that it has dialects": [
        "318c4bda-f6ea-4396-b9ee-67051c0db2a6"
      ],
      "online essays": ["b146558b-bb1a-4b7c-a3b6-f4861116c58c"],
      "a marginal medium": ["b146558b-bb1a-4b7c-a3b6-f4861116c58c"],
      "Socially": ["b146558b-bb1a-4b7c-a3b6-f4861116c58c"],
      "more like rants": ["b146558b-bb1a-4b7c-a3b6-f4861116c58c"],
      "Working on unprestigious things": [
        "17292ba2-5830-41be-9777-b119cd74fe7e"
      ],
      "finding something real to be discovered": [
        "17292ba2-5830-41be-9777-b119cd74fe7e"
      ],
      "One party": ["34d23b0c-5b0b-4eb1-915e-5453f9ab1760"],
      "two thirds of the other guests would be people they didn't know": [
        "34d23b0c-5b0b-4eb1-915e-5453f9ab1760"
      ],
      "One of the guests": ["34d23b0c-5b0b-4eb1-915e-5453f9ab1760"],
      "I didn't know but would turn out to like": [
        "34d23b0c-5b0b-4eb1-915e-5453f9ab1760"
      ],
      "investment firm": ["b3f6d3c3-c71f-486d-b5ca-0e970ebcf200"],
      "founders": [
        "b3f6d3c3-c71f-486d-b5ca-0e970ebcf200",
        "93730cd3-ffa0-40e4-b622-ce61f0d5bd88"
      ],
      "YC": [
        "d32357e8-4472-4f3e-b75d-0b419d76be71",
        "bb8ba457-e7b4-40eb-8d99-1071ca9f0ef5",
        "93730cd3-ffa0-40e4-b622-ce61f0d5bd88",
        "4f45f720-9ba3-444d-b156-71fd7c0fcbc8",
        "066bb6f0-9869-4ade-b3a4-df672f7fc545",
        "c8dd16ad-1247-4b34-b04f-3c1b2e66b6f4"
      ],
      "company": ["4f45f720-9ba3-444d-b156-71fd7c0fcbc8"],
      "own money": ["4f45f720-9ba3-444d-b156-71fd7c0fcbc8"],
      "batch model": ["4f45f720-9ba3-444d-b156-71fd7c0fcbc8"],
      "SFP": ["9ed536d9-8050-4ef7-81d4-f934d139afbc"],
      "Summer Founders Program": ["9ed536d9-8050-4ef7-81d4-f934d139afbc"],
      "inviting undergrads to apply": ["9ed536d9-8050-4ef7-81d4-f934d139afbc"],
      "more convenient": ["c8dd16ad-1247-4b34-b04f-3c1b2e66b6f4"],
      "better for startups": ["c8dd16ad-1247-4b34-b04f-3c1b2e66b6f4"],
      "solved one of the biggest problems": [
        "c8dd16ad-1247-4b34-b04f-3c1b2e66b6f4"
      ],
      "HN": ["ebc39888-835c-43c0-a92b-371d08c967cb"]
    }
  }
}

[1]
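For anyone who wants to poke at that structure, here is a minimal sketch using only the standard library. It works on a trimmed copy of the JSON above and merges a metadata dictionary into index_struct.extra_info. The helper name with_extra_info is hypothetical, and the null starting value for extra_info is an assumption about how an index without metadata would serialize. This only illustrates the shape of the data and is not a LlamaIndex API.

```python
import json

# Trimmed copy of the structure shown above; the table is shortened for brevity.
# Assumption: extra_info starts out as null when no metadata has been set.
serialized = """
{
  "index_struct": {
    "text": null,
    "doc_id": "446680de-660e-421a-8fdc-ed244bd466ce",
    "embedding": null,
    "extra_info": null,
    "table": {
      "Terry Winograd": ["02f5447d-3bb1-467f-8bd1-0d568f4b3a17"],
      "SHRDLU": ["02f5447d-3bb1-467f-8bd1-0d568f4b3a17"]
    }
  }
}
"""


def with_extra_info(payload, metadata):
    """Return a copy of payload whose index_struct.extra_info includes metadata.

    Hypothetical helper for illustration only.
    """
    updated = json.loads(json.dumps(payload))  # deep copy via a JSON round trip
    struct = updated["index_struct"]
    existing = struct.get("extra_info") or {}  # treat null or missing as empty
    struct["extra_info"] = {**existing, **metadata}  # new keys override old ones
    return updated


payload = json.loads(serialized)
updated = with_extra_info(
    payload,
    {"author": "John Doe", "created_at": "2023-10-01"},
)
print(json.dumps(updated["index_struct"]["extra_info"], indent=2))
```

The original payload is left untouched, so the keyword table and doc_id can be compared before and after the change.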

To continue talking to Dosu, mention @dosu.

DataNoob0723 commented 2 months ago

How do I add metadata to extra_info through code?

dosubot[bot] commented 2 months ago

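To add metadata through code, attach a dictionary to each Document object. In recent LlamaIndex releases the field is named metadata, and extra_info is its legacy name, so passing the dictionary when constructing the Document is the safest route. Here is an example:

from llama_index.core import Document

documents = []
for page in pages:  # pages: previously loaded objects exposing page_content and metadata
    # Pass the text and the metadata dictionary by keyword.
    documents.append(Document(text=page.page_content, metadata=page.metadata))

This snippet loops through pages, creates a Document for each one, and attaches the page's metadata. A user in the LlamaIndex community reports using a similar method with the older extra_info attribute: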

document.extra_info = <your dict>

some sorta loop:  
    document = Document(page.page_content)
    document.extra_info = page.metadata
    documents.append(document)

This method allows you to attach any metadata you want to the extra_info field of each document [1][2].
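If some metadata applies to every document (an author, say) while other values come from each page, the dictionaries can be merged before they are attached. Below is a rough sketch of that loop. To keep it runnable without LlamaIndex installed, Page and SketchDocument are hypothetical stand-ins for the loader output and the Document class, and build_documents is a made-up helper name. None of these are part of the library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page with text and metadata."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a document that carries a metadata dict."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_documents(
    pages: List[Page], shared: Optional[Dict[str, Any]] = None
) -> List[SketchDocument]:
    """Create one document per page, merging shared and per-page metadata."""
    shared = shared or {}
    documents = []
    for page in pages:
        # Per-page values win over shared defaults when keys collide.
        merged = {**shared, **page.metadata}
        documents.append(SketchDocument(text=page.page_content, metadata=merged))
    return documents


pages = [
    Page("First page.", {"page": 1}),
    Page("Second page.", {"page": 2, "author": "Jane Roe"}),
]
documents = build_documents(pages, shared={"author": "John Doe"})
for doc in documents:
    print(doc.text, doc.metadata)
```

With the real library, the merged dictionary would be handed to the Document instead of the stand-in class. Building a fresh dictionary per page also avoids several documents sharing one mutable metadata object.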
