marcderbauer / humanitarianKG


Refine Prompt #16

Closed marcderbauer closed 7 months ago

marcderbauer commented 8 months ago

Will replace the current prompts with the prompt from this article: https://towardsdatascience.com/how-to-convert-any-text-into-a-graph-of-concepts-110844f22a1a

The ones from the ipynb seemed more precisely engineered. Need to check whether I can carry some techniques over from that.

marcderbauer commented 8 months ago

The output doesn't work: it only produces a single relational triplet. Will need to find a way to fix this.

marcderbauer commented 8 months ago

Related to / possibly duplicating #12

marcderbauer commented 8 months ago

https://github.com/neo4j/NaLLM: using the prompt from this PR, things work a lot better.

Example sentence:

Conflict and insecurity in South Sudan are once again at levels which would typically be associated with civil war, and would indicate that a substantial collapse in state authority is underway. Yet the government in Juba endures, and the political bargain holding together the various military, security, and rebel factions since 2018 has mostly held during this surge in violence. What explains this soaring violence, and what does this tell us about the strategies the government and ruling elites use to survive in such conditions?

output with temperature = 0:

[
    [
        "South Sudan",
        "has_conflict",
        "Civil War",
        {
            "levels": "high"
        }
    ],
    [
        "Government",
        "located_in",
        "Juba",
        {}
    ],
    [
        "Government",
        "endures_despite",
        "Conflict",
        {}
    ],
    [
        "Political Bargain",
        "holds_together",
        "Factions",
        {
            "since": 2018
        }
    ],
    [
        "Violence",
        "surges_in",
        "South Sudan",
        {}
    ],
    [
        "Government",
        "uses_strategies_to_survive",
        "Violence",
        {}
    ]
]

output with temperature = 0.5:

[
    [
        "South Sudan",
        "has_conflict",
        "Civil War",
        {
            "level": "high"
        }
    ],
    [
        "Government",
        "located_in",
        "Juba"
    ],
    [
        "Government",
        "endures_despite",
        "Conflict",
        {}
    ],
    [
        "Political Bargain",
        "holds_together",
        "Military",
        {
            "since": 2018
        }
    ],
    [
        "Political Bargain",
        "holds_together",
        "Security",
        {
            "since": 2018
        }
    ],
    [
        "Political Bargain",
        "holds_together",
        "Rebel Factions",
        {
            "since": 2018
        }
    ],
    [
        "Violence",
        "surges_during",
        "Political Bargain",
        {}
    ],
    [
        "Government",
        "uses_strategies",
        "Survival",
        {}
    ],
    [
        "Ruling Elites",
        "use_strategies",
        "Survival",
        {}
    ]
]
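Note that in the temperature=0.5 output the second triplet omits the properties object entirely, while the others include one. Before any downstream processing it would help to normalize every triplet to a fixed four-element shape; a minimal sketch (the `normalize_triplets` helper is hypothetical, not part of the repo):

```python
import json

def normalize_triplets(raw: str) -> list:
    """Parse model output and pad any triplet that is missing its properties dict."""
    triplets = []
    for item in json.loads(raw):
        if len(item) == 3:  # model omitted the properties object
            subj, rel, obj = item
            props = {}
        else:
            subj, rel, obj, props = item
        triplets.append((subj, rel, obj, props or {}))
    return triplets

# Reduced example from the temperature=0.5 output above
sample = ('[["Government", "located_in", "Juba"], '
          '["Political Bargain", "holds_together", "Military", {"since": 2018}]]')
for t in normalize_triplets(sample):
    print(t)
```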

Will have to experiment with the temperature setting, but I'm very happy with the output for now.

Low temp = low recall, high precision
High temp = higher recall, lower precision (potentially)
→ With the given example there were no hallucinations at temp=0.5, but I can't generalise from that.

Other settings:

MODEL_NAME = "gpt-4-1106-preview"
TEMPERATURE = "SEE ABOVE"
TOP_P = 0.8
SEED = 1234
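For reproducibility, those settings map onto the keyword arguments of an `openai` chat-completions call roughly like this. A sketch that only assembles the request (no API call is made; the prompt and input strings are placeholders, and `build_request` is a hypothetical helper):

```python
MODEL_NAME = "gpt-4-1106-preview"
TEMPERATURE = 0.5  # or 0, as compared above
TOP_P = 0.8
SEED = 1234

def build_request(prompt: str, text: str) -> dict:
    """Assemble kwargs for client.chat.completions.create (not called here)."""
    return {
        "model": MODEL_NAME,
        "temperature": TEMPERATURE,
        "top_p": TOP_P,
        "seed": SEED,  # best-effort determinism across runs
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": text},
        ],
    }

req = build_request("<extraction prompt>", "<input sentence>")
print(req["model"], req["seed"])
```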
marcderbauer commented 8 months ago

output.txt: current output when iterating over a longer text line by line. Will want to merge the dictionaries (or even replace the data structure). Also want to remove the fields where it outputs `properties: null`.
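Merging the per-line results could work roughly like this (hypothetical helper, assuming each line yields a list of `[subject, relation, object, properties]` entries where `properties` may be `null`):

```python
def merge_triplets(per_line_results):
    """Merge per-line triplet lists: drop null/empty properties and
    combine the property dicts of duplicate (subject, relation, object) keys."""
    merged = {}
    for triplets in per_line_results:
        for subj, rel, obj, props in triplets:
            key = (subj, rel, obj)
            merged.setdefault(key, {})
            if props:  # skips properties: null and empty dicts
                merged[key].update(props)
    return [[s, r, o, p] for (s, r, o), p in merged.items()]

# Duplicate triplet across two lines, one with properties: null
lines = [
    [["Government", "located_in", "Juba", None]],
    [["Government", "located_in", "Juba", {"since": 2018}]],
]
print(merge_triplets(lines))
```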

marcderbauer commented 7 months ago

The prompt works in general now; closing the ticket.