Adriankhl / godot-llm

LLM in Godot
MIT License

Analyze text with GDLlama #13

Open Ogoku opened 1 month ago

Ogoku commented 1 month ago

First of all, thank you for this addon; it opens up so many possibilities. I do not have extensive experience with LLMs, although I have been using ChatGPT for a long time. I would like to implement the following functionality, for which I use this prompt in ChatGPT:

Task: Analyze the text below sentence by sentence on a scale from 1 to 10, how exciting the respective sentence is. Give me your rating in the following format: Sentence number: {Rating, Justification}. Replace the word in this pattern with the corresponding value. Provide your answer only in a JSON format.

and then I get the following output:

{
    "1": { "Rating": 7, "Justification": "The imagery of a single rowboat gliding through thick fog at dusk creates a sense of mystery and sets an intriguing scene." },
    "2": { "Rating": 5, "Justification": "The description of the calm sea and lack of wind adds to the atmosphere but doesn't significantly heighten the excitement." },
    "3": { "Rating": 8, "Justification": "The detailed description of the man on the narrow wooden bench, gripping the oars, adds a personal and tactile element, increasing the tension." },
    "4": { "Rating": 6, "Justification": "The calm and controlled rowing strokes create a sense of quiet anticipation but are not highly exciting." },
    "5": { "Rating": 9, "Justification": "The focus on the man's tense face, his narrowed eyes, and the piercing gaze through the fog add significant suspense and engagement." },
    "6": { "Rating": 8, "Justification": "The aim at the dark outline of the shoreline and the focus on a single small light heighten the sense of purpose and mystery, adding excitement." }
}

Currently, I am not able to replicate this with GDLlama. It doesn't provide any JSON output, just plain text, and it seems to continue the given text instead of analyzing it. Am I possibly using the wrong node?

Adriankhl commented 1 month ago

Hi, thanks for your interest. To achieve what you want, you need to use a JSON schema to enforce the output format.

Here is a simple example for a single sentence:

var _sentence_schema = {
    "type": "object",
    "properties": {
        "Rating": {
            "type": "integer",
            "minimum": 1,
            "maximum": 10,
        },
        "Justification": {
            "type": "string",
            "minLength": 10,
        },
    },
    "required": ["Rating", "Justification"]
}

var prompt_sentence = "Analyze the sentence below on a scale from 1 to 10, how exciting the respective sentence is.\n\nNothing is so necessary for a young man as the company of intelligent women.\n"

func _ready():
    var llm = GDLlama.new()
    llm.model_path = "./models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf"
    llm.should_output_prompt = false
    llm.should_output_bos = false
    llm.should_output_eos = false
    var s = llm.generate_text_json(prompt_sentence, JSON.stringify(_sentence_schema))
    print(s)
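
Since the generated text is a JSON string, the next step is usually to parse it back into a Dictionary. The following is a minimal sketch, assuming Godot 4 and assuming the returned string matches the single-sentence schema above; `extract_rating` is a hypothetical helper name, not part of the addon.

```gdscript
extends Node

# Hypothetical helper (not part of godot-llm): parse the JSON text returned
# by generate_text_json and return the rating, or -1 if the text is not the
# expected object.
func extract_rating(json_text: String) -> int:
    # JSON.parse_string returns null when the text is not valid JSON.
    var parsed = JSON.parse_string(json_text)
    if typeof(parsed) != TYPE_DICTIONARY or not parsed.has("Rating"):
        return -1
    # Godot's JSON parser may return numbers as float, so convert explicitly.
    return int(parsed["Rating"])
```

In `_ready()`, this could be called as `print(extract_rating(s))` right after the text is generated.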

Adriankhl commented 1 month ago

If the model is smart enough, it can also analyze a whole paragraph:

var _paragraph_schema = {
    "type": "array",
    "minItems": 1,
    "maxItems": 1,
    "items": {
        "type": "object",
        "properties": {
            "Sentencce number": {
                "type": "integer"
            },
            "Rating": {
                "type": "integer",
                "minimum": 1,
                "maximum": 10,
            },
            "Justification": {
                "type": "string",
                "minLength": 10,
            },
        },
        "required": ["Sentencce number", "Rating", "Justification"]
    }
}

var paragraph = "Life is everything. Life is God. Everything shifts and moves, and this movement is God. And while there is life, there is delight in the self-awareness of the divinity. To love life is to love God. The hardest and most blissful thing is to love this life in one's suffering, in the guiltlessness of suffering."

var prompt_paragraph = "Analyze the text below sentence by sentence on a scale from 1 to 10, how exciting the respective sentence is.\n\n" + paragraph

func _ready():
    var llm = GDLlama.new()
    llm.model_path = "./models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf"
    llm.should_output_prompt = false
    llm.should_output_bos = false
    llm.should_output_eos = false

    var sentence_count: int = paragraph.count(".")
    _paragraph_schema["minItems"] = sentence_count
    _paragraph_schema["maxItems"] = sentence_count
    var s = llm.generate_text_json(prompt_paragraph, JSON.stringify(_paragraph_schema))
    print(s)

It seems to work well with my Llama 3 8B model. Getting the correct sentence_count might be a bit tricky, because a sentence can end with several different punctuation marks, not just a period. You could also remove sentence_count and the _paragraph_schema["minItems"] and _paragraph_schema["maxItems"] assignments entirely to see how well the model performs without the constraint.
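
As a rough way to handle more than one kind of sentence ending, the count could be based on runs of terminal punctuation instead of periods alone. This is only a sketch using Godot's built-in RegEx class; `count_sentences` is a hypothetical helper, and it will still miscount abbreviations such as "Dr." or decimal numbers.

```gdscript
extends Node

# Hypothetical helper: estimate the number of sentences by counting runs of
# terminal punctuation, so that "?!" or "..." count as one ending.
func count_sentences(text: String) -> int:
    var regex := RegEx.new()
    regex.compile("[.!?]+")
    # search_all returns one match per run of '.', '!' or '?' characters.
    return regex.search_all(text).size()
```

With this helper, `paragraph.count(".")` in the example above could be replaced by `count_sentences(paragraph)`.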

This is my output:

[{"Justification": "This sentence is the introduction to the text, setting the tone and providing a philosophical perspective, making it a calm and thought-provoking but not overly exciting sentence." ,"Rating": 4 ,"Sentencce number": 1 }, 
{"Justification": "The text is making a bold claim about the connection between life and God, which might spark some interest, but it's a fairly straightforward statement and lacks any surprising or unexpected elements." ,"Rating": 6 ,"Sentencce number": 2 },
{"Justification": "The idea of movement being God is an interesting concept, but it's not entirely new or groundbreaking, and the sentence is fairly short and straightforward, making it more thought-provoking than exciting." ,"Rating": 5 ,"Sentencce number": 3 }, 
{"Justification": "This sentence is a bit more interesting as it suggests that the self-awareness of divinity is a source of delight, but it's still a fairly abstract concept and lacks any concrete or unexpected elements." ,"Rating": 6 ,"Sentencce number": 4 }, 
{"Justification": "The idea of loving life being equivalent to loving God is a familiar concept, but it's not particularly surprising or exciting, and the sentence is fairly short and straightforward." ,"Rating": 4 ,"Sentencce number": 5 }, 
{"Justification": "This sentence is the most interesting and exciting as it suggests that even in suffering, there is a way to find bliss, which is a more unexpected and thought-provoking idea." ,"Rating": 8 ,"Sentencce number": 6} ]

I think it is possible to reproduce the ChatGPT output format exactly, but it takes some effort to define the JSON schema correctly, since each sentence number needs its own property and these have to be generated in a for loop.
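
One way such a loop might look is sketched below. It builds an object schema with one property per sentence number ("1", "2", ...), each holding a Rating and a Justification, and marks all of them as required. `build_numbered_schema` is a hypothetical name, and the sketch is untested against GDLlama, so whether the addon's schema support accepts this exact shape is an assumption.

```gdscript
extends Node

# Hypothetical helper: build a schema shaped like the ChatGPT output,
# i.e. {"1": {"Rating": ..., "Justification": ...}, "2": {...}, ...}.
func build_numbered_schema(sentence_count: int) -> Dictionary:
    var properties := {}
    var required := []
    for i in range(1, sentence_count + 1):
        var key := str(i)
        # Each numbered property uses the same per-sentence schema.
        properties[key] = {
            "type": "object",
            "properties": {
                "Rating": {"type": "integer", "minimum": 1, "maximum": 10},
                "Justification": {"type": "string", "minLength": 10},
            },
            "required": ["Rating", "Justification"],
        }
        required.append(key)
    return {
        "type": "object",
        "properties": properties,
        "required": required,
    }
```

The result would then be passed on in the same way as the other schemas, for example `llm.generate_text_json(prompt_paragraph, JSON.stringify(build_numbered_schema(sentence_count)))`.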