kramars-realspeak / fm-gai-lottie-true-false-v1


YLE_DATASET2 #4

Open Peter96K opened 2 days ago

Peter96K commented 2 days ago

Task :

Create 5 fm-gai-lottie-true-false-v1.json objects and test them in the sandbox with Ss.

SAMPLE KEDTECHLA PLAYGROUND OUTPUT (adapt parameters as needed):

{
    "id": "",
    "media": {
        "image_src": "",
        "style": "lego"
    },
    "sentence": "Apples taste better than pears.",
    "correct_answer": "True",
    "cefr_level": "pre-a1",
    "target_vocabulary": [
        "apples",
        "pears"
    ],
    "target_grammar": [
        "present continuous tense"
    ],
    "submitted": false,
    "metadata": {
        "model_alias": "fm-gai-lottie-true-false",
        "model_version": "1.0"
    }
}

After testing, store the objects in data.json in the data directory of the repository.
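A minimal sketch of that storing step, assuming data.json holds a JSON array and that the keys in the sample output above are the required fields. `REQUIRED_KEYS`, `validate_item`, and `append_item` are hypothetical names for illustration, not part of the repository.

```python
import json
from pathlib import Path

# Assumption: the required keys mirror the sample playground output above.
REQUIRED_KEYS = {
    "id", "media", "sentence", "correct_answer", "cefr_level",
    "target_vocabulary", "target_grammar", "submitted", "metadata",
}


def validate_item(item: dict) -> list[str]:
    """Return a list of problems; an empty list means the item looks usable."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(item))]
    if item.get("correct_answer") not in ("True", "False"):
        problems.append("correct_answer must be 'True' or 'False'")
    return problems


def append_item(item: dict, path: Path = Path("data/data.json")) -> None:
    """Append one tested item to the JSON array stored at `path`."""
    problems = validate_item(item)
    if problems:
        raise ValueError("; ".join(problems))
    # Assumption: the file, if present, already contains a JSON array.
    items = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    items.append(item)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(items, indent=4, ensure_ascii=False), encoding="utf-8")
```

Calling `append_item(obj)` once per tested object would build up the array; an object with missing keys raises `ValueError` instead of being written.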

CPTNFreedom commented 11 hours ago

“Cats are more important than dogs”: OpenAI rated it B1, CEFRpy rated it A1. I’m more inclined to agree with OpenAI there.

The generated image doesn’t seem to be saved to the JSON. When I tried to bring up an old JSON, the picture had changed slightly. It’s because output.json only updated the first time. Small issue, but it means I’ve generated a lot of images today.

Masters: used as a warm-up activity. Asked them to choose true or false and give an explanation. Good for some but too hard for others.

Moon hunters: activity works reasonably well. Had some discussion but not the best. Tried to get them to make their own. Still quite a time-consuming activity that way.

Avengers: used as a warm-up. Not much discussion coming from them, but it is a class that doesn’t have a lot of chatter, so it’s hard to say.

FBAddicts: had some decent conversation coming from them. They are much higher level so they were able to express some good ideas.

Other: I think some sort of schedule might be quite useful. Mandy and I have both used the TF on the same students on the same day. Doing it in both lessons may not be the best idea. Just an idea, may not be feasible or even useful.

Peter96K commented 14 minutes ago

M_HUNT : which discussion was good/bad? How long was the 'Idea-to-Ss'?

Peter96K commented 8 minutes ago

CEFR_PY : maybe this is a good candidate to be added to the SparkTank : "Track how often cefr_py correctly assesses word levels. Incorrect assessments should be flagged by an analyst, and over time we can evaluate and compare frameworks and decommission less accurate ones in favor of better alternatives."
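A rough sketch of what that tracking could look like, assuming each assessment is logged as a record with the framework name, its predicted level, and the level an analyst settled on. The record layout and the names `accuracy_by_framework` and `flagged` are hypothetical. The single example record pair is taken from the "Cats are more important than dogs" comment above, treating the analyst level as B1 only for illustration.

```python
from collections import defaultdict


def accuracy_by_framework(records):
    """Share of assessments per framework that match the analyst's level."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        totals[record["framework"]] += 1
        if record["predicted"].lower() == record["analyst"].lower():
            correct[record["framework"]] += 1
    return {name: correct[name] / totals[name] for name in totals}


def flagged(records):
    """Records an analyst would flag: prediction differs from the analyst level."""
    return [r for r in records if r["predicted"].lower() != r["analyst"].lower()]


# Illustrative data only: one sentence, two frameworks, analyst level assumed.
records = [
    {"text": "Cats are more important than dogs",
     "framework": "openai", "predicted": "b1", "analyst": "b1"},
    {"text": "Cats are more important than dogs",
     "framework": "cefr_py", "predicted": "a1", "analyst": "b1"},
]

print(accuracy_by_framework(records))
print([r["framework"] for r in flagged(records)])
```

One sentence is far too little to compare frameworks; the point is only that a flat log like this would be enough to compute per-framework accuracy once analysts have flagged a reasonable number of assessments.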