Closed by dcorney 2 months ago
Evaluation: I took about 200 extracted claims and compared them to the source transcripts (https://docs.google.com/spreadsheets/d/1DgvkLrLHfZeHJMB4kgZ2Y2ezsrecE9u1DwVrI-5ZKcs/edit#gid=82981182). About 95% were perfectly correct. Some of the others were hallucinated at the point where they were extracted, but the claim did actually appear elsewhere in the same video. One hallucinated claim was not in the video at all, but what it stated was 'correct'.
So in this sample, no misleading claims were generated by our use of Gemini.
Overview
Currently, when we ask Gemini to identify and extract claims, it paraphrases them. This is good because it makes the claims more readable than the raw transcript. Part of the aim is to make each claim standalone, so that it can be understood without extra context.
However, Gemini also tries to be helpful by adding extra context that isn't in the transcript. In some cases, this can change the meaning quite significantly.
E.g. if a transcript says "i love carrots you know they're so crunchy carrots make you see in the dark", Gemini may summarise this as "Carrots are good for night vision because they're rich in vitamin A".
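One rough way to spot this kind of added context is to list the content words in an extracted claim that never occur in the transcript. The sketch below is only an illustration of that idea, not how our pipeline works: the function name `ungrounded_terms`, the tokenisation, and the small stopword list are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately small stopword list for this sketch only.
STOPWORDS = {
    "a", "an", "the", "i", "you", "they're", "are", "is", "in", "of",
    "for", "to", "and", "so", "because",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ungrounded_terms(claim: str, transcript: str) -> list[str]:
    """Return content words in the claim that never occur in the transcript.

    This is a purely lexical check: it has no notion of synonyms or
    paraphrase, so it can only flag claims for a human to review.
    """
    transcript_words = set(tokenize(transcript))
    seen = set()
    missing = []
    for word in tokenize(claim):
        if word in STOPWORDS or word in transcript_words or word in seen:
            continue
        seen.add(word)
        missing.append(word)
    return missing


transcript = (
    "i love carrots you know they're so crunchy "
    "carrots make you see in the dark"
)
claim = "Carrots are good for night vision because they're rich in vitamin A"

# Words such as "vitamin" and "rich" are flagged; "carrots" is not.
print(ungrounded_terms(claim, transcript))
```

Because the check is lexical, it will also flag harmless rewording ("night vision" for "see in the dark"), so it is best treated as a triage signal for manual review rather than as a verdict on whether the meaning changed.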
Requirements