FullFact / health-misinfo-shared

Raphael health misinformation project, shared by Full Fact and Google
MIT License

Show a smaller extract of the transcript for each paraphrased claim #90

Closed dcorney closed 4 months ago

dcorney commented 4 months ago

Overview

[NB: this is based on the current dev branch, which is soon to be merged into main]

Currently, when the user hovers over an extracted claim, a tooltip pop-up shows an extract of the raw transcript. However, this is a "chunk", corresponding to about 1-2 minutes of the video. This can be quite long, making it hard to find the source of the claim.

Ideally:

  1. the genAI model extracting the claim should also return the original sentence;
  2. this should be stored in the database along with the inferred claim; and then
  3. just the sentence should be shown in the pop up.
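The three steps above could be sketched as follows. This is only illustrative: the JSON shape, the `ExtractedClaim` dataclass, and `parse_llm_response` are all hypothetical, though `raw_sentence_text` matches the existing field name in `inferred_claims`.

```python
import json
from dataclasses import dataclass


@dataclass
class ExtractedClaim:
    claim: str
    raw_sentence_text: str  # the single source sentence, not the whole chunk


def parse_llm_response(raw: str) -> list[ExtractedClaim]:
    """Parse a (hypothetical) JSON array of {claim, raw_sentence_text} objects
    returned by the updated prompt."""
    return [ExtractedClaim(**item) for item in json.loads(raw)]


# Example of the kind of response the updated prompt might produce:
response = (
    '[{"claim": "X causes Y",'
    ' "raw_sentence_text": "And of course we all know X causes Y."}]'
)
claims = parse_llm_response(response)
```

Each parsed claim would then be stored alongside its source sentence, and the tooltip would render only `raw_sentence_text`.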

Requirements

Update the prompt so that it returns the original sentence along with the inferred claim.

Ideally, this should also be correctly punctuated and with an initial capital letter for readability.
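Even with the prompt updated, a small defensive clean-up step might be worthwhile in case the model echoes back a lowercase, unpunctuated transcript fragment. A minimal sketch (the function name is hypothetical):

```python
def tidy_sentence(text: str) -> str:
    """Capitalise the first letter and ensure terminal punctuation.

    A defensive fallback for when the model returns a raw,
    unpunctuated transcript fragment.
    """
    text = text.strip()
    if not text:
        return text
    # Capitalise the first character only; leave the rest untouched.
    text = text[0].upper() + text[1:]
    # Append a full stop if there is no terminal punctuation already.
    if text[-1] not in ".!?":
        text += "."
    return text
```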

Notes and additional information

Relevant code locations:

process.py / extract_claims() gets the raw_sentence_text returned by the LLM. This is stored in inferred_claims.

templates/video_analysis.html defines list-group-item which shows claim['raw_sentence_text'] as the tooltip.

We might want to consider showing 2-3 sentences if just one doesn't provide enough context.

andylolz commented 4 months ago

Investigate why the model is returning such large blocks of text. Is the prompt wrong? Are we mixing up the sentence text with the chunk text at some point? Do we need to do some post-processing to work out where in the chunk this particular claim was made?

^^ The prompt we’re currently using doesn’t ask for sentence text, so it’s not included in the response.

Because we don’t have the sentence text, I put the chunk text into raw_sentence_text instead. That’s not what the field is intended for, though. I think the intention is to store the sentence text returned by the LLM there.
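If updating the prompt turns out not to be enough, one post-processing option would be to pick the chunk sentence most similar to the inferred claim. A sketch using standard-library fuzzy matching (`best_matching_sentence` is hypothetical, not existing project code):

```python
import difflib
import re


def best_matching_sentence(chunk_text: str, claim: str) -> str:
    """Return the chunk sentence most similar to the inferred claim.

    A possible workaround when the model's response does not include
    the source sentence itself. Similarity is difflib's ratio over
    lowercased text; naive sentence splitting as elsewhere.
    """
    sentences = re.split(r"(?<=[.!?])\s+", chunk_text)
    return max(
        sentences,
        key=lambda s: difflib.SequenceMatcher(
            None, s.lower(), claim.lower()
        ).ratio(),
    )
```

This is lossy compared with getting the sentence straight from the model, but it would at least stop the whole chunk landing in raw_sentence_text.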