FullFact / health-misinfo-shared

Raphael health misinformation project, shared by Full Fact and Google
MIT License

Investigate multimodal analysis of short-form videos by Gemini (spike) #156

Open dcorney opened 1 month ago


Overview

Short-form videos (e.g. TikTok, Instagram Reels, YouTube Shorts) have very different characteristics from longer-form videos. For example, they rely more on visual elements (including on-screen text); they may be reactions to other videos; and they may combine two or more videos into one. Before we can process them with our regular tools, we need to extract some kind of text representation.

Requirements

Explore whether Gemini, given a suitable prompt, can take a short-form video and return a 'narrative' of what is being said and what is being implied.
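
As a starting point for the spike, here is a minimal sketch of what such a prompt and call might look like. Everything in it is an assumption for illustration: the prompt wording, the names `NARRATIVE_PROMPT`, `build_narrative_request` and `narrate_video`, and the request shape are hypothetical and not taken from the Raphael codebase. The model call is injected as a plain callable, so the sketch does not depend on any particular Gemini client library.

```python
from typing import Callable

# Hypothetical prompt wording; this issue does not specify one.
# It covers the traits listed in the overview: speech, on-screen text,
# reaction/combined videos, and implied (unstated) claims.
NARRATIVE_PROMPT = (
    "Watch the attached short-form video and describe it as a narrative.\n"
    "1. Spoken content: summarise what is said aloud.\n"
    "2. On-screen text: transcribe any captions or overlaid text.\n"
    "3. Structure: note whether it reacts to, or combines, other videos.\n"
    "4. Implied claims: list what the video implies but does not state outright.\n"
)


def build_narrative_request(video_uri: str, prompt: str = NARRATIVE_PROMPT) -> dict:
    """Bundle a video reference and the prompt into one request (assumed shape)."""
    return {"video_uri": video_uri, "prompt": prompt}


def narrate_video(video_uri: str, generate: Callable[[dict], str]) -> str:
    """Return the model's narrative for a video.

    `generate` stands in for whatever function actually sends the request
    to Gemini; it takes the request dict and returns the response text.
    """
    request = build_narrative_request(video_uri)
    return generate(request).strip()


if __name__ == "__main__":
    # Stub used in place of a real model call, for a dry run without network access.
    def fake_generate(request: dict) -> str:
        return f"(stub narrative for {request['video_uri']})\n"

    print(narrate_video("gs://example-bucket/clip.mp4", fake_generate))
```

Keeping the model call behind a callable means different prompts can be compared on the same set of videos before any real client code is written.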

Explore how Raphael responds to these kinds of videos (we expect it will probably fail to do much with them).

### Tasks
- [ ] https://github.com/FullFact/health-misinfo-shared/issues/164
- [ ] https://github.com/FullFact/health-misinfo-shared/issues/167
- [ ] https://github.com/FullFact/health-misinfo-shared/issues/165
- [ ] https://github.com/FullFact/health-misinfo-shared/issues/166