Overview
Short-form videos (e.g. TikTok, Instagram Reels, YouTube Shorts) differ markedly from longer-form videos: they rely more heavily on visual elements (including on-screen text), they may be reactions to other videos, and they may combine two or more videos into one. Before we can process them with our regular tools, we need to extract some kind of text representation.
Requirements
Explore whether Gemini, given a suitable prompt, can take a short-form video and return a 'narrative' of what is being said and what is being implied.
Explore how Raphael responds to these kinds of videos (it will probably fail to do much with them).
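
As a starting point for the Gemini exploration, a prompt along the following lines might be worth trying. This is only a sketch: the prompt wording, the function names, and the `ask_model` adapter are all assumptions made for illustration and are not part of any existing tool. `ask_model` stands in for whatever Gemini client call ends up being used, so the sketch makes no claim about the actual API.

```python
from typing import Callable

# Hypothetical prompt wording; the requirements above do not prescribe one.
NARRATIVE_PROMPT = """\
You are given a short-form video (for example a TikTok, Instagram Reel, or YouTube Short).
Write a plain-text narrative of the video that covers:
1. What is said: spoken words, attributed to speakers where possible.
2. What is shown: on-screen text and key visual elements.
3. How it is put together: whether it reacts to, or combines, other videos.
4. What is implied: the message a viewer would take away that is not stated outright.
"""


def build_narrative_prompt(extra_context: str = "") -> str:
    """Return the narrative prompt, optionally extended with known context."""
    prompt = NARRATIVE_PROMPT
    if extra_context:
        prompt += f"\nAdditional context: {extra_context}\n"
    return prompt


def video_to_narrative(
    video_path: str,
    ask_model: Callable[[str, str], str],
    extra_context: str = "",
) -> str:
    """Ask a model for a narrative of the video at video_path.

    ask_model(video_path, prompt) is an assumed adapter that sends the video
    and the prompt to the model and returns the model's text reply.
    """
    reply = ask_model(video_path, build_narrative_prompt(extra_context))
    return reply.strip()
```

Keeping the model call behind an adapter means the same prompt can be tried against different models or client libraries, and the resulting narrative can then be passed to Raphael for comparison with its behaviour on the raw video.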