Deaddawn closed this issue 8 months ago
Hi, yes, you can run inference on long videos without subtitles. In that case, however, the model cannot know the name of each character or exactly what happens along the video's timeline. It may then answer in generic terms such as "a man is doing xxx and then he xxx", which can lose the storyline of the video.
Hi there. I am wondering whether it is possible to run inference on long videos without subtitles, just as with short videos?