Nicous20 / FunQA

FunQA benchmarks funny, creative, and magic videos for challenging tasks including timestamp localization, video description, reasoning, and beyond.
https://funqa-benchmark.github.io/
MIT License
94 stars, 1 fork

HumorQA set annotation: Mismatch between ground truth annotation timestamp and video sample time stamp #15

Open Hasnat79 opened 1 week ago

Hasnat79 commented 1 week ago

Hi @Nicous20 and everyone, while going through the test samples and annotations, I found that the ground-truth start and end timestamps do not match the test sample durations. [image]

Can you please provide us with the annotations aligned within the sample video durations? It would be helpful to calculate IoU scores in localization tasks over this valuable dataset. Otherwise, we cannot match the ground truth time span with the prediction time span.

Thanks.
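For reference, the IoU mentioned above can be sketched as follows. This is a minimal illustration, not code from the FunQA repository: the function name `temporal_iou` and the `(start, end)` tuple layout are assumptions, and both spans must be expressed in the same unit (both in frames or both in seconds) for the score to be meaningful.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two (start, end) spans in the same unit.

    Hypothetical helper for illustration; not part of the FunQA codebase.
    """
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the spans do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0
```

If the prediction is in seconds and the ground truth is in frames, the overlap is computed between incompatible ranges, which is the mismatch described in this issue.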

Jingkang50 commented 1 week ago

@Nicous20 I think the output is a frame id, right? That is, frame_id = idx_in_second * fps. Could you confirm which fps is in use?

Nicous20 commented 1 week ago

@Hasnat79 HumorQA and MagicQA are annotated in frames, and CreativeQA is annotated in seconds. "0 200" means frames 0 to 200 (the whole video), and the fps is set to 30.
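The unit convention above can be sketched as a small conversion step. This is an illustration under stated assumptions: the helper `span_to_seconds` and the use of subset names as plain strings are hypothetical, and only the 30 fps value and the frames/seconds split come from this reply.

```python
FPS = 30  # fps stated in this thread

# Subsets annotated in frames, per this thread; CreativeQA is in seconds.
FRAME_BASED = {"HumorQA", "MagicQA"}


def span_to_seconds(subset, start, end, fps=FPS):
    """Return a (start, end) span in seconds for the given subset.

    Hypothetical helper for illustration; not part of the FunQA codebase.
    """
    if subset in FRAME_BASED:
        # Frame indices divided by fps give timestamps in seconds.
        return start / fps, end / fps
    # Already in seconds; only normalize the type.
    return float(start), float(end)
```

With these assumptions, `span_to_seconds("HumorQA", 0, 200)` gives a span from 0 to roughly 6.67 seconds, which can then be compared against predictions expressed in seconds.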

Hasnat79 commented 1 week ago

Hi guys, I think it makes sense now: 200 frames // 30 fps = 6 sec. Could you please tell me approximately what percentage of cases have the whole video (rather than a smaller segment) annotated as humorous/funny?

Thanks.

Nicous20 commented 1 week ago

@Hasnat79 Annotators were asked to avoid selecting the entire video as the segment as much as possible, so the proportion is very small, as shown in Figure 2(c). (The red areas indicate the frequently selected regions.) [image]