facebookresearch / AVT

Code release for ICCV 2021 paper "Anticipative Video Transformer"
Apache License 2.0

Regarding sampling method and strategy #19

Closed by ofir1080 2 years ago

ofir1080 commented 2 years ago

Hi again :) I have a few questions about the frame sampling method and would really appreciate some clarification :) Looking at the settings for expt. 09_ek55_avt, the following parameters are:

As I see it, in practice only 10 frames are sampled out of a 20-second input clip (1200 frames), where each frame represents a 1-second sub-clip. As a result (due to req_fps=1 and sampling_strategy='last_clip'), the model only looks at the last 10 seconds. Is that correct? If so, what is the actual role of tau_o?

Thank you very much!

rohitgirdhar commented 2 years ago

Hi, yes, you are correct: the effective \tau_o in this case is 10s, so we could just set tau_o to 10s and get the same results.
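
To make the arithmetic concrete, here is a minimal sketch of why tau_o=20s and tau_o=10s end up covering the same window when 10 frames are sampled at req_fps=1 from the end of the clip. This is not the AVT implementation: the function names and the `num_frames` argument are hypothetical and only illustrate the reasoning discussed above.

```python
def effective_observed_seconds(tau_o_s, num_frames, req_fps):
    """Seconds actually covered by num_frames samples at req_fps, capped by tau_o."""
    return min(tau_o_s, num_frames / req_fps)


def last_clip_timestamps(tau_o_s, num_frames, req_fps):
    """Start times (seconds from the clip start) of the samples taken from the
    END of the observed clip. Hypothetical helper, assuming 'last_clip' keeps
    only the most recent samples."""
    covered = effective_observed_seconds(tau_o_s, num_frames, req_fps)
    n = int(covered * req_fps)
    step = 1.0 / req_fps
    return [tau_o_s - (n - i) * step for i in range(n)]


# Offsets relative to the end of the observed clip, for two tau_o values.
offsets_20 = [t - 20 for t in last_clip_timestamps(20, 10, 1)]
offsets_10 = [t - 10 for t in last_clip_timestamps(10, 10, 1)]
print(effective_observed_seconds(20, 10, 1))
print(offsets_20 == offsets_10)
```

Under these assumptions, both settings sample the same ten 1-second positions before the end of the clip, which matches the observation that the first 10 seconds of a 20-second clip are never looked at.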

ofir1080 commented 2 years ago

Got it, thank you!

So just to clarify: as I understand it, each sub-clip in all the experiments is represented by a single frame, right? Does it support more than one frame per sub-clip? And about the stride, it is always set to 1. Modifying this parameter will change the sampling rate, won't it? Thanks @rohitgirdhar!

rohitgirdhar commented 2 years ago

Ah, sorry for missing this question. It should support >1 frame per sub-clip, which is what I used to extract irCSN-152 features, though none of the released configs need it. The stride in this case defines how the sub-clips are sampled, not how the frames are sampled within a sub-clip: https://github.com/facebookresearch/AVT/blob/2d6781d5315a4c53bd059b1cd11ee46bd4427648/datasets/base_video_dataset.py#L699 By default it is set to the number of frames in the sub-clip, so it selects non-overlapping sub-clips (https://github.com/facebookresearch/AVT/blob/2d6781d5315a4c53bd059b1cd11ee46bd4427648/conf/data/default.yaml#L13)
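
As a rough illustration of the stride behaviour described above (a sketch only, not the code at the linked line; `split_into_subclips` and its arguments are hypothetical names), the stride moves the starting point of each sub-clip, and defaulting it to the sub-clip length gives non-overlapping sub-clips:

```python
def split_into_subclips(frame_indices, frames_per_subclip, subclip_stride=None):
    """Group already-sampled frame indices into sub-clips.

    subclip_stride is the step between the START of consecutive sub-clips.
    It does not change how frames are spaced inside a sub-clip.
    Assumption: when unset, it defaults to frames_per_subclip.
    """
    if subclip_stride is None:
        subclip_stride = frames_per_subclip
    last_start = len(frame_indices) - frames_per_subclip
    return [
        frame_indices[start:start + frames_per_subclip]
        for start in range(0, last_start + 1, subclip_stride)
    ]


frames = list(range(6))
print(split_into_subclips(frames, 2))     # default stride: no overlap
print(split_into_subclips(frames, 2, 1))  # stride 1: overlapping sub-clips
print(split_into_subclips(frames, 1))     # one frame per sub-clip
```

With one frame per sub-clip and the default stride, every sampled frame becomes its own sub-clip, which is consistent with the released configs not needing multi-frame sub-clips.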

ofir1080 commented 2 years ago

Thanks!