Closed: mostar39 closed this issue 3 years ago
I also have a question about this line in the config file: `dict(type='SampleAVAFrames', clip_len=32, frame_interval=2)`. What does `clip_len` mean? My data was also cut from video into frames at 30 fps, like the AVA data. Is `clip_len` still meaningful in that case?
`clip_len` means the roi_feature of the bbox is extracted from a clip spanning `clip_len x frame_interval` frames. So keep its original value.
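To make the arithmetic concrete, here is a minimal sketch of what a window of `clip_len x frame_interval` frames looks like around a keyframe. It is not the actual `SampleAVAFrames` code: the function names `clip_span` and `sample_window`, the centering on the keyframe, and the clamping to the video bounds are all assumptions made for illustration.

```python
def clip_span(clip_len=32, frame_interval=2):
    """Number of raw frames covered by one sampled clip."""
    return clip_len * frame_interval


def sample_window(center_index, clip_len=32, frame_interval=2,
                  first_frame=0, last_frame=None):
    """Hypothetical helper: pick clip_len frame indices, spaced by
    frame_interval, roughly centered on center_index.

    Indices that fall outside [first_frame, last_frame] are clamped,
    which is one possible way to handle very short videos.
    """
    start = center_index - clip_span(clip_len, frame_interval) // 2
    inds = [start + i * frame_interval for i in range(clip_len)]
    if last_frame is None:
        return [max(i, first_frame) for i in inds]
    return [min(max(i, first_frame), last_frame) for i in inds]


if __name__ == "__main__":
    span = clip_span(32, 2)
    print(span)                    # 64 raw frames per clip
    print(round(span / 30, 2))     # about 2.13 seconds at 30 fps
    print(sample_window(100)[:3])  # [68, 70, 72]
```

With `clip_len=32` and `frame_interval=2`, each clip covers 64 raw frames, a little over two seconds of 30 fps video, so the setting stays meaningful for any data extracted at that frame rate.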
I am training SlowFast for Spatial Temporal Action Detection on my own data. In the AVA data, I saw that the video length was fixed, from 902 to 1798. My videos are all very short, and each one has a different length. I changed `timestamp_start` in ava_dataset.py to 1 to match my data. Is it OK to change it like this? Also, I don't know how to set `timestamp_end`, because the video lengths differ. Will training have any problems if the videos have different lengths?