hufengshuo07 closed this issue 3 years ago
Hi @hufengshuo07,
The augmentations differ from those described in the paper because some data from the original Kinetics-400 is missing. Stronger augmentations help us catch up with the baseline model, and all models are trained with the same augmentations.
In the paper, I found that you mention "Each frame is randomly cropped so that its short side ranges in [256, 320] pixels, as in [32, 5, 25]." But in the code, frames are randomly cropped with an area fraction in [0.08, 1]. Why do the two differ? Are such potentially very small crops meaningful for training the model? And is color jittering necessary for generalization on this task? Thanks!!
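To make the difference between the two cropping schemes concrete, here is a minimal sketch in plain Python. It is not the repository's code. It only estimates what fraction of the frame each scheme keeps. The 224-pixel crop size, the 340x256 frame size, and the aspect-ratio range of [3/4, 4/3] are assumptions for illustration; only the [256, 320] short-side range and the [0.08, 1] area range come from the discussion above.

```python
import math
import random


def scale_jitter_fraction(width, height, rng, short_range=(256, 320), crop=224):
    """Paper-style scheme: resize so the short side is drawn from short_range,
    then take a crop x crop window. Returns the fraction of the frame kept.
    crop=224 is an assumed value, not taken from the thread."""
    short = rng.randint(*short_range)
    ratio = short / min(width, height)
    new_w, new_h = round(width * ratio), round(height * ratio)
    return (crop * crop) / (new_w * new_h)


def area_crop_fraction(width, height, rng, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)):
    """Code-style scheme: sample a crop by area fraction and aspect ratio.
    Returns the fraction of the frame kept. The ratio range is an assumption."""
    for _ in range(10):
        target_area = rng.uniform(*scale) * width * height
        # Sample the aspect ratio uniformly in log space.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = round(math.sqrt(target_area * aspect))
        h = round(math.sqrt(target_area / aspect))
        if 0 < w <= width and 0 < h <= height:
            return (w * h) / (width * height)
    # Fallback when no valid crop was found: a square crop on the short side.
    side = min(width, height)
    return (side * side) / (width * height)


if __name__ == "__main__":
    rng = random.Random(0)
    jitter = [scale_jitter_fraction(340, 256, rng) for _ in range(1000)]
    area = [area_crop_fraction(340, 256, rng) for _ in range(1000)]
    print("scale jitter: min %.2f max %.2f" % (min(jitter), max(jitter)))
    print("area crop:    min %.2f max %.2f" % (min(area), max(area)))
```

Under these assumptions, the scale-jitter scheme always keeps a moderate share of the frame, while the area-based scheme can keep anything from a small patch to the full frame, which is why it acts as a stronger augmentation.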