decisionforce / TPN

[CVPR 2020] Temporal Pyramid Network for Action Recognition
https://decisionforce.github.io/TPN/
Apache License 2.0
394 stars, 55 forks

About data augmentation. #31

Closed. hufengshuo07 closed this issue 3 years ago

hufengshuo07 commented 3 years ago

In the paper, I see you mention: "Each frame is randomly cropped so that its short side ranges in [256, 320] pixels, as in [32, 5, 25]." But in the code, frames are randomly cropped to an area fraction in [0.08, 1] of the original frame. Why the difference? Are crops that can be this small still meaningful for training the model? And is color jittering necessary for generalization on this task? Thanks!

limbo0000 commented 3 years ago

Hi @hufengshuo07,

We use different data augmentations because some of the original Kinetics-400 data is missing. Stronger augmentations help us catch up with the baseline model. For that reason, all models are trained with the same augmentations.
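
For readers comparing the two augmentation schemes discussed in this thread, here is a minimal sketch in plain Python. It is not the repository's actual pipeline. The function names, the aspect-ratio range of [3/4, 4/3], the retry count, and the center-square fallback are all assumptions made for illustration. Only the short-side range [256, 320] and the area range [0.08, 1] come from the discussion above.

```python
import math
import random


def short_side_jitter(width, height, rng, low=256, high=320):
    """Paper-style scale jitter (sketch): pick a short-side length in
    [low, high] and return the resized (width, height), keeping aspect ratio."""
    target = rng.randint(low, high)
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def area_based_crop(width, height, rng, area_range=(0.08, 1.0),
                    ratio_range=(3 / 4, 4 / 3), attempts=10):
    """Code-style crop (sketch): sample a box (left, top, crop_w, crop_h)
    whose area is a random fraction of the frame area.

    ratio_range, attempts, and the fallback are assumptions, not values
    taken from the thread.
    """
    for _ in range(attempts):
        target_area = rng.uniform(*area_range) * width * height
        # Sample the aspect ratio log-uniformly so wide and tall boxes
        # are equally likely.
        log_ratio = rng.uniform(math.log(ratio_range[0]),
                                math.log(ratio_range[1]))
        ratio = math.exp(log_ratio)
        crop_w = round(math.sqrt(target_area * ratio))
        crop_h = round(math.sqrt(target_area / ratio))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Fallback when no sampled box fits: centered square on the short side.
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side, side


rng = random.Random(0)
print(short_side_jitter(340, 256, rng))  # resized frame size
print(area_based_crop(340, 256, rng))    # crop box inside the frame
```

The practical difference is that the first scheme only varies the scale of the frame, while the second can keep as little as roughly 8% of the frame area, which is why it acts as the stronger augmentation mentioned in the reply.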