Open Aayush2007 opened 2 years ago
Yes, as mentioned in our paper, we adopted three data augmentation strategies during training: random crop, horizontal flip, and random temporal scaling.
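For readers who want a concrete picture of the three augmentations named above, here is a minimal pure-Python sketch. It is not the repository's implementation: the function names, the crop size, the flip probability, and the ±20% temporal range are all illustrative assumptions, and a clip is modelled simply as a list of frames, each frame being a list of pixel rows.

```python
import random


def random_crop(frames, size, rng):
    """Crop the same randomly placed size x size window from every frame."""
    height, width = len(frames[0]), len(frames[0][0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [
        [row[left:left + size] for row in frame[top:top + size]]
        for frame in frames
    ]


def random_horizontal_flip(frames, prob, rng):
    """With probability prob, mirror every frame left to right."""
    if rng.random() < prob:
        return [[row[::-1] for row in frame] for frame in frames]
    return frames


def random_temporal_rescale(frames, max_change, rng):
    """Resample the clip to a random length within +/- max_change of the original."""
    n = len(frames)
    scale = 1.0 + rng.uniform(-max_change, max_change)
    new_len = max(1, round(n * scale))
    # Evenly spaced source indices: frames repeat when stretching, drop when shrinking.
    indices = [min(n - 1, int(i * n / new_len)) for i in range(new_len)]
    return [frames[i] for i in indices]


def augment(frames, crop_size=224, flip_prob=0.5, max_change=0.2, seed=None):
    """Apply crop, flip and temporal rescaling in sequence (assumed defaults)."""
    rng = random.Random(seed)
    frames = random_crop(frames, crop_size, rng)
    frames = random_horizontal_flip(frames, flip_prob, rng)
    return random_temporal_rescale(frames, max_change, rng)
```

The crop window and the flip decision are drawn once per clip, so all frames of a video stay spatially consistent with each other; only the temporal step changes the number of frames.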
@ycmin95, why did you choose horizontal flip? Doesn't it change the meaning of a sign?
Relevant information can be found in Table 3 (ablation results of augmentation) of the supplementary material, which can be downloaded from: https://openaccess.thecvf.com/content/ICCV2021/supplemental/Min_Visual_Alignment_Constraint_ICCV_2021_supplemental.zip
What video augmentation options were used for the pre-trained model ([Dropbox])? In dataset/dataloader_video.py I can see that these are the ones left uncommented; is that also the case for the pre-trained model?