hustvl / Vim

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Apache License 2.0
2.55k stars, 159 forks

Image Augmentation #73

Open chokevin8 opened 2 months ago

chokevin8 commented 2 months ago

Hi, I am aware that the authors used random cropping, random horizontal flipping, label-smoothing regularization, mixup, and random erasing as data augmentations. However, there is no ablation study on these augmentations. Would the performance of Vision Mamba decrease if the data augmentation were removed or modified?
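To make the question concrete, the kind of ablation I have in mind could be laid out roughly as below. This is only a sketch of the experiment grid. The technique names are hypothetical labels I made up for illustration, not actual arguments of the Vim training script, and nothing here trains a model.

```python
# Hypothetical labels for the five techniques listed above; they are
# illustrative only and are NOT arguments of the Vim training script.
AUGMENTATIONS = (
    "random_crop",
    "horizontal_flip",
    "label_smoothing",
    "mixup",
    "random_erasing",
)


def ablation_configs(augmentations=AUGMENTATIONS):
    """Return named on/off settings: full baseline, leave-one-out, and none."""
    # Baseline: every technique enabled.
    configs = {"baseline": {name: True for name in augmentations}}
    # Leave-one-out: disable exactly one technique per run.
    for dropped in augmentations:
        configs[f"no_{dropped}"] = {name: name != dropped for name in augmentations}
    # Lower bound: every technique disabled.
    configs["no_augmentation"] = {name: False for name in augmentations}
    return configs


if __name__ == "__main__":
    for run_name, flags in ablation_configs().items():
        enabled = [name for name, on in flags.items() if on]
        print(f"{run_name}: {', '.join(enabled) or 'none'}")
```

With the five techniques above this gives seven runs: the full baseline, one run per removed technique, and one run with everything disabled. Comparing each leave-one-out run against the baseline would show how much each technique contributes.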