Open Phuoc-Hoan-Le opened 2 years ago
Is random resized crop used when pretraining vision transformers on imagenet21k? Or is squish-resizing used?
The images are pre-processed with Inception-style cropping, i.e. a random resized crop rather than squish-resizing; see Section 3.4 of the "How to train your ViT?" paper.
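For anyone unsure what Inception-style cropping means in practice, here is a minimal stdlib-only sketch of how the crop box is typically sampled. The function name `sample_inception_crop` is hypothetical, and the default area range (8% to 100%) and aspect-ratio range (3/4 to 4/3) are the ones described in the original Inception paper. They are an assumption here, not necessarily the exact values used for the ImageNet-21k pretraining runs.

```python
import math
import random


def sample_inception_crop(width, height, area_range=(0.08, 1.0),
                          ratio_range=(3 / 4, 4 / 3), max_attempts=10,
                          rng=random):
    """Return (left, top, crop_w, crop_h) for an Inception-style random crop.

    Hypothetical helper; defaults are assumptions taken from the original
    Inception paper, not from the ViT training code.
    """
    for _ in range(max_attempts):
        # Pick a target area as a random fraction of the full image area.
        target_area = rng.uniform(*area_range) * width * height
        # Sample the aspect ratio log-uniformly so r and 1/r are equally likely.
        log_ratio = rng.uniform(math.log(ratio_range[0]),
                                math.log(ratio_range[1]))
        ratio = math.exp(log_ratio)
        crop_w = int(round(math.sqrt(target_area * ratio)))
        crop_h = int(round(math.sqrt(target_area / ratio)))
        # Accept the box only if it fits inside the image.
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Fallback after too many rejected samples: use the whole image.
    return 0, 0, width, height
```

The sampled box is then cut out and resized to the training resolution (e.g. 224x224), so only the cropped region is stretched. Squish-resizing, by contrast, would resize the entire image to the target size without cropping.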