pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Add 3-augment from DeiT III #8514

Open trawler0 opened 2 months ago

trawler0 commented 2 months ago

🚀 The feature

As the title suggests, add the 3-Augment data augmentation from DeiT III: https://arxiv.org/abs/2204.07118

Motivation, pitch

This seems to be a simple recipe with good results, and the DeiT family is widely recognized.

Alternatives

No response

Additional context

No response

Bhavay-2001 commented 1 month ago

Hi @trawler0, did you mean something like this?

trawler0 commented 1 month ago

Hey @Bhavay-2001! Yes, that's what I mean. I should probably have attached the file directly. The recipe helped them train very large ViT models from scratch on ImageNet, and they got amazing results.

Bhavay-2001 commented 1 month ago

Hi @NicolasHug, any views on this? Should I try to add this augmentation to torchvision?

NicolasHug commented 1 month ago

Hi @trawler0, thank you for the feature request. Sure, I think this can be in scope of our augmentation strategies, even if ultimately the implementation will just be a Compose of a few building blocks.

@Bhavay-2001 thanks for offering your help. Let's see if @trawler0 would like to give this a go first, and if not then I'm happy for you to get to it. Thanks!

trawler0 commented 1 month ago

Hi @NicolasHug, I am happy to add this feature.