huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0
31.88k stars 4.73k forks

Progressive learning of EfficientNetV2 #718

Open pawopawo opened 3 years ago

pawopawo commented 3 years ago

Will you try to reproduce the Progressive Learning from EfficientNetV2?

liu-zhenhua commented 1 year ago

+1

fffffgggg54 commented 1 year ago

I'm interested in writing a PR for this, since I use it in my own training scripts. I have it implemented by modifying the dataset transforms every epoch.

IME the main issue is that the start of training uses far less VRAM than the end of training. Additional throughput can be had by adjusting batch size / gradient accumulation to keep VRAM usage high, but implementing that adjustment is nightmarish. I tried halving/doubling those two values, respectively, but the VRAM would not deallocate. It might work better with the timm script, since it's set up differently.
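For reference, the per-epoch transform adjustment described above can be sketched roughly as follows. This is a hypothetical, minimal illustration (not timm's implementation): it linearly ramps image size and augmentation magnitude from "easy" to "hard" settings over training, in the spirit of the EfficientNetV2 paper. The function name and the min/max values are assumptions chosen for illustration.

```python
# Hypothetical sketch of a progressive-learning schedule: linearly
# interpolate image size and RandAugment magnitude across epochs.
# The ranges below are illustrative, not the paper's exact values.
def progressive_settings(epoch, total_epochs,
                         size_range=(128, 300),      # (start, end) image size, assumed
                         magnitude_range=(5, 15)):   # (start, end) augment magnitude, assumed
    """Return (image_size, augment_magnitude) for the given epoch."""
    # Fraction of training completed, in [0, 1].
    t = epoch / max(total_epochs - 1, 1)
    size = int(size_range[0] + t * (size_range[1] - size_range[0]))
    magnitude = magnitude_range[0] + t * (magnitude_range[1] - magnitude_range[0])
    return size, magnitude
```

In a training loop, one would call this at the start of each epoch and rebuild the dataset's transform accordingly, e.g. assigning a new `transforms.Compose([...RandomResizedCrop(size), RandAugment(magnitude=int(magnitude))...])` to `dataset.transform` before iterating.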