[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
I'm trying to run pretraining with ResNet-50 on my own data, and I'm running into out-of-memory issues.
Initially I was using two V100s (32 GB each), and the maximum batch size I could reach was 256. However, I can't go higher even with larger-memory GPUs: I tried an A100 in both its 40 GB and 80 GB variants, and the maximum batch size I could use without hitting out-of-memory errors was still 256.
I'm a bit confused and wondering whether there's a gap in my understanding; let me know if I'm missing anything!
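For reference, here is roughly how I've been sanity-checking where the memory wall sits. It's a generic single-GPU sketch with a plain torchvision ResNet-50 forward/backward, not the actual SparK pretraining step (no sparse masking or decoder), so absolute numbers will differ; I'm only using it to see how peak memory scales with batch size:

```python
import torch
import torchvision

# Rough probe: peak GPU memory for one forward/backward pass of a plain
# torchvision ResNet-50 at a given batch size. NOT the SparK pretraining
# step, so absolute numbers will differ from the real pipeline.
def peak_mem_gb(batch_size, image_size=224, device="cuda"):
    model = torchvision.models.resnet50().to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(batch_size, 3, image_size, image_size, device=device)
    y = torch.randint(0, 1000, (batch_size,), device=device)

    torch.cuda.reset_peak_memory_stats(device)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    return torch.cuda.max_memory_allocated(device) / 1024**3

if __name__ == "__main__":
    for bs in (64, 128, 256):
        print(f"batch {bs}: ~{peak_mem_gb(bs):.1f} GB peak")
```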
Hi @knightron0, if a batch size of 256 already maxes out a 32 GB V100, then a 40 GB A100 topping out at a similar batch size is expected.
FYI: we used 32 × 80 GB A100s for ResNet-50 pretraining, with a per-GPU batch size of 128, and that was fine.
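If a larger effective batch is the goal, gradient accumulation gets you there without holding more samples in memory at once. Below is a minimal, self-contained sketch in plain PyTorch (not this repo's training loop; the toy model and random data are placeholders) showing an effective batch of 256 built from micro-batches of 128:

```python
import torch
import torch.nn as nn

# Gradient-accumulation sketch (generic PyTorch, not SparK's loop): reach an
# effective batch of 256 while only keeping micro-batches of 128 in memory.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                      nn.Linear(16, 10)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)

micro_bs, accum_steps = 128, 2           # 128 x 2 = effective batch of 256
optimizer.zero_grad()
for step in range(4):                    # stand-in for iterating a DataLoader
    x = torch.randn(micro_bs, 3, 32, 32, device=device)
    y = torch.randint(0, 10, (micro_bs,), device=device)
    loss = nn.functional.cross_entropy(model(x), y) / accum_steps  # average grads
    loss.backward()                      # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()                 # one update per effective batch
        optimizer.zero_grad()
```

The per-step statistics (e.g. BatchNorm) still only see 128 samples at a time, so it is not numerically identical to a true batch of 256, but the gradient update is averaged over the full effective batch.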