HAT: Hybrid Attention Transformer for Image Restoration (CVPR 2023: Activating More Pixels in Image Super-Resolution Transformer)
OOM Issue (A6000 GPU, Batch Size 8 per GPU) #144
Open
2minkyulee opened 1 month ago
Thanks for your great work!
I'm hitting an out-of-memory (OOM) error with the configuration in the title, batch size 8 per GPU on an A6000, which is probably the default training setting on a GPU with the same VRAM size.
Disabling the CUDA prefetcher didn't help. A batch size of 7 per GPU trains fine, so the overshoot seems marginal.
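For reference, here is roughly how I compared peak VRAM across batch sizes (a minimal sketch; the dummy input shape and the L1 loss are my assumptions, not the exact training pipeline):

```python
import torch
import torch.nn.functional as F

def peak_vram_gb(model, batch_size, patch_size=64):
    """Run one forward/backward pass on dummy data and report peak VRAM (GB)."""
    torch.cuda.reset_peak_memory_stats()
    model = model.cuda().train()
    # Dummy LR batch at the training patch size (placeholder for real data).
    lr = torch.randn(batch_size, 3, patch_size, patch_size, device='cuda')
    sr = model(lr)
    # L1 loss against a dummy target, standing in for the real pixel loss.
    F.l1_loss(sr, torch.randn_like(sr)).backward()
    return torch.cuda.max_memory_allocated() / 1024 ** 3

# e.g. peak_vram_gb(hat_model, 7) fits, while peak_vram_gb(hat_model, 8) OOMs.
```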
Gradient checkpointing also worked, but it slowed training noticeably, so an alternative would be preferable (a sketch of what I tried is below).
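For completeness, this is roughly how I enabled gradient checkpointing (a sketch; `model.layers` is my assumption about where the residual groups live, so the attribute name may need adjusting for the actual HAT implementation):

```python
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointWrapper(nn.Module):
    """Recomputes a block's activations in backward instead of caching them,
    trading extra compute (hence the slowdown) for lower peak VRAM."""
    def __init__(self, block):
        super().__init__()
        self.block = block

    def forward(self, *args):
        return checkpoint(self.block, *args, use_reentrant=False)

# Hypothetical usage: wrap each residual group before training starts.
# model.layers = nn.ModuleList(CheckpointWrapper(l) for l in model.layers)
```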
Are there any configuration options I might have missed that would reduce VRAM usage? Or do you have any other suggestions?