facebookresearch / deit

Official DeiT repository
Apache License 2.0

Uneven memory usage among GPUs with DistributedDataParallel #194

Closed Phuoc-Hoan-Le closed 1 year ago

Phuoc-Hoan-Le commented 1 year ago

When using distributed training, one GPU uses 20 GB of memory while another uses 30 GB. Has anyone else run into this imbalance?