facebookresearch / vicreg

VICReg official code base

Does this code need large GPU memory? #7

Closed bqdeng closed 2 years ago

bqdeng commented 2 years ago

Hello! When training on the CIFAR-10 dataset with the batch size set to 2048, have you run into out-of-memory errors? On my dual NVIDIA RTX 3090 setup, the GPU memory overflows.

So I reduced the batch size to 256, which still overflows memory.

In the end I had no choice but to drop it to 128 before the code would run.

However, with the SimCLR and SwAV codebases on the same hardware, the usable batch size is not nearly so small; I can generally run 2048 or 1024. Is this normal?

My setup is two NVIDIA RTX 3090s with 48 GB of video memory in total, and the training dataset is CIFAR-10.

If you can spare the time to answer, I would be very grateful!

Adrien987k commented 2 years ago

Hi,

I haven't tried CIFAR-10, but on ImageNet with images of size 3x224x224, a batch of size 2048 does not fit on a single GPU; it is distributed across 32 GPUs, so the batch size on a single GPU is 64.

Given that CIFAR images are much smaller, you should be able to fit much bigger batches on your GPUs, especially with 48 GB of memory.
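
To put rough numbers on both points (a back-of-the-envelope sketch, not code from this repo):

```python
# Illustrative arithmetic for the setups discussed above.

# ImageNet run: a global batch of 2048 spread over 32 GPUs.
per_gpu_batch = 2048 // 32
print(per_gpu_batch)  # -> 64 images per GPU

# Input size per image: ImageNet crop vs. CIFAR-10 image.
imagenet_values = 3 * 224 * 224  # 150,528 values per image
cifar_values = 3 * 32 * 32       # 3,072 values per image
print(imagenet_values // cifar_values)  # -> 49, i.e. ~49x smaller inputs on CIFAR
```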

I think the bottleneck in that case might be the projector, which has a lot of parameters. Can you try running the code with --mlp 4096-128 instead of --mlp 8192-8192-8192?
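
For a rough sense of the difference, here is a minimal sketch of a projector built from an --mlp-style spec, assuming a ResNet-50 backbone with a 2048-d representation; the helper functions are illustrative, not the repo's exact code:

```python
import torch.nn as nn

def projector(mlp_spec: str, embedding_dim: int = 2048) -> nn.Sequential:
    # Build an MLP from a spec like "8192-8192-8192" (hypothetical helper;
    # the 2048-d input assumes a ResNet-50 backbone).
    dims = [embedding_dim] + [int(d) for d in mlp_spec.split("-")]
    layers = []
    for i in range(len(dims) - 2):
        layers += [
            nn.Linear(dims[i], dims[i + 1]),  # hidden linear layer
            nn.BatchNorm1d(dims[i + 1]),      # batch norm
            nn.ReLU(inplace=True),            # non-linearity
        ]
    layers.append(nn.Linear(dims[-2], dims[-1], bias=False))  # output layer
    return nn.Sequential(*layers)

def count_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

print(count_params(projector("8192-8192-8192")))  # ~151M parameters
print(count_params(projector("4096-128")))        # ~8.9M parameters
```

Under these assumptions the default projector carries roughly 151M parameters versus roughly 8.9M for 4096-128, so shrinking it (along with its activations and optimizer state) can free a substantial amount of GPU memory.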

bqdeng commented 2 years ago

Thank you for your answer.

Your intuition is right. I need to think this over again; thank you for your reply. If possible, please leave this issue open for the time being. In a few days I'll share the results of running the code on a small dataset, as a conclusion to this question.

Thank you again for your great work and enthusiastic reply!

Adrien987k commented 2 years ago

I am closing the issue. If you want to chat more, you can contact me at abardes@fb.com