lucidrains / byol-pytorch

Usable implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in PyTorch

GPU Memory Usage Extremely High #94

Closed ClemensSchwarke closed 2 months ago

ClemensSchwarke commented 2 months ago

Hi, I am not an expert in PyTorch and would appreciate some help understanding my VRAM utilization. When I exchange

https://github.com/lucidrains/byol-pytorch/blob/0c3ab5409181852f8495ef924dce9186f94d9126/byol_pytorch/byol_pytorch.py#L265

with

images = torch.rand(128, 3, 256, 256, device=image_a.device)

the required VRAM explodes (20 GB instead of 2 GB) once

https://github.com/lucidrains/byol-pytorch/blob/0c3ab5409181852f8495ef924dce9186f94d9126/byol_pytorch/byol_pytorch.py#L157

is executed. I can't find an explanation for this behavior :/ It matters to me because the same thing happens in my actual use case, which obviously doesn't involve a random tensor but is trickier to explain.
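
In case it helps, here is a minimal sketch of how I measure the peak allocation. The resnet50 backbone and the BYOL arguments are placeholders copied from the README, not my exact setup:

```python
import torch
from torchvision import models
from byol_pytorch import BYOL

# placeholder backbone and learner, following the repo README
resnet = models.resnet50(weights=None)
learner = BYOL(resnet, image_size=256, hidden_layer='avgpool').cuda()

# the random tensor I substituted for the real images
images = torch.rand(128, 3, 256, 256, device='cuda')

torch.cuda.reset_peak_memory_stats()
loss = learner(images)   # forward pass that eventually reaches the line linked above
loss.backward()
torch.cuda.synchronize()
print(f'peak allocated: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB')
```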

Many thanks in advance :)