lucidrains / byol-pytorch

Usable implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in PyTorch

GPU Memory Usage Extremely High #94

Closed ClemensSchwarke closed 2 months ago

ClemensSchwarke commented 2 months ago

Hi, I am not an expert in PyTorch and would appreciate some help understanding my VRAM utilization. When I exchange

https://github.com/lucidrains/byol-pytorch/blob/0c3ab5409181852f8495ef924dce9186f94d9126/byol_pytorch/byol_pytorch.py#L265

with

images = torch.rand(128, 3, 256, 256, device=image_a.device)

the required VRAM explodes (20 GB instead of 2 GB) once

https://github.com/lucidrains/byol-pytorch/blob/0c3ab5409181852f8495ef924dce9186f94d9126/byol_pytorch/byol_pytorch.py#L157

is executed. I can't find an explanation for this behavior :/ It matters to me because the same thing happens in my actual use case, which obviously doesn't involve a random tensor but is trickier to explain.
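
In case it helps, here is a minimal sketch of how I measure the peak allocation. The resnet50 backbone and the BYOL arguments are placeholders copied from the README, not my exact setup:

```python
import torch
from torchvision import models
from byol_pytorch import BYOL

# placeholder backbone and learner, following the repo README
resnet = models.resnet50(weights=None)
learner = BYOL(resnet, image_size=256, hidden_layer='avgpool').cuda()

# the random tensor I substituted for the real images
images = torch.rand(128, 3, 256, 256, device='cuda')

torch.cuda.reset_peak_memory_stats()
loss = learner(images)   # forward pass that eventually reaches the line linked above
loss.backward()
torch.cuda.synchronize()
print(f'peak allocated: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB')
```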

Many thanks in advance :)