[Open] lyricgoal opened this issue 3 years ago
In models/networks.py:

    energy = torch.bmm(proj_query.permute, proj_key)
    RuntimeError: CUDA out of memory. Tried to allocate 268.21 GiB (GPU 4; 10.92 GiB total capacity; 1.80 GiB already allocated; 8.52 GiB free; 47.59 MiB cached)

Could you please give me some advice?

Your batch size is too large. Try reducing it to 10, for example. I am also trying to resolve a similar issue at the moment.

Self-attention is notoriously memory hungry. Try reducing the batch size, or applying the self-attention layers more selectively.

The fact that you are attempting to allocate 268 GiB suggests that you are perhaps using 3D inputs. In that case, plain self-attention simply will not work; you will have to look into implementations with linear complexity.

I had the same issue; changing the batch size to 8 worked for me.
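For context, an allocation of this size is consistent with a quadratic attention map over a 3D input. The `energy` tensor produced by `torch.bmm(proj_query, proj_key)` has shape (B, N, N), where N is the number of spatial positions, and for volumetric data N grows as depth × height × width. A back-of-the-envelope sketch (the 64×64×64 volume is an illustrative assumption, not the poster's actual input size):

```python
# Estimate the memory needed by the attention "energy" matrix alone.
# energy = torch.bmm(proj_query, proj_key) has shape (B, N, N), where
# N is the number of spatial positions. For 3D inputs N multiplies
# across depth, height, and width, so the N x N map explodes.

def attention_energy_gib(batch, spatial_dims, bytes_per_el=4):
    """Memory (GiB) of one (B, N, N) float32 attention map."""
    n = 1
    for d in spatial_dims:
        n *= d  # N = product of spatial dimensions
    return batch * n * n * bytes_per_el / 2**30

# 2D feature map, 64x64: N = 4096 -> manageable
print(attention_energy_gib(1, (64, 64)))      # 0.0625 GiB

# 3D volume, 64x64x64: N = 262144 -> the map alone needs 256 GiB,
# the same order of magnitude as the 268.21 GiB in the error above
print(attention_energy_gib(1, (64, 64, 64)))  # 256.0 GiB
```

This is why reducing the batch size helps only up to a point: even at batch size 1, a quadratic attention map over a volumetric feature grid can exceed any single GPU, which is what motivates the linear-complexity attention suggestion above.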