When I try to run `energy = torch.bmm(proj_query, proj_key)`, the program fails with `RuntimeError: CUDA out of memory`. My graphics card has 12 GB of memory, and I am looking for a way to reduce the size of the intermediate variable `energy`, which in my case is 1 x 65536 x 65536. I have already used `torch.no_grad()`, and I have tried splitting the intermediate matrix into smaller sub-matrices and using `del` to release the memory, but that doesn't seem to work. Could you show me some tips for this kind of problem? (My batch size is 1, and the input size is 256 x 256.)
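For reference, here is a minimal sketch of the chunking I attempted (the function and variable names are just illustrative, and the softmax/value step is the standard self-attention pattern): instead of materializing the full N x N `energy` matrix, I compute one block of query rows at a time, apply the softmax, and multiply by the values before moving to the next block.

```python
import torch

def chunked_attention(proj_query, proj_key, proj_value, chunk_size=1024):
    """Compute softmax(Q @ K) @ V one block of query rows at a time,
    so the full N x N energy matrix is never held in memory at once.

    proj_query: (B, N, C), proj_key: (B, C, N), proj_value: (B, N, C)
    """
    outputs = []
    for start in range(0, proj_query.shape[1], chunk_size):
        q_chunk = proj_query[:, start:start + chunk_size]  # (B, chunk, C)
        energy = torch.bmm(q_chunk, proj_key)              # (B, chunk, N)
        attn = torch.softmax(energy, dim=-1)
        outputs.append(torch.bmm(attn, proj_value))        # (B, chunk, C)
        del energy, attn                                   # drop the chunk's intermediates
    return torch.cat(outputs, dim=1)                       # (B, N, C)
```

With `chunk_size=1024` each `energy` chunk is 1 x 1024 x 65536 instead of 1 x 65536 x 65536, but the memory still does not fit in my case, so I suspect I am missing something.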