`with torch.no_grad():` means gradients will not be saved, and without gradients you definitely use less memory. In training mode you have to keep all the gradients, and this model really does use a lot of memory: currently more than 9 GB with `batch_size` 4. If you have less memory than that, please use a batch size of 1 or 2.
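To make the difference concrete, here is a minimal sketch; `net` and `images` are hypothetical stand-ins for the tracker network and an image batch, not the actual objects from this repo:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

net = torch.nn.Linear(128, 10).to(device)     # stand-in model
images = torch.randn(4, 128, device=device)   # stand-in batch, batch_size 4

net.train()
logits = net(images)       # training: autograd records the graph -> more memory

net.eval()
with torch.no_grad():      # inference: no graph is recorded -> much less memory
    logits = net(images)
```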
Since PyTorch 0.4 (so in 1.0 and later), as far as I know, `Variable` is the same as `Tensor`; the wrapper is deprecated. I have no idea about this particular error.
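A quick check of that claim (this is standard PyTorch behavior since 0.4, not anything specific to this repo):

```python
import torch
from torch.autograd import Variable   # deprecated wrapper

x = torch.randn(3)
v = Variable(x)                        # since 0.4 this just returns a Tensor
print(type(v))                         # <class 'torch.Tensor'>

# The modern replacement is a flag on the tensor itself:
w = torch.randn(3, requires_grad=True)
```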
A smaller batch size will resolve most out-of-memory cases.
`batch_size` 1 still reports a memory error, maybe because my 4 GB of GPU memory is too small. Anyway, thank you very much for your reply!
Hi, if I use the code at line 236 of ./trainer.py,

```python
response_map = F.interpolate(response_map.unsqueeze(0), size=[resize, resize])
```

it reports a CUDA out-of-memory error. When I change it to

```python
before_upsample = Variable(response_map.unsqueeze(0))
response_map = F.upsample(before_upsample, size=[resize, resize])
response_map = response_map.data.squeeze()
```

and

```python
def train():
    net.train()
    with torch.no_grad():
        ...
```

it is OK, but I don't know if it's right. Also, at lines 125 and 185 of ./trainer.py, should the call be the following?

```python
logits, _, _, _ = net(images)
```
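For reference, the same memory saving is available without the deprecated `Variable` and `F.upsample` APIs: detach the score map from the graph (or run the evaluation step under `torch.no_grad()`) before interpolating. A minimal sketch, assuming `response_map` is a (C, H, W) tensor and `resize` is the target side length as in the snippet above; the shapes here are made up:

```python
import torch
import torch.nn.functional as F

response_map = torch.randn(1, 17, 17)   # hypothetical (C, H, W) score map
resize = 272                            # hypothetical target size

with torch.no_grad():                   # no autograd graph is recorded here
    upsampled = F.interpolate(response_map.detach().unsqueeze(0),
                              size=[resize, resize])
response_map = upsampled.squeeze(0)     # back to (C, resize, resize)
```

Note that `.detach()` / `torch.no_grad()` only make sense for evaluation; wrapping the actual training pass in `torch.no_grad()` would prevent the network from learning.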