Open 6ABCD0 opened 3 years ago
Hmm... Which version of PyTorch are you using? Depending on the version, a memory leak may occur.
My PyTorch version is 1.7.
I used PyTorch 1.3 when I worked on this project. If you can, try running it under pipenv, which will reproduce my environment. I never ran into this issue in over 100 epochs of training.
Thank you~ I will try it again.
@WangduoXie I had the same issue and this fixed it for me: https://discuss.pytorch.org/t/memory-leak-with-wgan-gp-loss/112117
Thanks for your notification~ Best! Wangduo
------------------ Original message ------------------ From: "S-aiueo32/srntt-pytorch" @.>; Sent: Saturday, May 15, 2021, 1:12 AM @.>; @.**@.>; Subject: Re: [S-aiueo32/srntt-pytorch] When you run the train.py.How much memory have you cost? (#15)
Just delete the line "torch.autograd.set_detect_anomaly(True)" in train.py, and then it works.
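If you would rather keep anomaly detection available for debugging than delete it outright, one option is to make it opt-in. The sketch below is only an illustration: the `--detect_anomaly` flag and the `build_parser`/`main` helpers are hypothetical and are not part of this repository's train.py. The only real PyTorch call is `torch.autograd.set_detect_anomaly`, which the PyTorch docs recommend enabling only while debugging.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser with a hypothetical opt-in switch for anomaly detection."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag, not present in the original train.py.
    parser.add_argument(
        "--detect_anomaly",
        action="store_true",
        help="enable torch.autograd anomaly detection (debugging only)",
    )
    return parser


def main(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.detect_anomaly:
        # Imported here only so the sketch stays importable without PyTorch;
        # a real train.py would already import torch at the top of the file.
        import torch

        torch.autograd.set_detect_anomaly(True)
    # ... the rest of the training setup would follow here ...
    return args
```

With this in place, a normal run leaves anomaly detection off, and passing `--detect_anomaly` turns it back on when you need to trace a NaN or a failing backward pass.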
I trained it on a TITAN Xp (12 GB), but it failed with an "out of memory" error. I found that memory usage grows with the number of forward passes. In other words, when I run "python3 train.py --use_weights --netG_pre ./pretrain_model/netG_100.pth --netD_pre ./pretrain_model/netD_100.pth", memory usage is 5 GB at the beginning, but as the code keeps running it increases to 12 GB and finally exceeds 12 GB. @S-aiueo32
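To confirm that memory really grows from iteration to iteration, it can help to log the allocator statistics from inside the training loop. This is a minimal sketch, assuming a CUDA device. The helper names `format_mib` and `gpu_memory_report` are made up for illustration, while `torch.cuda.memory_allocated` and `torch.cuda.memory_reserved` are real PyTorch functions (the latter is available from PyTorch 1.4 on).

```python
try:
    import torch
except ImportError:  # lets the helper be imported on machines without PyTorch
    torch = None


def format_mib(num_bytes: int) -> str:
    """Format a byte count as mebibytes with one decimal place."""
    return f"{num_bytes / 2**20:.1f} MiB"


def gpu_memory_report(step: int):
    """Return a one-line memory summary, or None if CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return None
    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes held by the caching allocator
    return (
        f"step {step}: allocated={format_mib(allocated)}, "
        f"reserved={format_mib(reserved)}"
    )
```

Printing `gpu_memory_report(step)` every few hundred iterations should show whether the allocated figure keeps climbing or levels off after the first few steps.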