JiahuiYu / slimmable_networks

Slimmable Networks, AutoSlim, and Beyond, ICLR 2019, and ICCV 2019

Memory Leak when Training US-Net #20

Closed VectorYoung closed 5 years ago

VectorYoung commented 5 years ago

Hi @JiahuiYu , I am trying to train US-MobileNet, but with the width range scaled up to [1.0, 2.0]. However, I get a 'CUDA out of memory' error. During training, the memory usage varies between 1000 MB and 11000 MB, and after some iterations it suddenly fails with 'CUDA out of memory'. I get the same issue when training US-ResNet. But it is fine with US-MobileNet_[0.25, 1].

One weird thing is that when I fix the 4 widths (e.g. [1.0, 1.5, 1.7, 2.0], just like Slimmable Networks), I don't have the memory issue, and the memory usage stays fixed at about 4200 MB.

I am guessing that some tensors or graphs are not being freed, but I don't know how to debug it. Have you seen the same issue?
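For context, universally slimmable training samples random widths each iteration (the sandwich rule: smallest width, largest width, plus random intermediate widths), so activation memory naturally differs from step to step. A minimal sketch of that sampling, with illustrative width values (the exact range and count here are assumptions, not the repo's actual configuration):

```python
import random

# Assumed width range matching the issue's setting; the repo's defaults differ.
WIDTH_MIN, WIDTH_MAX = 1.0, 2.0

def sample_widths(n_random=2):
    """Sandwich rule: always train the min and max widths, plus
    a few randomly sampled intermediate widths each iteration."""
    widths = [WIDTH_MIN, WIDTH_MAX]
    widths += [round(random.uniform(WIDTH_MIN, WIDTH_MAX), 2)
               for _ in range(n_random)]
    return widths

# Each training iteration runs forward/backward once per sampled width,
# so peak memory depends on which widths were drawn that step.
print(sample_widths())
```

With fixed widths (as in Slimmable Networks) the per-iteration memory footprint is constant, which matches the stable 4200 MB observed above.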

JiahuiYu commented 5 years ago

@VectorYoung Thanks for your interest!

The width 2.0 is probably too large, since memory grows quadratically as the width increases.
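The quadratic growth follows because a convolution's parameter (and activation-related) cost scales with both its input and output channel counts, and a width multiplier scales both. A back-of-the-envelope check with hypothetical channel counts:

```python
def conv_params(in_ch, out_ch, k=3):
    """Parameter count of a plain k x k convolution (bias ignored)."""
    return in_ch * out_ch * k * k

# Hypothetical base layer: 32 -> 64 channels.
base_in, base_out = 32, 64
for w in (0.25, 1.0, 2.0):
    p = conv_params(int(base_in * w), int(base_out * w))
    print(f"width {w}: {p} params")

# Doubling the width doubles both channel counts, so cost grows 4x,
# which is why width 2.0 is far heavier than width 1.0.
```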

The second behavior is interesting. Although I have not seen this issue in my own experiments, I suspect it may be due to PyTorch's memory management implementation. I don't think there is much room for us to debug that. But if you find a solution, I would appreciate it if you could post it here, since I am also interested in this question.

Sorry that I could not be of more help with these issues.