Closed by nmichlo 3 years ago
Thank you for reporting this! I am able to reproduce this behavior exactly, and I am working on debugging this.
I should also note that the only options for regularization are `"l2"` and `"kl"`. I will change the code to only allow these values as well.
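As a sketch, a guard like the following would enforce that (the `_check_regularization` helper is hypothetical, not torchsort's actual code):

```python
def _check_regularization(regularization: str) -> None:
    # Hypothetical guard: reject anything other than the two
    # supported regularization schemes up front.
    if regularization not in ("l2", "kl"):
        raise ValueError(
            f'regularization must be "l2" or "kl", got {regularization!r}'
        )
```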
I believe I have fixed the issue. If you could verify locally by installing torchsort with:

`pip install "torchsort>=0.1.3"`

it would be much appreciated. I am no longer encountering the issue on my hardware. If I don't hear back soon I will release the next version regardless. Thank you again for providing a detailed account of the issue and a minimal example! 😄
For future reference (to anyone out there): the leak was caused by storing a tensor directly on the `ctx` object in a `torch.autograd.Function`. Using `ctx.save_for_backward()` instead will properly free the memory when it is no longer needed.
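To illustrate the two patterns, here is a minimal sketch of a custom autograd Function; the `SquareFn` name and body are hypothetical and not taken from torchsort:

```python
import torch

class SquareFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Leaky pattern: a plain attribute keeps the tensor alive
        # indefinitely, because autograd cannot track or release it:
        #   ctx.x = x
        # Correct pattern: tensors registered here are freed along
        # with the rest of the graph once backward has consumed them.
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output

x = torch.randn(4, requires_grad=True)
SquareFn.apply(x).sum().backward()  # x.grad == 2 * x
```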
I have tested it locally and it is now working on my system. 🎉
Thank you for your prompt response, fixes and explanation!
Computing the `soft_rank` over a CUDA tensor that requires gradients results in a memory leak when a regularisation other than `l2` is chosen. However, under the same conditions `soft_sort` seems to work correctly.
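A repro along these lines might look like the following sketch (not the reporter's original script; the tensor shape, iteration count, and memory check are illustrative choices):

```python
import torch
import torchsort

x = torch.randn(128, 100, device="cuda", requires_grad=True)

for step in range(1000):
    # "kl" regularisation triggered the leak; "l2" (and soft_sort
    # under the same conditions) behaved correctly.
    y = torchsort.soft_rank(x, regularization="kl")
    y.sum().backward()
    x.grad = None
    if step % 100 == 0:
        # Before the fix, allocated CUDA memory grows without bound.
        print(f"step {step}: {torch.cuda.memory_allocated()} bytes")
```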