KidsWithTokens / MedSegDiff

Medical Image Segmentation with Diffusion Model

solve DPM-Solver OOM issue #158

Closed lin-tianyu closed 4 months ago

lin-tianyu commented 4 months ago

Problem Description

Many MedSegDiff users have run into this problem, e.g. https://github.com/KidsWithTokens/MedSegDiff/issues/49 and https://github.com/KidsWithTokens/MedSegDiff/issues/157: when DPM-Solver is used for sampling, every sample adds roughly 2 GB of GPU memory that is never released, eventually ending in a CUDA Out Of Memory error.

Previous Solution

I previously solved this problem by downgrading PyTorch to 1.8.1. However, after some untraceable changes in my Python environment, the issue has come back, and PyTorch 1.8.1 no longer helps.

Problem Solved

After debugging, I realized that some CUDA tensors are not being released from GPU memory. Surprisingly, adding a single line right after DPM-Solver sampling to force tensor detachment solved the problem.
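
For reference, the change is roughly along these lines; the function and variable names below are placeholders I'm using for illustration, not the exact ones in the repository's sampling script:

```python
import torch

def sample_with_cleanup(dpm_solver, x_T, steps=20):
    """Run one DPM-Solver sampling pass and make sure it releases GPU memory.

    `dpm_solver` and `x_T` are placeholders for the solver object and the
    initial noise tensor used in MedSegDiff's sampling loop.
    """
    sample = dpm_solver.sample(x_T, steps=steps)
    # Force detachment: cut any references to the autograd graph and move the
    # result off the GPU so the intermediate tensors can be freed.
    sample = sample.detach().cpu()
    # Optionally return cached blocks to the driver between samples.
    torch.cuda.empty_cache()
    return sample
```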

Since this issue might have troubled a lot of people, I am creating this pull request. Hope it helps.

WuJunde commented 4 months ago

Thank you for your significant contribution, Tianyu. Although I haven't encountered this issue myself, I've seen that it has troubled many users. It appears that in certain PyTorch versions the samples are processed with gradient tracking enabled, which causes the problem, so wrapping the sampling in `torch.no_grad()` or calling `model.eval()` may also resolve it. In any case, I will merge this fix and ask users who have experienced the problem to test it. Once again, thank you for your valuable input.
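
For anyone who wants to try that route, a sketch of the idea is below; `model`, `image`, and `run_dpm_solver_sampling` are placeholders for the actual objects and functions in the sampling script:

```python
import torch

def sample_without_gradients(model, image, run_dpm_solver_sampling):
    """Hypothetical wrapper: disable gradient tracking for the whole sampling pass.

    `run_dpm_solver_sampling` stands in for whatever function in the
    sampling script actually invokes DPM-Solver.
    """
    model.eval()               # inference mode for dropout / normalization layers
    with torch.no_grad():      # no autograd graph is built, so nothing pins GPU memory
        return run_dpm_solver_sampling(model, image)
```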