ajhamdi / MVTN

PyTorch implementation of the ICCV'21 paper "MVTN: Multi-View Transformation Network for 3D Shape Recognition"

Memory leak #2

Closed: whu-lee closed this issue 2 years ago

whu-lee commented 3 years ago

Training is terminated with the message "killed" during the second epoch.

ajhamdi commented 3 years ago

@whu-lee Could you please explain in more detail? What in the code is causing this? What are the likely suspects?

whu-lee commented 3 years ago

I'm sorry, I don't have a clear description of the problem.

During the second epoch of training, the program is interrupted without explanation and the terminal displays "killed". I checked the memory and found that it had been completely used up, so I suspect a memory leak is the cause.

It may be caused by the way the loss is accumulated during training. I changed "total_loss += loss" to "total_loss += loss.item()", and the memory leak has not reappeared so far.

By the way, I also added "torch.cuda.empty_cache()" to deal with the excessive CUDA memory usage.
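
For anyone hitting the same symptom, here is a minimal sketch of the change described above. It is not the actual MVTN training loop: the model, the toy batches, and the names `leaky_total` and `safe_total` are hypothetical stand-ins. It only illustrates that accumulating the loss tensor keeps a growing autograd history attached to the running total, while `loss.item()` accumulates a plain Python float.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical tiny model and data, standing in for the real network and loader.
model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(5)]

leaky_total = 0.0  # becomes a tensor that references every step's autograd history
safe_total = 0.0   # stays a Python float

for inputs, targets in batches:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    leaky_total += loss        # original pattern: total_loss += loss
    safe_total += loss.item()  # suggested pattern: total_loss += loss.item()

# Optional: return cached, unused GPU blocks to the driver (guarded for CPU-only runs).
if torch.cuda.is_available():
    torch.cuda.empty_cache()

print(type(leaky_total).__name__, leaky_total.grad_fn is not None)
print(type(safe_total).__name__)
```

Note that `torch.cuda.empty_cache()` only releases memory the caching allocator holds but is not using; it does not free tensors that are still referenced, so on its own it would not be expected to fix a leak like the one above.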

ajhamdi commented 3 years ago

@whu-lee I updated the code as per your recommendations. Please let me know if the problem persists.