Stonesjtu / pytorch_memlab

Profiling and inspecting memory in pytorch
MIT License

Question about Used Memory and GPU memory #44

Closed · lfangyu09 closed this issue 2 years ago

lfangyu09 commented 2 years ago

Hi,

Thanks a lot for providing this very helpful library. I have a question about Used Memory versus GPU memory. Following your code, I measured the Used Memory of my model for one batch (size: (16, 3, 224, 224)) as 928.02M. But the same code for the same model cannot run on a 2070 Super GPU (8 GiB capacity): 928.02M vs 8 GiB. What is the difference between the Used Memory reported by your code and the GPU memory? Thanks.

Here are the running results. [screenshots of the memory reports]

Stonesjtu commented 2 years ago

Are you printing the memory usage after backward (BP)?

Used Memory sums the storage of all the PyTorch tensors that are still alive in Python language space (i.e. reachable from Python objects).
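A minimal sketch of what "tensors in Python language space" means: walk the Python garbage collector's objects, find live tensors, and sum their storage. This is an illustration of the idea, not the library's actual implementation, and it assumes PyTorch >= 2.0 for `untyped_storage()`:

```python
import gc
import torch

def tensor_bytes_in_python_space():
    """Sum the storage of every live torch.Tensor reachable from Python,
    roughly the quantity reported as "Used Memory"."""
    seen, total = set(), 0
    for obj in gc.get_objects():
        if torch.is_tensor(obj):
            storage = obj.untyped_storage()
            if storage.data_ptr() in seen:
                continue  # views share storage; count each buffer once
            seen.add(storage.data_ptr())
            total += storage.nbytes()
    return total

x = torch.ones(1024, 1024)   # 4 MiB of float32 on CPU
view = x[:512]               # a view: shares x's storage, adds no new bytes
print(tensor_bytes_in_python_space() / 2**20)  # >= 4.0 (MiB)
```

Anything PyTorch holds outside of live tensors (the CUDA context, the caching allocator's free blocks) is invisible to a walk like this, which is why it can report far less than the GPU actually has in use.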

lfangyu09 commented 2 years ago

I compared the Used Memory reported by your code against GPUtil for the same model. Here are the results: the Used Memory is 409M, while the GPU memory reported by GPUtil is 6486 MiB. Is the Used Memory from your code not the GPU memory? I am also confused by "all the PyTorch Tensors in Python language space." Is there a tutorial that explains it in detail? Thanks a lot!

[screenshots of the comparison]

Stonesjtu commented 2 years ago

You can place the report() call before backward(). Also note that GPUtil reports the memory PyTorch has requested from the driver (including what the caching allocator holds in reserve), while memory_reporter reports only the memory actually occupied by live tensors.
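To illustrate why the timing of the report matters, here is a small CPU-only sketch with a hypothetical toy layer (the sizes are illustrative, not the OP's model). Before backward() no gradient buffers exist; after it, every parameter gains a same-sized .grad tensor, so the two snapshots differ:

```python
import torch

model = torch.nn.Linear(1024, 1024)  # ~4 MiB of float32 weights
loss = model(torch.randn(8, 1024)).sum()

def param_grad_mib(m):
    # Bytes held by parameters plus their .grad buffers, in MiB.
    total = 0
    for p in m.parameters():
        total += p.numel() * p.element_size()
        if p.grad is not None:
            total += p.grad.numel() * p.grad.element_size()
    return total / 2**20

print(param_grad_mib(model))  # ~4 MiB: no gradients allocated yet
loss.backward()
print(param_grad_mib(model))  # ~8 MiB: .grad buffers now exist

# On a GPU, the remaining gap the OP saw comes from the caching allocator:
# torch.cuda.memory_allocated() -> bytes in live tensors (memory_reporter's view)
# torch.cuda.memory_reserved()  -> bytes held from the driver (what GPUtil sees)
```

Saved activations behave the same way in reverse: they are alive before backward() and freed by it, so a single snapshot can miss a large share of the peak usage.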