tmm1 opened this issue 1 year ago
Interesting. Thank you for taking the time to share your knowledge. I'm a heavy torch user, so this would be useful for me. I don't think this repo is the place to implement it, but a fork of this repo certainly is.
I'm wondering if there is a C-level mechanism that could be used in hpTimer. It looks like the torch.cuda.memory_allocated call boils down to a call to torch._C._cuda_memoryStats, which is defined here:
https://github.com/pytorch/pytorch/blob/8c10be28a10e6f0ab69593a1b791d61b10f66ce5/torch/csrc/cuda/Module.cpp#L1411
There is probably a way to hook into libtorch to get that info more efficiently.
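For context, here is a quick check of how those Python wrappers relate (this assumes a CUDA-capable build; the stats key below is the one memory_allocated reads, if I'm reading the wrapper right):

```python
import torch

# torch.cuda.memory_allocated() is a thin Python wrapper: it calls
# torch.cuda.memory_stats(), which flattens the counters returned by
# torch._C._cuda_memoryStats into a flat dict of named stats.
stats = torch.cuda.memory_stats()
allocated = stats["allocated_bytes.all.current"]
assert allocated == torch.cuda.memory_allocated()
print(f"{allocated} bytes currently allocated by the caching allocator")
```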
I got this working: https://github.com/tmm1/line_profiler/compare/torch-memprof
EDIT: Notebook showing example usage: https://gist.github.com/tmm1/212d6141887890af41f1fdf5c73282f2
I have been exploring VRAM usage of PyTorch code, and wanted to share my experience.
I searched for previous discussions related to memory profiling and customized stats collection, and found #188 and #216.
At first I experimented a bit with pytorch_memlab's LineProfiler. However, that tool is geared more towards peak memory usage than towards annotating which line is responsible for the growth, so I started exploring how line_profiler's machinery could be extended or re-used instead. It turns out to be pretty simple:
The results show you where memory is allocated and released:
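For illustration, here is what falls out of the sketch above for a toy function (the function and tensor sizes are invented for the example; a negative delta marks a line that released memory):

```python
def f():
    a = torch.ones(1024, 1024, device="cuda")  # ~4 MiB of float32
    b = torch.ones(1024, 1024, device="cuda")  # ~4 MiB more
    del a                                       # frees the first block
    return b

_, samples = profile_lines(f)
# Each line event fires before the line executes, so the delta between
# consecutive samples is the memory effect of the earlier line.
for (lineno, before), (_, after) in zip(samples, samples[1:]):
    print(f"line {lineno}: {(after - before) / 2**20:+.1f} MiB")
```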