fspegni closed this issue 3 years ago
Trying to do some profiling of the memory usage (also following the suggestions in https://stackoverflow.com/questions/59129812/how-to-avoid-cuda-out-of-memory-in-pytorch), I put some logging in the following functions:

- `model/mlp.py`, function `forward`
- `main.py`, function `eval`, outer loop (`for ln in loader_name: ...`)
- `main.py`, function `eval`, inner loop (`for batch_idx ... in enumerate(process): ...`)

Here is the log file: ms-g3d.log
The last invocation of the `forward` function is on the following layer:
DEBUG:root:Layer: BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
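The logging I added was roughly of the following kind. This is only a simplified sketch under my own assumptions (the helper name, the MiB conversion, and the `InstrumentedBlock` class are illustrative, not the actual code in `model/mlp.py`); it relies on the standard `torch.cuda.memory_allocated` / `torch.cuda.max_memory_allocated` counters:

```python
import logging
import torch
import torch.nn as nn

logging.basicConfig(level=logging.DEBUG)

def log_layer_memory(layer):
    """Hypothetical helper: log the layer about to run plus current/peak CUDA memory (MiB)."""
    allocated = torch.cuda.memory_allocated() / 2**20 if torch.cuda.is_available() else 0.0
    peak = torch.cuda.max_memory_allocated() / 2**20 if torch.cuda.is_available() else 0.0
    logging.debug("Layer: %s (allocated: %.1f MiB, peak: %.1f MiB)", layer, allocated, peak)

class InstrumentedBlock(nn.Module):
    """Stand-in for a block like the one in model/mlp.py, instrumented in forward()."""
    def __init__(self, channels=96):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        # Produces lines like "DEBUG:root:Layer: BatchNorm2d(96, eps=1e-05, ...)"
        log_layer_memory(self.bn)
        return self.bn(x)
```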
In the initial ticket I forgot to mention the relevant parameters of my system:
Hi @fspegni,
Thanks a lot for your interest! If I recall correctly, 4GB of memory should be more than enough for just testing the pretrained models. One quick thing you can try is to reduce `test_batch_size` in the corresponding config files (only training should be sensitive to the batch size).
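To illustrate why reducing the test batch size is safe: assuming the usual PyTorch evaluation pattern under `torch.no_grad()`, a smaller batch only lowers the peak GPU memory and does not change the predictions. A minimal sketch (placeholder names such as `model` and `test_dataset`, not the actual code in `main.py`):

```python
import torch
from torch.utils.data import DataLoader

def evaluate(model, test_dataset, test_batch_size=16, device="cuda"):
    """Run inference batch by batch; batch size affects memory use, not the results."""
    loader = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=False)
    model.eval()
    predictions = []
    with torch.no_grad():  # no autograd buffers, so memory scales only with the batch size
        for batch, _label in loader:
            output = model(batch.to(device))
            predictions.append(output.argmax(dim=1).cpu())
    return torch.cat(predictions)
```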
Thanks, I was able to accomplish this by adding `--test-batch-size N` (with N=2 or N=16, on two different platforms with different GPUs) when invoking the `main.py` script inside `eval_pretrained.sh`.
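For anyone hitting the same issue: I have not checked the exact argument handling in `main.py`, but the sketch below shows the generic pattern by which such a command-line flag typically overrides the value from a YAML config file (the config path and defaults are placeholders):

```python
import argparse
import yaml

def get_parser():
    """Minimal parser sketch: a --test-batch-size flag that can override the config file."""
    parser = argparse.ArgumentParser(description="evaluation (sketch)")
    parser.add_argument("--config", default="./config/test.yaml")  # placeholder path
    parser.add_argument("--test-batch-size", type=int, default=None)
    return parser

def load_settings():
    args = get_parser().parse_args()
    with open(args.config) as f:
        settings = yaml.safe_load(f)  # e.g. {"test_batch_size": 256, ...}
    # A value given on the command line wins over the one in the config file.
    if args.test_batch_size is not None:
        settings["test_batch_size"] = args.test_batch_size
    return settings
```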
Since I was able to run the tests by adjusting the parameter, I am closing the issue myself. Thanks for the help.
Hi,
first of all, thanks for sharing your project. I'm trying to replicate your steps, but I get stuck due to low available GPU memory (I have about 4GB of memory on my GPU, is that not enough?).
I followed the instructions but get an error at the `bash eval_pretrained.sh` step, apparently because torch is trying to allocate too much memory at once (it asks for a chunk of >800MB, see the log below). Is there any way to bypass this problem?