deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
https://www.manning.com/books/deep-learning-with-pytorch

Memory leak in asyncio operation #26

Open · govindamagrawal opened this issue 4 years ago

govindamagrawal commented 4 years ago

Hi, in p3ch15/request_batching_server.py there is excellent code for asyncio, but I am getting a memory leak there. The GPU memory increases steadily, and the rate of increase is proportional to the number of inputs the server receives. Can you suggest a way to debug this? Thanks in advance.
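
[Editor's note] One straightforward way to confirm and localize such a leak is to log PyTorch's CUDA allocator counters around each handled request. This is a generic diagnostic sketch, not part of the book's server code; the helper name is made up for illustration:

  import torch

  def log_cuda_memory(tag=""):
      # Hypothetical diagnostic helper: print how much CUDA memory this
      # process currently holds, in MiB.
      allocated = torch.cuda.memory_allocated() / 2**20
      reserved = torch.cuda.memory_reserved() / 2**20
      print(f"[{tag}] allocated {allocated:.1f} MiB / reserved {reserved:.1f} MiB")

  # Call it once per request, e.g. before and after the model runs.
  # If "allocated" keeps climbing across requests, something is still
  # holding references to tensors (queue entries, task dicts, stored outputs).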

t-vi commented 4 years ago

Hi,

oh oh. This should not happen, thank you for reporting!

What I would do as a first step is to look at process_input, specifically at the end https://github.com/deep-learning-with-pytorch/dlwpt-code/blob/323de27e517c279ae69318d9ea0a7e6f416701ba/p3ch15/request_batching_jit_server.py#L59

Do the following:

  output = our_task["output"]
  for k in list(our_task.keys()):  # copy the keys so we can delete while iterating
      del our_task[k]
  return output

(or somesuch). Most likely, there is a circular reference somewhere, and this would help break it. You could also call gc.collect() now and then, but the first thing I'd try is to empty the task dictionary to avoid references that keep objects from being deallocated.
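
[Editor's note] To make the circular-reference point concrete, here is a minimal, self-contained illustration (not taken from the server code) of why a cycle can keep a tensor alive until the cyclic garbage collector runs:

  import gc
  import torch

  # A stand-in for a task dict that ends up referring to itself (for
  # example via a stored callback or future that also points back at it).
  task = {"output": torch.zeros(1000, 1000)}
  task["self"] = task

  del task             # the name is gone, but the cycle keeps dict and tensor alive
  print(gc.collect())  # the cycle collector finds and frees them

  # Emptying the dict first (as suggested above) breaks the cycle, so plain
  # reference counting releases the tensor immediately, without a gc pass.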

Can you please check if that improves things for you?

Best regards

Thomas

govindamagrawal commented 4 years ago

@t-vi thanks for replying. Actually, I found out that it was due to the large queue size I had configured; as a result, the queue was holding on to all those tensors and GPU memory was filling up. Thanks again.
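
[Editor's note] The fix described here amounts to bounding how many pending inputs (and their tensors) may wait in the queue at once. The sketch below shows that idea with a plain asyncio.Queue; the names and the limit are illustrative, and the book's server implements its own queue handling:

  import asyncio

  MAX_QUEUE_SIZE = 8  # illustrative bound; keeps only a few input tensors alive at once
  queue = asyncio.Queue(maxsize=MAX_QUEUE_SIZE)

  async def enqueue_request(payload):
      # Reject work instead of buffering it without bound: put_nowait raises
      # asyncio.QueueFull once maxsize items are pending, so the caller can
      # return a "server busy" error rather than letting tensors pile up.
      try:
          queue.put_nowait(payload)
      except asyncio.QueueFull:
          raise RuntimeError("server busy, try again later")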

t-vi commented 4 years ago

:sweat_smile: So is the leak gone now, or just not as bad?