spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License

Memory leak in Stream #436

Closed · eagomez2 closed this 1 year ago

eagomez2 commented 1 year ago

Hi,

I am trying to run a neural network inside the Stream context manager to do real-time inference. When I do so, the process memory rapidly increases over time. At first I thought the network was the culprit. However, I wrote a script that does the exact same thing but takes sounddevice out of the equation, and in that scenario the memory does not increase even after thousands of inferences.

Here is the script running model inferences outside sounddevice:

import os
import torch
from time import sleep

# Torch config
torch.set_grad_enabled(False)

# Streaming params
frame_size = 256
sample_rate = 48000

# Buffers
in_buffer = torch.zeros((1, 1, frame_size), dtype=torch.float32)
out_buffer = torch.zeros((1, 1, frame_size), dtype=torch.float32)

# Model
MODEL_FILE = os.path.join("model.pt")
model = torch.jit.load(MODEL_FILE, map_location="cpu")

if __name__ == "__main__":
    idx = 0

    while True:
        print(idx)
        in_buffer = torch.rand_like(in_buffer)
        model_out = model(in_buffer)
        idx += 1
        sleep(0.0025)

And here is the version that includes sounddevice and produces the memory leak. Whenever I run the process I monitor it with the top command:

import os
import sys
import torch
import argparse
import numpy as np
import sounddevice as sd

# Torch config
torch.set_grad_enabled(False)

# Streaming params
frame_size = 256
sample_rate = 48000

# Buffers
in_buffer = torch.zeros((1, 1, frame_size), dtype=torch.float32)
out_buffer = torch.zeros((1, 1, frame_size), dtype=torch.float32)

# Model
MODEL_FILE = os.path.join("model.pt")
model = torch.jit.load(MODEL_FILE, map_location="cpu")

def callback(in_data, out_data, frames, time, status):
    global in_buffer, out_buffer

    if status:
        print(status)

    # Predict output
    model_out = model(torch.from_numpy(in_data.reshape(1, 1, -1)))
    out_data[:] = model_out.cpu().detach().numpy().reshape(-1, 1)

# Command line args
parser = argparse.ArgumentParser(
    description="sounddevice leak tester",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)

# Optional args
parser.add_argument("-i", "--input", type=int, help="Input device")
parser.add_argument("-o", "--output", type=int, help="Output device")
parser.add_argument("-l", "--latency", type=float, default=0.3,
                    help="Desired I/O latency")

args, _ = parser.parse_known_args()

try:
    with sd.Stream(device=(args.input, args.output), samplerate=sample_rate,
                   blocksize=frame_size, dtype=np.float32, channels=1,
                   callback=callback):
        print("Processing input audio...")
        input()

except KeyboardInterrupt:
    sys.exit(0)
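
The -i and -o arguments expect device indices as reported by sounddevice. For reference (not part of the original script), the standard way to list them is:

import sounddevice as sd

# Print the table of available devices with their indices; the desired
# input/output indices can then be passed to the script via -i and -o.
print(sd.query_devices())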

If it helps, here is some further information about the environment where I am testing it:

Operating system: macOS Monterey 12.6
Python version: 3.8.13
sounddevice version: 0.4.5
torch version: 1.12.0
numpy version: 1.22.4

Here you can obtain the model (it is an untrained copy, so it will only produce a distorted version of the input, which is fine): model.pt.zip

Thanks!

mgeier commented 1 year ago

Thanks for reporting this!

It would be great if you could reproduce this without the torch dependency.

There have been some memory-related issues in the past; maybe there is some useful information there: #58, #140 (only Windows?), #158
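
As a rough idea of what such a torch-free test could look like (a sketch only, not something that has been tried in this thread; it reuses the frame size and sample rate from the scripts above), a plain passthrough callback would show whether sounddevice alone is enough to make the memory grow in top:

import sys
import numpy as np
import sounddevice as sd

frame_size = 256
sample_rate = 48000

def callback(in_data, out_data, frames, time, status):
    if status:
        print(status)
    # No model involved: just copy the input block to the output.
    out_data[:] = in_data

try:
    with sd.Stream(samplerate=sample_rate, blocksize=frame_size,
                   dtype=np.float32, channels=1, callback=callback):
        print("Passthrough running, watch the process in top...")
        input()
except KeyboardInterrupt:
    sys.exit(0)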

eagomez2 commented 1 year ago

Hi @mgeier ,

Thanks for the prompt reply. Unfortunately, I am not sure how to reproduce this without torch as a dependency. That is why I initially thought the issue was the model itself, until I tested it separately.

I looked at the issues you mentioned but couldn't find anything that solves this. The only additional detail I can think of is that the model's sound output is as expected, without any dropouts, even as the memory usage increases. Please let me know if there are any additional tests I could try in order to find out further details.

mgeier commented 1 year ago

I don't really have experience with debugging memory problems, but quite some time ago I read about https://pympler.readthedocs.io/. I've never used it, but it sounds like it could help?
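
For example (a sketch only, assuming pympler is installed; nothing here has been tried on the script above), a SummaryTracker can periodically print which Python object types have grown:

import time
from pympler import tracker

tr = tracker.SummaryTracker()  # baseline summary of live Python objects

while True:
    time.sleep(5)
    # Print object types whose count/size grew since the previous call;
    # this could run in the main thread while the stream callback is active.
    tr.print_diff()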

mgeier commented 1 year ago

Here are a few more random links about memory profiling; I haven't tried any of these:
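
One stdlib option in the same spirit (again just a sketch, not something tried here) is tracemalloc, which compares snapshots and reports the source lines whose Python-level allocations grew; note that it would not see allocations made inside PortAudio's or torch's C/C++ code:

import time
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

while True:
    time.sleep(10)
    snapshot = tracemalloc.take_snapshot()
    # Show the source lines whose allocations grew the most since the baseline.
    for stat in snapshot.compare_to(baseline, "lineno")[:10]:
        print(stat)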

eagomez2 commented 1 year ago

Hi @mgeier ,

Thanks a lot! I haven't had time to check what you sent yet, but I'll keep you posted once I have been able to run some tests.

eagomez2 commented 1 year ago

Hi @mgeier ,

I was spending some time setting up pympler, and when I was about to try it, I realised that there is no leak anymore. I am not sure what fixed it, but I am now getting deprecation warnings from pytorch that I don't remember having before. I know I have updated some modules during these days, so I guess some dependency update actually fixed it, but unfortunately I have no scientific proof to track down the origin of the original leak.

Here are the libraries that have been changed compared to the first post:

Operating system: macOS Monterey 12.6 (same)
Python version: 3.8.13 (same)
sounddevice version: 0.4.5 (same)
torch version: 1.12.1 (updated)
numpy version: 1.22.0 (downgraded)

I think this issue can be considered solved for now, but I'll keep you posted if by any chance I find something new about it.

mgeier commented 1 year ago

Thanks for the update!

I'll close this for now, but we can re-open it whenever new information comes to light.

mgeier commented 1 year ago

For future reference, another memory profiler: https://github.com/pythonspeed/filprofiler