spatialaudio / python-sounddevice

:sound: Play and Record Sound with Python :snake:
https://python-sounddevice.readthedocs.io/
MIT License
982 stars 145 forks source link

Process finished with exit code -1073741819 (0xC0000005) #350

Open comes04 opened 3 years ago

comes04 commented 3 years ago

Hi there,

I'm using the following code to record audio from 2 different devices:

import sys
from os.path import join
from queue import Queue
from threading import Thread
import time

from sounddevice import InputStream
from soundfile import SoundFile

import numpy  # Make sure NumPy is loaded before it is used in the callback

assert numpy  # avoid "imported but unused" message (W0611)

def file_writing_thread(*, q, **soundfile_args):
    with SoundFile(**soundfile_args) as f:
        while int(time.time()-t) < dur:
            data = q.get()
            if data is None:
                break
            f.write(data)

def record_device(device_id):
    def audio_callback(data, frames, time, status):
        """This is called (from a separate thread) for each audio block."""
        if status:
            print(status, file=sys.stderr)
        audio_q.put(data.copy())

    file_name = join(rec_dir, f"rec_dev{device_id}_ch{channels}.wav")

    print("#" * 40)
    stream = InputStream(
        samplerate=fs, device=device_id, channels=channels, callback=audio_callback
    )
    stream.start()
    print(f"started stream {stream} ...")

    audio_q = Queue()
    print(f"generated queue {audio_q} ...")

    thread = Thread(
        target=file_writing_thread,
        kwargs=dict(
            file=file_name,
            mode="w",
            samplerate=int(stream.samplerate),
            channels=stream.channels,
            q=audio_q,
        ),
    )
    thread.start()
    print(f"started thread {thread} ...")
    print(f'recording file "{file_name}" ...')
    print("#" * 40)

dur = 60
fs = 44100
channels = 1
rec_dir = "C:/Users/moti/Documents/HOBBY/PLABS 3D/pruebas"
t = time.time()

record_device(device_id=1)
record_device(device_id=2)

what happens is that after a minute or so i get the following on terminal and everything stops:

Process finished with exit code -1073741819 (0xC0000005)

I'm using Windows, and I try to run this same script in other windows with the same result.

¿Could anyone help me with this?

mgeier commented 3 years ago

I'm getting this error when running the code:

NameError: free variable 'audio_q' referenced before assignment in enclosing scope

When I move the definition of audio_q up a few lines, the script works for me (on Linux, I of course also had to change rec_dir).

However, it is a bit strange to leave the threads dangling like that. You should probably take a hold of the thread objects and call .join() on both of them in the end of the script (see https://docs.python.org/3/library/threading.html#threading.Thread.join).

HaHeho commented 3 years ago

I'm getting this error when running the code:

NameError: free variable 'audio_q' referenced before assignment in enclosing scope

When I move the definition of audio_q up a few lines, the script works for me (on Linux, I of course also had to change rec_dir).

I did not have that problem on Windows. Interesting that the behavior seems to be different. But that's a minor detail.

HaHeho commented 3 years ago

@comes04 I've built this. Probably over-engineered, but it seemed to me like the most flexible solution, allowing for an arbitrary number of recording streams (in theory). It also allows for KeyboardInterrupt at any time, while leaving the handling of the desired recording length outside of the audio thread.

Note that the resulting recordings might not be in perfect sync on a sample basis. Also the resulting length is probably not exactly duration * fs (but plus minus a few samples). But I suppose that is not really relevant for your application?

from os.path import join
from queue import Queue
from threading import Thread
from time import sleep

from sounddevice import InputStream
from soundfile import SoundFile

class AudioRecorder(Thread):
    def __init__(self, device_id):
        def callback(data, _, __, status):
            if status:
                print(status)
            self._q.put(data.copy())

        # print(f"Creating recording thread {id(self)}")
        super().__init__(daemon=True)

        self._is_alive = True
        self._q = Queue()
        self._stream = InputStream(
            device=device_id,
            samplerate=fs,
            channels=channels,
            callback=callback,
        )
        self._sf = SoundFile(
            file=join(rec_dir, f"rec_dev{device_id}_ch{channels}.wav"),
            mode="w",
            samplerate=int(self._stream.samplerate),
            channels=self._stream.channels,
        )

    def run(self):
        # method invoked by `Thread.start()`
        print(f'Running recording thread {id(self)} to "{self._sf.name}"...')
        with self._sf, self._stream:
            try:
                while self._is_alive:
                    self._sf.write(self._q.get())
            except KeyboardInterrupt:
                print("\nInterrupted by user.")
        print(f'... ended recording thread {id(self)} to "{self._sf.name}".')

    def join(self, timeout=None):
        # print(f"Joining recording thread {id(self)} ({timeout=}) ... ")
        self._is_alive = False
        super().join(timeout=timeout)

duration = 60  # in s
fs = 44100  # in Hz
channels = 1
rec_dir = ""

# create recording threads
recs = [AudioRecorder(device_id=1), AudioRecorder(device_id=2)]

# start recording
for rec in recs:
    rec.start()

# wait for specified time
print(flush=True)  # for correct output order
try:
    print(f"Recording for {duration:.1f} seconds ...")
    print("... or press Ctrl+C to stop the recording.")
    sleep(duration)  # in ms
except KeyboardInterrupt:
    print("\nInterrupted by user.")

# stop recording
for rec in recs:
    rec.join()

print("DONE.")
exit()
comes04 commented 2 years ago

Hi @HaHeho Thanks for building that, i just try it out and I don't get the -1073741819 (0xC0000005) error any more. Not sure why but i would stick with this. Thanks again for all your help.

mgeier commented 2 years ago

Thanks @HaHeho for this nice example!

The try block in run() is superfluous, though, because the KeyboardInterrupt will not be raised in that thread.

And is there a reason to use daemon=True? I would avoid that unless there is a good reason.

And when I tried it, flush=True didn't make a difference. Are you sure that it is needed?

HaHeho commented 2 years ago

Thanks @HaHeho for this nice example!

YW. I wanted to contribute an example with a class that derives from Thread (or Process would work in a very similar manner), since it provides a straightforward foundation. In this way, any kind of processing or behaviour can be implement in an encapsulated way. And the class instance can be easily controlled from outside.

The try block in run() is superfluous, though, because the KeyboardInterrupt will not be raised in that thread.

True.

And is there a reason to use daemon=True? I would avoid that unless there is a good reason.

I thought it would be good idea, in case the mother script dies for some reason. Why do you think it should be avoided?

And when I tried it, flush=True didn't make a difference. Are you sure that it is needed?

I've put it since logging to the same console from multiple threads can scramble the message order up (obviously). I used to put very brief sleep() times in-between to reliably get a sensible logging behavior. That is very not elegant of course, so I thought to try with flush. But I'm not really sure if I was using it correctly there anyways. What would you suggest for a good practice?

mgeier commented 2 years ago

[...] with a class that derives from Thread [...], since it provides a straightforward foundation.

It works fine, but I don't really like how you are changing the semantics of the join() method.

I think something like stop() would be clearer.

And I think encapsulation would be better in this case, or is there a reason to expose the inherited Thread API?

And it would be even better if the class were a context manager!

And is there a reason to use daemon=True? I would avoid that unless there is a good reason.

I thought it would be good idea, in case the mother script dies for some reason.

Well in that case the thread would be forcibly stopped, probably still trying to access stuff that doesn't exist anymore.

It might work fine in many cases, but I think not relying on that is generally a better idea.

If you want to handle the case of the main thread dying, you should make your class a context manager, then the threads will be correctly terminated even if the main thread throws an exception.

Why do you think it should be avoided?

I don't really have a concrete reason, just an ominous feeling. But a controlled shutdown is in general better than a forced shutdown under dubious circumstances, right?

I remember having problems with detached threads in C++. In Python it's certainly not that bad, but still, I would avoid it unless I have a good reason to use it.

And when I tried it, flush=True didn't make a difference. Are you sure that it is needed?

I've put it since logging to the same console from multiple threads can scramble the message order up (obviously). I used to put very brief sleep() times in-between to reliably get a sensible logging behavior. That is very not elegant of course, so I thought to try with flush. But I'm not really sure if I was using it correctly there anyways. What would you suggest for a good practice?

I'm not sure either. I just wanted to know whether you actually observed scrambled messages or whether you just included it out of a whim.

The problem (or rather, one of the problems) is that the behavior is very likely platform-dependent.

In general, I think it's best just to not use print() from different threads at the same time. For a small code example this doesn't really matter, but in a real application I would probably do any logging via a queue? Or probably use https://docs.python.org/3.8/library/logging.html (but I don't really have experience with that, I don't know how well this works with multiple threads)?

mgeier commented 2 years ago

BTW @HaHeho, do you want to add this example to the official examples?

HaHeho commented 2 years ago

It works fine, but I don't really like how you are changing the semantics of the join() method.

I think something like stop() would be clearer.

And I think encapsulation would be better in this case, or is there a reason to expose the inherited Thread API?

And it would be even better if the class were a context manager!

Yeah, a context manager would be handy as well. And actually more intuitive and safer (as you describe with unexpected exits later). In that case one would not even derive from Thread but implement one inside the class, I suppose? In this way, the Thread API wouldn't be exposed either.

My idea was rather to extend the functionality of Thread (hence the derivation) i.e., to also use it exactly as if it were one (which was natively does not provide to be used in a context). This would've been much clearer however when actually naming the class RecorindThread or such.

Nevertheless, doing both would be quite easy. I did not bother to even implement a stop() method there, since the provided functionality in that example was so simple. When building a proper example, that should be done, of course.

And when I tried it, flush=True didn't make a difference. Are you sure that it is needed?

I'm not sure either. I just wanted to know whether you actually observed scrambled messages or whether you just included it out of a whim.

The problem (or rather, one of the problems) is that the behavior is very likely platform-dependent.

In general, I think it's best just to not use print() from different threads at the same time. For a small code example this doesn't really matter, but in a real application I would probably do any logging via a queue? Or probably use https://docs.python.org/3.8/library/logging.html (but I don't really have experience with that, I don't know how well this works with multiple threads)?

The terminal output isn't seriously scrambled (in a sense that messages would be interleave) ... at least I've never seen that happen. It is only the case that the order of somewhat simultaneous messages is mixed up somewhat randomly, which is expected and not too bad. However, sometimes it gets confused about the line breaks at the end of a messages, which is more annoying actually. No idea why, but this is also what happened in the simple example here (on Windows and MacOS), like

Running recording thread 140250583290976 to "rec_dev0_ch1.wav"...Running recording thread 140250606384512 to "rec_dev4_ch1.wav"...
Recording for 60.0 seconds ...

... or press Ctrl+C to stop the recording.

I have used logging. One can implement quite some fancy things (different verbosity levels, message filtering, a specific log message structure, multiple different logging destinations, etc.). However, one would probably always want to (also) log to the common ancestor "std out". For that there is no specific tools in logging to make this more convenient in any way, AFAIK.

BTW @HaHeho, do you want to add this example to the official examples?

Well that could be useful, don't you think? Of course not in it's current form.

mgeier commented 2 years ago

In that case one would not even derive from Thread but implement one inside the class, I suppose? In this way, the Thread API wouldn't be exposed either.

Yes, that's what I had in mind.

My idea was rather to extend the functionality of Thread (hence the derivation) i.e., to also use it exactly as if it were one (which was natively does not provide to be used in a context). This would've been much clearer however when actually naming the class RecorindThread or such.

But if you "extend" Thread, you shouldn't fundamentally change the semantics of join() (as I mentioned above), because this would be surprising for users.

And I don't really see why a user of that class would even want to use the inherited attributes.

BTW @HaHeho, do you want to add this example to the official examples?

Well that could be useful, don't you think?

Yes, I think so.

Of course not in it's current form.

We could discuss further changes in a PR.