Alsa: alsa_object_flush() crashes write operation in other thread

mse2 commented 6 years ago

Because alsa_object_flush() calls audio_object_close() it crashes running write operations in other thread. A patched version is here: https://gitlab.com/mseide-msegui/pcaudiolib

rhdunn commented 6 years ago

The call to audio_object_close is to fix echos in 8a013933f66b58c18e94eae9108fd573467f792b when using snd_pcm_drop to implement flush. This is because the audio data is still in the buffer. It can be seen when using an assistive technology, and quickly skipping through letters or commands in the readline history in the terminal.

The proper fix for this to support multiple threads is to add a lock guard around the write, flush and other calls to ensure they are in sync with each other.

mse2 commented 6 years ago

I could not reproduce the data still in buffer effect without audio_object_close() in the test program https://gitlab.com/mseide-msegui/mseuniverse/tree/master/samples/audio/pcaudio testprog The sound immediately follows the frequency and amplitude settings of the sine wave generator, maybe because on my system (openSUSE Leap 42.3) Alsa actually calls Pulseaudio in order to stream sound data it seems. https://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#ga7000ca6010a1a2739daddff8e2fbb440 states:

This function stops the PCM immediately. The pending samples on the buffer are ignored. For processing all pending samples, use ::snd_pcm_drain() instead. The function is thread-safe when built with the proper option.

In eSpeak-ng demo program with the MSEgui eSpeak-ng component tassistivehandler https://gitlab.com/mseide-msegui/mseuniverse/tree/master/tools/assistive/msecoupon there a is a flush delay in Alsa which does not exist in Pulseaudio. This delay is independent of calling audio_object_close() in alsa_object_flush() or not.

The proper fix for this to support multiple threads is to add a lock guard around the write, flush and other calls to ensure they are in sync with each other.

This would make it impossible to cancel speech operation from another thread, please remember https://github.com/espeak-ng/espeak-ng/issues/388

rhdunn commented 6 years ago

The issues were with using brltty (on Debian) with espeak-ng. This was one of the fixes for the issues they were having.

Note that with the locking support, it will only be for the pcaudiolib functions, not the espeak-ng synthesis logic itself. The espeak-ng code does not call write with a large audio chunk [1]. As such, setting the cancelled flag will at most cause the last audio chunk (write call) to be fully played.

[1] The calls to audio_object_write (speech.c:143) use outbuf, who's length is controlled by the buffer_length parameter passed to espeak_ng_InitializeOutput, which defaults to 60ms. More specifically, it is set to (buffer_length * samplerate) / 500. In my testing, this results in 2646 bytes, or 1323 s16 samples being written from dispatch_audio per call. When speaking just the word "hello", this results in 12 calls to audio_object_write. That results in 0.061833s (or 61.833ms) per call, close to the specified 60ms. Therefore, with the default buffer length, there will be at most a 62ms delay in cancelling, resulting in at most 62ms of additional audio being played.

rhdunn commented 6 years ago

Note: the 0.061833s is taken from using the linux time command to measure the duration of the espeak-ng hello call at 0m0.742s, then dividing it by 12 (the number of calls to audio_object_write measured by adding a printf call before the write call.

mse2 commented 6 years ago

Note that with the locking support, it will only be for the pcaudiolib functions, not the espeak-ng synthesis logic itself.

For MSEgui with the separate speech thread it is it is important that audio_object_flush() will not be blocked by a running write operation. 60 ms is still audible. And the more locks the more possible deadlocks. ;-)

mse2 commented 6 years ago

Fred van Stappen writes that with "create_audio_device_object(pchar('sysdefault'),nil,nil)" the Alsa device will be called directly without PulseAudio stream in-between. In this mode I can reproduce that not all sound buffers will be dropped which probably produces the echos you mentioned.

mse2 commented 6 years ago

I made a bug report: http://mailman.alsa-project.org/pipermail/alsa-devel/2018-March/133323.html

rhdunn / pcaudiolib

Alsa: alsa_object_flush() crashes write operation in other thread #12