Open mse2 opened 6 years ago
The call to audio_object_close
is to fix echos in 8a013933f66b58c18e94eae9108fd573467f792b when using snd_pcm_drop
to implement flush. This is because the audio data is still in the buffer. It can be seen when using an assistive technology, and quickly skipping through letters or commands in the readline history in the terminal.
The proper fix for this to support multiple threads is to add a lock guard around the write, flush and other calls to ensure they are in sync with each other.
I could not reproduce the data still in buffer effect without audio_object_close() in the test program https://gitlab.com/mseide-msegui/mseuniverse/tree/master/samples/audio/pcaudio The sound immediately follows the frequency and amplitude settings of the sine wave generator, maybe because on my system (openSUSE Leap 42.3) Alsa actually calls Pulseaudio in order to stream sound data it seems. https://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html#ga7000ca6010a1a2739daddff8e2fbb440 states:
This function stops the PCM immediately. The pending samples on the buffer are ignored. For processing all pending samples, use ::snd_pcm_drain() instead. The function is thread-safe when built with the proper option.
In eSpeak-ng demo program with the MSEgui eSpeak-ng component tassistivehandler
https://gitlab.com/mseide-msegui/mseuniverse/tree/master/tools/assistive/msecoupon
there a is a flush delay in Alsa which does not exist in Pulseaudio. This delay is independent of calling audio_object_close() in alsa_object_flush() or not.
The proper fix for this to support multiple threads is to add a lock guard around the write, flush and other calls to ensure they are in sync with each other.
This would make it impossible to cancel speech operation from another thread, please remember https://github.com/espeak-ng/espeak-ng/issues/388
The issues were with using brltty (on Debian) with espeak-ng. This was one of the fixes for the issues they were having.
Note that with the locking support, it will only be for the pcaudiolib functions, not the espeak-ng synthesis logic itself. The espeak-ng code does not call write with a large audio chunk [1]. As such, setting the cancelled flag will at most cause the last audio chunk (write call) to be fully played.
[1] The calls to audio_object_write
(speech.c:143) use outbuf, who's length is controlled by the buffer_length parameter passed to espeak_ng_InitializeOutput
, which defaults to 60ms. More specifically, it is set to (buffer_length * samplerate) / 500
. In my testing, this results in 2646 bytes, or 1323 s16 samples being written from dispatch_audio
per call. When speaking just the word "hello", this results in 12 calls to audio_object_write
. That results in 0.061833s (or 61.833ms) per call, close to the specified 60ms. Therefore, with the default buffer length, there will be at most a 62ms delay in cancelling, resulting in at most 62ms of additional audio being played.
Note: the 0.061833s is taken from using the linux time
command to measure the duration of the espeak-ng hello
call at 0m0.742s
, then dividing it by 12 (the number of calls to audio_object_write
measured by adding a printf
call before the write call.
Note that with the locking support, it will only be for the pcaudiolib functions, not the espeak-ng synthesis logic itself.
For MSEgui with the separate speech thread it is it is important that audio_object_flush()
will not be blocked by a running write operation. 60 ms is still audible. And the more locks the more possible deadlocks. ;-)
Fred van Stappen writes that with "create_audio_device_object(pchar('sysdefault'),nil,nil)" the Alsa device will be called directly without PulseAudio stream in-between. In this mode I can reproduce that not all sound buffers will be dropped which probably produces the echos you mentioned.
I made a bug report: http://mailman.alsa-project.org/pipermail/alsa-devel/2018-March/133323.html
Because alsa_object_flush() calls audio_object_close() it crashes running write operations in other thread. A patched version is here: https://gitlab.com/mseide-msegui/pcaudiolib