Open bpasero opened 7 months ago
Thanks for using the Speech SDK and filing this issue. We have been able to reproduce the issue you are seeing, and have added fixing this issue to our backlog. We will update here once we have an update.
As a temporary workaround, you may want to consider passing a null value as the AudioConfig to the SpeechSynthesizer constructor. You can then subscribe to the Synthesizing event, which is raised whenever the SDK receives new audio from the service. You can then pass this audio to your player of choice, which should give you more control over when the audio playback stops. Please note, however, that calling StopSpeakingAsync may still stall for ~10-15 seconds due to the underlying issue.
(B-7172399)
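The workaround above could be sketched roughly as follows. This is a minimal illustration, not SDK code: `SynthesizingSource` and `ChunkBuffer` are hypothetical names, and the `synthesizing` callback and `event.result.audioData` shape are assumed to mirror the JavaScript flavor of the SDK. The idea is that your own code owns the audio chunks, so stopping playback does not have to wait for the synthesizer.

```typescript
// Hypothetical stand-in for the one synthesizer member this sketch relies on.
// Assumption: the real synthesizer exposes a `synthesizing` callback whose event
// carries the newly received audio as `event.result.audioData`.
interface SynthesizingEvent {
  result: { audioData: ArrayBuffer };
}

interface SynthesizingSource {
  synthesizing?: (sender: unknown, event: SynthesizingEvent) => void;
}

// Collects audio chunks as they arrive and lets the caller stop immediately.
class ChunkBuffer {
  private chunks: ArrayBuffer[] = [];
  private stopped = false;

  // Subscribe to the source so every new chunk lands in this buffer.
  attach(source: SynthesizingSource): void {
    source.synthesizing = (_sender, event) => {
      if (this.stopped) {
        return; // drop audio that still arrives after stop()
      }
      this.chunks.push(event.result.audioData);
    };
  }

  // Hand the next chunk to the player, or undefined if stopped or empty.
  next(): ArrayBuffer | undefined {
    return this.stopped ? undefined : this.chunks.shift();
  }

  // Stop at once: discard buffered audio and ignore anything that follows.
  stop(): void {
    this.stopped = true;
    this.chunks.length = 0;
  }

  // Number of chunks waiting to be played.
  pendingCount(): number {
    return this.chunks.length;
  }
}
```

A player loop would pull chunks with `next()` and feed them to whatever audio output it uses; calling `stop()` takes effect on the very next pull, regardless of how long the synthesizer itself takes to wind down.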
Thanks, good to see it can be reproduced and I am looking forward to the fix 👍
This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.
Hello, I am using version 1.37.0, and I have encountered a similar issue.
stopSpeaking does not immediately terminate the playback process; it only stops the speaker from playing.
For example, if I generate a 14-second audio and execute stopSpeaking at 10 seconds, then let speakResult = synthesizer?.speakSsml(ssml) will immediately return with speakResult?.reason = 9 (SPXResultReason_SynthesizingAudioCompleted) instead of 1 (SPXResultReason_Canceled). Moreover, the callback registered with synthesizer?.addSynthesisCompletedEventHandler is triggered after waiting for 4 seconds, rather than the callback registered with synthesizer?.addSynthesisCanceledEventHandler.
let ssml =
"<speak version='1.0' xml:lang='en-US' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='http://www.w3.org/2001/mstts'><voice name='\(identifier)'>\(mstts)</voice></speak>"
let speakResult = try self.synthesizer?.speakSsml(ssml)
print(speakResult?.reason ?? "")
try synthesizer?.stopSpeaking()
Here is a demo repository: https://github.com/wtto00/flutter_azure_speech/tree/main/example
The Swift code is in https://github.com/wtto00/flutter_azure_speech/blob/eb419b89fcc16903cabaa8f9820559d93ed80861/ios/Classes/AzureSpeechPlugin.swift#L294
This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.
Please keep.
Any update on this issue? I am also stuck with this.
wow can't believe this issue is still open after so long. Any update?
Hi, I suffer from the same issue. Still reproducible. Is there any workaround?
A temporary solution:
Use connection.close() instead of synthesizer.stopSpeaking().
hm... where do you get the connection object from? In my case the connection is somewhere under the hood of SpeechSynthesizer which I call/create using the config from SpeechConfig.fromSubscription
You get connection from Connection.fromSynthesizer. Here is an example: stopSynthesize
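As a rough sketch of that suggestion: the helper below obtains the connection from the synthesizer and closes it instead of calling stopSpeaking(). `stopByClosingConnection`, `ClosableConnection` and `ConnectionFactory` are hypothetical names for illustration; the factory is injected so the sketch stays independent of the SDK, and with the real SDK it would presumably be `s => sdk.Connection.fromSynthesizer(s)`.

```typescript
// Hypothetical minimal view of a connection: only close() is needed here.
interface ClosableConnection {
  close(): void;
}

// Hypothetical factory type. Assumption: with the real SDK this would wrap
// Connection.fromSynthesizer, as mentioned in the comment above.
type ConnectionFactory<S> = (synthesizer: S) => ClosableConnection;

// Look up the connection that belongs to the synthesizer and close it,
// rather than asking the synthesizer to stop speaking.
function stopByClosingConnection<S>(
  synthesizer: S,
  fromSynthesizer: ConnectionFactory<S>
): void {
  const connection = fromSynthesizer(synthesizer);
  connection.close();
}
```

As the replies in this thread note, audio that is already buffered may keep playing for a few seconds after the connection is closed.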
Ah, thanks. Cancellation is a bit faster with this than with synthesizer.close(), but the audio that is already buffered still plays for several seconds.
I now found a workaround by accessing the private audio object:
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

// DANGER! FRAGILE: uses private objects to work around issue: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2350
function KillAudio(synthesizer: sdk.SpeechSynthesizer) {
  // Cast to any because these members are not part of the public API.
  const audio: HTMLAudioElement | undefined =
    (synthesizer as any).privAdapter?.privSessionAudioDestination?.privDestination?.privAudio;
  if (audio) {
    // Kill the audio: pause playback and rewind to the start.
    audio.pause();
    audio.currentTime = 0;
  }
}
(this immediately stops the audio playback)
and then I call synthesizer.close().
But this is fragile code that accesses private objects; I still need to find a way to access that audio object in an official way.
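One direction that avoids private fields, sketched under assumptions: create the audio destination yourself, hand it to the synthesizer, and keep the reference so you can pause it directly. In the JavaScript SDK this would presumably mean constructing an sdk.SpeakerAudioDestination and passing it through sdk.AudioConfig.fromSpeakerOutput; the interfaces and the `PlaybackSession` class below are hypothetical stand-ins so the sketch stays self-contained. The thread does not confirm whether this fully avoids the delay.

```typescript
// Hypothetical minimal views of the two objects involved.
// Assumption: the SDK's speaker destination offers pause(), and the
// synthesizer offers close(), as used elsewhere in this thread.
interface PausablePlayer {
  pause(): void;
}

interface ClosableSynthesizer {
  close(): void;
}

// Keeps the player and the synthesizer together so both can be stopped at once.
class PlaybackSession {
  private readonly player: PausablePlayer;
  private readonly synthesizer: ClosableSynthesizer;
  private active = true;

  constructor(player: PausablePlayer, synthesizer: ClosableSynthesizer) {
    this.player = player;
    this.synthesizer = synthesizer;
  }

  // Silence playback first, then release the synthesizer.
  // Returns false if the session was already stopped.
  stop(): boolean {
    if (!this.active) {
      return false;
    }
    this.active = false;
    this.player.pause();
    this.synthesizer.close();
    return true;
  }
}
```

Pausing the player before closing the synthesizer means the user hears silence right away, even if the close itself takes a while.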
I use microsoft-cognitiveservices-speech-sdk for visemes, so I keep the synthesizer in a ReactJS ref.
import * as sdk from "microsoft-cognitiveservices-speech-sdk"

const synthesizeSpeech = text => {
  return new Promise((resolve, reject) => {
    if (!speechSynthesizerRef.current) {
      const speechConfig = sdk.SpeechConfig.fromSubscription(
        import.meta.env.VITE_SPEECH_KEY,
        import.meta.env.VITE_SPEECH_REGION
      )
      speechSynthesizerRef.current = new sdk.SpeechSynthesizer(speechConfig)
      let speechStarted = false
      .....
}
And to stop the speech, I did this:
const stopSpeech = () => {
  try {
    setImageIndex(0)
    setIsAudioPlaying(false)
    if (speechSynthesizerRef.current) {
      const audio =
        speechSynthesizerRef.current.privAdapter?.privSessionAudioDestination?.privDestination
          ?.privAudio
      if (audio) {
        audio.pause()
        audio.currentTime = 0
        speechSynthesizerRef.current.close()
        speechSynthesizerRef.current = null
      }
    }
  } catch (e) {
    console.error("Error in stopSpeech:", e)
  }
}
This helped in stopping the speech as well as resetting the synthesis, so if you play it again, the audio starts too.
Describe the bug
A call to SpeechSynthesizer.StopSpeakingAsync() does not stop synthesis for a very long time, up to 30 seconds. The log file is here: speech.log
This issue was previously reported without action at https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/1836 and https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2264
To Reproduce
We are building a node.js binding for Speech SDK and the C++ sources mimic the samples. The synthesis is implemented here: https://github.com/microsoft/node-speech/blob/967976ce0f4887a2b5b27f486e5209a51588516f/src/main.cc#L477
The call to StopSpeakingAsync is here: https://github.com/microsoft/node-speech/blob/967976ce0f4887a2b5b27f486e5209a51588516f/src/main.cc#L539
To reproduce from that module:
Have Node.js 18.x on the system
git clone https://github.com/microsoft/node-speech.git
In index.ts, append the snippet [1] at the end
npm i
node index.js
[1]
Expected behavior
Calling SpeechSynthesizer.StopSpeakingAsync immediately stops synthesis.
Version of the Cognitive Services Speech SDK
1.37.0
Platform, Operating System, and Programming Language
Additional context
This issue does not reproduce on macOS or Linux!