Closed albseb511 closed 1 year ago
@albseb511 Thank you for using the JS Speech SDK and for writing this issue up. Could you provide sample code for a single-user reproduction? This error should only occur during speech synthesis. It is an audio output error in the SDK caused by a sourceBuffer.appendBuffer throw, but unfortunately the SDK eats the original error message (which could help us understand why the appendBuffer failed) and replaces it with that "buffer filled" message.
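To illustrate what "eats the original error" means, here is a minimal sketch of the pattern and of one way the original exception could be kept. It is not the SDK's source code: `appendChunk`, `FakeSourceBuffer`, and `describeFailure` are hypothetical names, and the fake buffer only imitates a browser `SourceBuffer`, whose `appendBuffer` can throw, for example, a `QuotaExceededError` when the buffer is full or an `InvalidStateError` when a previous append is still in progress.

```javascript
// Illustration only; this is not the Speech SDK's implementation.
function appendChunk(sourceBuffer, chunk) {
  try {
    sourceBuffer.appendBuffer(chunk);
  } catch (originalError) {
    // Keep the original exception reachable via `cause` instead of discarding it.
    throw new Error("buffer filled, pausing addition of binaries until space is made", {
      cause: originalError,
    });
  }
}

// Hypothetical stand-in for a full Media Source Extensions SourceBuffer.
class FakeSourceBuffer {
  appendBuffer() {
    const err = new Error("The SourceBuffer is full");
    err.name = "QuotaExceededError";
    throw err;
  }
}

// Shows that both the generic message and the underlying cause stay available.
function describeFailure() {
  try {
    appendChunk(new FakeSourceBuffer(), new Uint8Array(4));
    return "no error";
  } catch (err) {
    return `${err.message} (cause: ${err.cause.name})`;
  }
}

console.log(describeFailure());
```

With the cause attached, a log line would show which kind of `appendBuffer` failure sits behind the generic "buffer filled" text.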
I am using this in a React app. It's a bit large to share the entire thing.
The following is the sample code we use for speech recognition:
const startSpeechRecognition = async () => {
  const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
  if (!config) {
    const { token, region } = (await getAuthTokenAzure()) ?? {};
    if (!token || !region) return console.error(`Error getting token or region`);
    setConfig({ t: token, r: region });
  }
  const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(config?.t, config?.r);
  speechConfig.speechRecognitionLanguage = 'en-US';
  recognizerRef.current = new sdk.SpeechRecognizer(speechConfig, audioConfig);
  setLoading.on();
  recognizerRef.current.recognizing = handleRecognizing;
  recognizerRef.current.recognized = handleRecognized;
  recognizerRef.current.canceled = handleCanceled;
  recognizerRef.current.sessionStopped = handleSessionStopped;
  recognizerRef.current.startContinuousRecognitionAsync(
    () => {},
    (err) => {
      recognizerRef.current?.close();
      setLoading.off();
    }
  );
};

const handleRecognizing = (s, e) => {
  // handle response
};

const handleRecognized = (s, e) => {
  if (e.result.reason == sdk.ResultReason.RecognizedSpeech) {
    // logic
  } else if (e.result.reason == sdk.ResultReason.NoMatch) {
    // logic
  }
};

const handleCanceled = (s, e) => {
  if (e.reason == sdk.CancellationReason.Error) {
    console.error("Error in speech recognition:", e.errorDetails);
  }
  recognizerRef.current?.stopContinuousRecognitionAsync();
};

const handleSessionStopped = (s, e) => {
  recognizerRef.current?.stopContinuousRecognitionAsync();
};
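One detail in the snippet above may be worth checking. Assuming `config` and `setConfig` come from React's `useState`, calling `setConfig` does not change the `config` value seen by the current invocation, so on the first call `config?.t` and `config?.r` would still be undefined when they are passed to `fromAuthorizationToken`. A minimal sketch of resolving the credentials into local variables instead; `resolveCredentials` and its parameters are hypothetical stand-ins for `config`, `getAuthTokenAzure`, and `setConfig`:

```javascript
// Sketch only: return usable credentials right away instead of reading React
// state immediately after calling its setter.
async function resolveCredentials(cachedConfig, fetchToken, saveConfig) {
  if (cachedConfig) {
    return { token: cachedConfig.t, region: cachedConfig.r };
  }
  const { token, region } = (await fetchToken()) ?? {};
  if (!token || !region) {
    return null; // the caller decides how to report the failure
  }
  saveConfig({ t: token, r: region }); // state becomes visible on a later render
  return { token, region }; // use these values in the current call
}
```

The caller could then pass the returned `token` and `region` to `sdk.SpeechConfig.fromAuthorizationToken`. The synthesis snippet later in the thread already does this in its `!config` branch.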
We have seen the system break both while we are speaking and while synthesizing. But let me double-check that for you, since I am guessing you mean that audio data is pushed into the array buffer only during synthesis. I will check whether the issue comes up again while I am speaking.
In the meantime, this is the speech synthesis part.
I did try v1.3.1 as well, and the error still came up.
const startTextToSpeech = async (text: string, cancelEndCallback?: boolean): Promise<() => void> => {
  // Creates an audio instance.
  if (playerRef.current?.id) {
    playerRef.current?.pause()
    playerRef.current?.close()
    audioConfig.current?.close()
    playerRef.current = null
  }
  const player = new sdk.SpeakerAudioDestination()
  playerRef.current = player
  player.onAudioEnd = () => {
    // handle response
  }
  audioConfig.current = sdk.AudioConfig.fromSpeakerOutput(player)
  if (!config) {
    const { token, region } = (await getAuthTokenAzure()) ?? {}
    if (!token || !region) return console.log(`Error getting token or region`)
    setConfig({ t: token, r: region })
    const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(token, region)
    speechConfig.speechSynthesisVoiceName = voice
    speechSythesizerRef.current = new sdk.SpeechSynthesizer(speechConfig, audioConfig.current)
  } else {
    const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(config?.t, config?.r)
    speechConfig.speechSynthesisVoiceName = voice
    speechSythesizerRef.current = new sdk.SpeechSynthesizer(speechConfig, audioConfig.current)
  }
  // Receives a text from console input and synthesizes it to speaker.
  try {
    speechSythesizerRef.current.speakTextAsync(
      text,
      (result) => {
        if (result) {
          // debug statement with description
          speechSythesizerRef.current?.close()
          audioConfig.current?.close()
          speechSythesizerRef.current = null
          return result.audioData
        }
      },
      (error) => {
        console.log(error)
        speechSythesizerRef.current?.close()
        audioConfig.current?.close()
        speechSythesizerRef.current = null
      }
    )
    speechSythesizerRef.current.synthesisStarted = () => {
      // debug statement
    }
  } catch (err) {
    toast({
      title: 'Error',
      description: 'Error with text to speech',
      status: 'error',
      duration: 5000,
      isClosable: true,
    })
  }
  return () => {
    console.log(`closing player`)
    playerRef.current?.pause()
    playerRef.current?.close()
    audioConfig.current?.close()
    playerRef.current = null
    speechSythesizerRef.current?.close()
    speechSythesizerRef.current = null
    event.current?.close()
  }
}
@albseb511 Thanks for including sample code. Before you assign to any "Ref.current" variable, please make sure the existing Ref.current has been closed and nulled. This could easily be a memory leak (or just non-optimally disposing of resources) where an existing recognizer/synthesizer instance loses its assignment to a Ref.current without being closed. Example of what should be added:
// Close and release any existing synthesizer before creating a new one.
// (Ref name spelled as in the code above so the check targets the same ref.)
if (!!speechSythesizerRef.current) {
  speechSythesizerRef.current.close();
  speechSythesizerRef.current = null;
}
if (!config) {
  const { token, region } = (await getAuthTokenAzure()) ?? {}
  if (!token || !region) return console.log(`Error getting token or region`)
  setConfig({ t: token, r: region })
  const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(token, region)
  speechConfig.speechSynthesisVoiceName = voice
  speechSythesizerRef.current = new sdk.SpeechSynthesizer(speechConfig, audioConfig.current)
} else {
  const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(config?.t, config?.r)
  speechConfig.speechSynthesisVoiceName = voice
  speechSythesizerRef.current = new sdk.SpeechSynthesizer(speechConfig, audioConfig.current)
}
Could you add that (and for the speechRecognizer as well) and see if that helps?
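The same close-then-null step applies to every ref that holds a recognizer or synthesizer, so it could be factored into one small helper. This is only a sketch: `replaceInstance` and `FakeCloseable` are hypothetical names, and the fake class stands in for an SDK object whose `close()` releases its resources.

```javascript
// Hypothetical helper, not part of the Speech SDK: dispose of the instance a
// ref currently holds before storing a replacement.
function replaceInstance(ref, createInstance) {
  if (ref.current) {
    ref.current.close();
    ref.current = null;
  }
  ref.current = createInstance();
  return ref.current;
}

// Stand-in for a recognizer or synthesizer; only close() matters here.
class FakeCloseable {
  constructor(name) {
    this.name = name;
    this.closed = false;
  }
  close() {
    this.closed = true;
  }
}

const exampleRef = { current: null };
const first = replaceInstance(exampleRef, () => new FakeCloseable("first"));
const second = replaceInstance(exampleRef, () => new FakeCloseable("second"));
console.log(first.closed, second.closed); // true false
```

In the code from this thread, a call might look like `replaceInstance(speechSythesizerRef, () => new sdk.SpeechSynthesizer(speechConfig, audioConfig.current))`, with the same pattern for `recognizerRef`.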
I think we do manage this, but let me check again whether it is an issue, since we manage garbage collection in almost every place.
We have also noticed that it breaks even the first time, so there wouldn't have been other instances at that point.
I think this was missed at the beginning, as you mentioned. We do cleanups, but only on failure or in other cases, such as on change. I think it makes sense to have it at the start as well. We will test it with live users and get back to you.
@glharper @yulin-li Thanks so much, I think most of the issues are sorted. I should have asked this earlier; this seems to have been a miss for a while. We have still observed 1-2 cases breaking, but I am assuming that is due to some other reason.
A bit off topic, but I had two more follow-up questions.
If you can also point me to the right resources, that would be great.
Describe the bug
Speech recognition and speech synthesis fail every now and then. The error
buffer filled, pausing addition of binaries until space is made
comes up as a response. I have not been able to debug this.
To Reproduce
The bug is quite random, but it comes up about 1 out of 5 times, especially in longer conversations.
Expected behavior
The application seems to be crashing when it's synthesizing or when it's recognizing audio.
Version of the Cognitive Services Speech SDK
Platform
Additional context
I am not sure if I have a similar problem.
This is the error that is coming up. This issue comes up in two cases
Help is appreciated