Sample code for the Microsoft Cognitive Services Speech SDK
MIT License
One and a half minutes of silence followed by human voices always results in the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.' #2386
I found that when autoDetectConfig is set to two or more languages, recognizing the audio I provided, which starts with one and a half minutes of silence followed by human voices, always fails with the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.' Recognition is then cancelled before it ever starts recognizing the voices. I hit the same error when using a microphone as input: if the environment stays silent for about 1 minute and 9 seconds after recognition starts, recognition is likewise cancelled with the same error.
To Reproduce
Modify the sample in cognitive-services-speech-sdk/tree/master/samples/swift/ios/conversation-transcription, replacing the sample audio with the file I provided. The main change is configuring autoDetectConfig with two languages on the SPXConversationTranscriber.
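The relevant change to the sample looks roughly like the sketch below. This is a minimal illustration, not my exact project code: the subscription key, region, and language list are placeholders, and the initializer and handler signatures follow my reading of the SDK's Objective-C headers (the auto-detect initializer on the transcriber mirrors the one documented for SPXSpeechRecognizer).

```swift
import MicrosoftCognitiveServicesSpeech

// Placeholder credentials -- substitute real values.
let speechConfig = try SPXSpeechConfiguration(subscription: "YourSubscriptionKey",
                                              region: "YourServiceRegion")

// Auto-detect between two candidate languages -- the configuration
// that triggers the buffer error described in this report.
let autoDetectConfig = try SPXAutoDetectSourceLanguageConfiguration(["en-US", "zh-CN"])

// Feed the provided WAV file (with ~90 s of leading silence)
// instead of the microphone.
let audioConfig = SPXAudioConfiguration(wavFileInput: "21-01-28_ch1.wav")!

let transcriber = try SPXConversationTranscriber(
    speechConfiguration: speechConfig,
    autoDetectSourceLanguageConfiguration: autoDetectConfig,
    audioConfiguration: audioConfig)

transcriber.addTranscribedEventHandler { _, evt in
    print("TRANSCRIBED: \(evt.result?.text ?? "")")
}
transcriber.addCanceledEventHandler { _, evt in
    // The reported error surfaces here after roughly 69 s of silence,
    // before any speech is transcribed.
    print("CANCELED: \(evt.errorDetails ?? "")")
}

try transcriber.startTranscribingAsync { _, _ in }
```

With only one language configured (no autoDetectConfig), the same audio file transcribes without the buffer error in my tests.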
Expected behavior
The entire audio file should be recognized without throwing the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.'
Version of the Cognitive Services Speech SDK
I installed the library via CocoaPods; Podfile.lock shows MicrosoftCognitiveServicesSpeech-iOS 1.37.0.
Platform, Operating System, and Programming Language
OS: iOS
Hardware - iPhone 12 Pro ARM64
Programming language: Swift
Additional context
Error messages
Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.
IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:
speech_log.log
A stripped down, simplified version of your source code that exhibits the issue. Or, preferably, try to reproduce the problem with one of the public samples in this repository (or a minimally modified version of it), and share the code.
I used the sample below, modified only the transcriber configuration, and replaced the input audio file: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/swift/ios/conversation-transcription
A WAV file of the input audio: 21-01-28_ch1.wav.zip