Sample code for the Microsoft Cognitive Services Speech SDK
MIT License
One and a half minutes of silence followed by human voices always results in the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.' #2386
I found that when autoDetectConfig is set to two or more languages, recognizing the audio I provided, which starts with one and a half minutes of silence followed by human voices, always fails with the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.' Recognition is then cancelled before it ever starts recognizing the voices. I hit the same error when using a microphone as input: if the environment stays silent for about 1 minute and 9 seconds after recognition starts, recognition is likewise cancelled with the same error.
To Reproduce
Modify the sample in cognitive-services-speech-sdk/tree/master/samples/swift/ios/conversation-transcription, replacing the sample audio with the file I provided. The main change is configuring autoDetectConfig with two languages on the SPXConversationTranscriber.
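The relevant change to the sample looks roughly like the sketch below. This is a minimal illustration, not my exact project code: the subscription key, region, and language list are placeholders, and the initializer and handler signatures follow my reading of the SDK's Objective-C headers (the auto-detect initializer on the transcriber mirrors the one documented for SPXSpeechRecognizer).

```swift
import MicrosoftCognitiveServicesSpeech

// Placeholder credentials -- substitute real values.
let speechConfig = try SPXSpeechConfiguration(subscription: "YourSubscriptionKey",
                                              region: "YourServiceRegion")

// Auto-detect between two candidate languages -- the configuration
// that triggers the buffer error described in this report.
let autoDetectConfig = try SPXAutoDetectSourceLanguageConfiguration(["en-US", "zh-CN"])

// Feed the provided WAV file (with ~90 s of leading silence)
// instead of the microphone.
let audioConfig = SPXAudioConfiguration(wavFileInput: "21-01-28_ch1.wav")!

let transcriber = try SPXConversationTranscriber(
    speechConfiguration: speechConfig,
    autoDetectSourceLanguageConfiguration: autoDetectConfig,
    audioConfiguration: audioConfig)

transcriber.addTranscribedEventHandler { _, evt in
    print("TRANSCRIBED: \(evt.result?.text ?? "")")
}
transcriber.addCanceledEventHandler { _, evt in
    // The reported error surfaces here after roughly 69 s of silence,
    // before any speech is transcribed.
    print("CANCELED: \(evt.errorDetails ?? "")")
}

try transcriber.startTranscribingAsync { _, _ in }
```

With only one language configured (no autoDetectConfig), the same audio file transcribes without the buffer error in my tests.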
Expected behavior
The entire audio file should be recognized without throwing the error 'Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.'
Version of the Cognitive Services Speech SDK
I installed the library via CocoaPods; Podfile.lock shows MicrosoftCognitiveServicesSpeech-iOS 1.37.0.
Platform, Operating System, and Programming Language
OS: iOS
Hardware - iPhone 12 Pro ARM64
Programming language: Swift
Additional context
Error messages
Due to service inactivity, the client buffer exceeded maximum size. Resetting the buffer.
IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:
speech_log.log
A stripped down, simplified version of your source code that exhibits the issue. Or, preferably, try to reproduce the problem with one of the public samples in this repository (or a minimally modified version of it), and share the code.
I used the sample below, modified only the transcriber configuration, and replaced the input audio file: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/swift/ios/conversation-transcription
A WAV file of the input audio: 21-01-28_ch1.wav.zip