Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

method to stop the audio recognition StopAudioRecognText #1453

Closed vijayprakash1 closed 2 years ago

vijayprakash1 commented 2 years ago

Hi Everyone

We are trying to implement infinite streaming speech-to-text transcription. To make the transcription continuous we are using the startContinuousRecognition method, and we handle two events:

1. addRecognizingEventHandler, which gives interim results
2. addRecognizedEventHandler, which gives the final recognition result
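
A minimal Swift sketch of this pattern (key and region are placeholders, not our exact code):

```swift
import MicrosoftCognitiveServicesSpeech

// Sketch only: substitute your own subscription key and service region.
func startTranscription() throws -> SPXSpeechRecognizer {
    let config = try SPXSpeechConfiguration(subscription: "YourSubscriptionKey",
                                            region: "YourServiceRegion")
    // Default microphone input when only a speech configuration is given.
    let recognizer = try SPXSpeechRecognizer(speechConfiguration: config)

    // Interim (partial) results while the user is still speaking.
    recognizer.addRecognizingEventHandler { _, evt in
        print("Interim: \(evt.result.text ?? "")")
    }
    // Final result for each recognized utterance.
    recognizer.addRecognizedEventHandler { _, evt in
        print("Final: \(evt.result.text ?? "")")
    }

    // Continuous (streaming) recognition; runs until explicitly stopped.
    try recognizer.startContinuousRecognition()
    // The caller must keep a strong reference to the returned recognizer.
    return recognizer
}
```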

But we could not find a method such as StopAudioRecognText in the code to stop the audio recognition.

Once we stop the transcription, we get the following error:

*** Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: nullptr == Tap()

Please let me know if there is any solution to this.

dargilco commented 2 years ago

Hi @vijayprakash1, I'm Darren from the Speech SDK team. I have a few questions: What programming language are you using? What operating system? (I assume iOS or macOS based on the Apple namespace.) This is real-time recognition from microphone input, correct?

Objective-C has startContinuousRecognition and stopContinuousRecognition methods: https://docs.microsoft.com/objectivec/cognitive-services/speech/spxspeechrecognizer#startcontinuousrecognition

It's not clear to me what you mean by "no method to stop the audio recognition StopAudioRecognText. found in the code". You can stop recognition using stopContinuousRecognition.
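
In Swift, assuming `recognizer` is the SPXSpeechRecognizer you started, stopping looks roughly like this:

```swift
// Stop the continuous session started with startContinuousRecognition().
do {
    try recognizer.stopContinuousRecognition()
} catch {
    print("Failed to stop recognition: \(error)")
}
```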

It would help if you could share some of your source code and give more details about your scenario. We may also need an SDK log taken while you get the mentioned error. See here for instructions on how to get the log file: https://docs.microsoft.com/azure/cognitive-services/speech-service/how-to-use-logging
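
In Swift, enabling the log file looks roughly like the sketch below. The property ID is SPXSpeechLogFilename in the Objective-C headers; the exact Swift spelling of the enum case may differ by SDK version, so treat this as an assumption and check the generated Swift interface:

```swift
import MicrosoftCognitiveServicesSpeech

// Write the SDK log into the app's Documents directory so it can be
// retrieved from the device afterwards.
let documents = FileManager.default.urls(for: .documentDirectory,
                                         in: .userDomainMask)[0]
let logPath = documents.appendingPathComponent("speech-sdk.log").path

let config = try SPXSpeechConfiguration(subscription: "YourSubscriptionKey",
                                        region: "YourServiceRegion")
// Set the log file property before creating the recognizer from this
// configuration, otherwise nothing is logged. Swift import of the
// Objective-C setPropertyTo:byId: method (spelling is an assumption).
config.setPropertyTo(logPath, by: SPXPropertyId.speechLogFilename)
```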

vijayprakash1 commented 2 years ago

Hi @dargilco, Thanks for your response and apologies for the delayed response, below is our response to your questions:

Our use case scenario: We are building a transcription app that transcribes audio input (spoken words) in real time. We have three different ASRs in our app (Azure Speech to Text, Google Speech to Text, and IBM Speech to Text), and users can choose any of them based on their preference.

Current problem/issues we are facing: The current problem is that Microsoft Azure is not working efficiently on older iOS versions (12.5.3 or lower) and older iPhone devices. When we try to run the app on these devices, we get the following message on the console:

Speech recognizer object deallocated.

We are using the exact code available on GitHub; we haven't changed anything in the transcription code. We also tried getting the logs as you explained, but no logs were recorded.

Please let me know whether the current SDK is supported on older iOS versions, or what the potential issue could be.

Thanks! Waiting for your response.

jhakulin commented 2 years ago

@vijayprakash1 Thanks for the report. The first problem you mentioned seems to be coming from AVAudioEngine:

*** Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: nullptr == Tap()

However, I'm not sure whether that is still an issue, since your second message says the "current problem is that Microsoft Azure is not working efficiently on older iOS versions (12.5.3 or lower) and older iPhone devices".

Could you please clarify more with examples what "Microsoft Azure is not working efficiently" means?

Related to "Speech recognizer object deallocated or Audio config object deallocated" messages. These are not directly errors, but SDK side traces to indicate that in your app, speech recognizer, audio config objects gets deallocated e.g. due to automatic resource management or the object is set to nil. We should probably remove that trace from the dealloc methods to not cause noise in release builds.

vijayprakash1 commented 2 years ago

Thanks for the response @jhakulin,

Regarding "Could you please clarify more with examples what 'Microsoft Azure is not working efficiently' means?":

When I run the app on a recent device like the iPhone 12 with the latest iOS version, it works fine: I get the transcription data accurately, in the form of interim and final results. But on the older iOS versions I mentioned above, transcription is not working at all. When I speak, I don't get any transcription results from the API, and I see the "Speech recognizer object deallocated" / "Audio config object deallocated" traces. So I wanted to understand whether the current API only works on newer iOS versions, and what the minimum required version is.

jhakulin commented 2 years ago

@vijayprakash1 Could you confirm that an iPhone 6 on iOS 12.5.3 has the problems you mentioned? Could you also provide a Speech SDK log for the problem? https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-logging

The Speech SDK for iOS should work with iOS 12.5.3, as it is built with deployment target 9.3. https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk?tabs=ios%2Cubuntu%2Cios-xcode%2Cmac-xcode%2Candroid-studio

jhakulin commented 2 years ago

@vijayprakash1 Sorry, I gave you the wrong link in the earlier comment for enabling Speech SDK logs. Could you please try to get traces from the Speech SDK for the iPhone 6 issue? https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-logging

vijayprakash1 commented 2 years ago

Hi @jhakulin

I have tried the link you shared but, unfortunately, I am not able to get the SDK log data.

Please see our code snapshots. Also, could you let me know where these files are located in the application? Maybe I am looking in the wrong place.

(attached screenshots: file1, file 2, file 3)

jhakulin commented 2 years ago

@vijayprakash1 We have done further verification on earlier iOS versions with the just-released 1.21.0 Speech SDK for iOS. Using devices from an iPhone 5 (iOS 10.3.4) to an iPhone 8 (various iOS versions from 12.x to 15.x), the Speech SDK functionality has been verified as working. The 1.21.0 release also adds support for the armv7 architecture, so any of your devices using the older armv7 architecture should work as well. Could you please upgrade the SDK on your side to version 1.21.0 and let us know if you see any problems?
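
If you are using CocoaPods, the upgrade is a one-line pin (target name is a placeholder):

```ruby
# Podfile -- pin the iOS Speech SDK to the 1.21.0 release.
platform :ios, '9.3'

target 'YourApp' do
  pod 'MicrosoftCognitiveServicesSpeech-iOS', '~> 1.21.0'
end
```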

If you still have problems with the latest SDK, please try to get us a log. Based on your attached picture I cannot fully see which property name is used, but I assume it is correct. You may need to look at Apple's file management documentation for details on how to locate the Documents directory: https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileSystemOverview/FileSystemOverview.html
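
For example, a small Swift snippet to print the Documents directory and list its contents; the log file appears there if logging was enabled with a path in that directory:

```swift
import Foundation

// Print the app's Documents directory and everything inside it.
let documents = FileManager.default.urls(for: .documentDirectory,
                                         in: .userDomainMask)[0]
print("Documents: \(documents.path)")
if let files = try? FileManager.default.contentsOfDirectory(atPath: documents.path) {
    files.forEach { print($0) }
}
```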

vijayprakash1 commented 2 years ago

Hi @jhakulin

Please find the attached code for your reference.

We have tried all the solutions you suggested, but none of them worked.

Could you please run the code on your side and see if you can provide a solution to our problem?

Here is the code link

jhakulin commented 2 years ago

@vijayprakash1 Please see the attached sample code (ViewController.zip), which does continuous transcription in Swift. This sample is based on https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/swift/ios/from-external-microphone, as I understand you used that as the basis for your application. ViewController.zip

When you run that sample, you should be able to start and stop transcription using the Start/Stop button. Let us know if this helps.
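
For anyone hitting the original 'nullptr == Tap()' crash with this pattern, the stop path roughly follows the order below. This is a hedged sketch whose names assume the from-external-microphone structure (an AVAudioEngine tap feeding an SPXPushAudioInputStream):

```swift
import AVFoundation
import MicrosoftCognitiveServicesSpeech

// Teardown order matters: remove the tap before stopping the engine.
// Installing a second tap on a bus that already has one is what raises
// the 'required condition is false: nullptr == Tap()' exception.
func stopTranscription(audioEngine: AVAudioEngine,
                       pushStream: SPXPushAudioInputStream,
                       recognizer: SPXSpeechRecognizer) throws {
    audioEngine.inputNode.removeTap(onBus: 0)
    audioEngine.stop()

    // Close the push stream so the SDK sees end-of-stream, then stop.
    pushStream.close()
    try recognizer.stopContinuousRecognition()
}
```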

vijayprakash1 commented 2 years ago

Hi @jhakulin,

Thank you so much for the support, we appreciate it.

Our developer has started working on it; I will update you as soon as possible.

jhakulin commented 2 years ago

Closing the issue as resolved based on the information in email communication.