csdcorp / speech_to_text

A Flutter plugin that exposes device specific speech to text recognition capability.
BSD 3-Clause "New" or "Revised" License

SpeechRecognition does not work as expected on iOS #115

Status: Closed (ebeyabraham closed this issue 3 years ago)

ebeyabraham commented 3 years ago

On iOS, SpeechRecognitionResult.finalResult does not indicate whether speech has finished. The plugin keeps listening until the listenFor duration elapses, even if no words are detected. I tried to fix this with the 'pauseFor' parameter, but that does not help identify whether the speech has finished either.

The error can be replicated in the example app provided with the package.

sowens-csd commented 3 years ago

iOS doesn't time out in the same way as Android. What value are you using for pauseFor to get that result? If I use a pauseFor of 3 seconds with a listenFor of 60 seconds I do see the example app stop after ~3 seconds of silence.

ebeyabraham commented 3 years ago

I can make the example app stop listening by using pauseFor, but SpeechRecognitionResult.finalResult is still false at the end. I want to call another function when listening ends, and SpeechRecognitionResult.finalResult seems to be the only flag I can use to detect whether the result is final.

oleksandrtaran commented 3 years ago

@MrGrayCode yes, it's true that SpeechRecognitionResult.finalResult is always false at the end of recognition on iOS. On Android, by contrast, it works consistently: true is returned with the last recognition result. I developed a complete app (already in the store) that relies on this logic, and it is possible to work around the "finalResult" issue on iOS. I used listenFor, pauseFor and a bunch of other timers to work around it. Basically, you can use the onError and onStatus listeners from initialize to create a reliable solution for iOS.

But yes, it would be great to have consistent iOS behavior for finalResult as it is on Android.
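
A minimal sketch of a status-driven workaround of this kind, in plain Dart. This is not the code from the app mentioned above: `ListeningTracker` and its method names are hypothetical. It assumes the status callback eventually reports 'notListening' and treats an error as the end of the session too. If the status is not delivered on a given platform, a timer fallback would still be needed.

```dart
/// Hypothetical helper, not part of speech_to_text.
/// Remembers the latest recognized words and reports them exactly once
/// when the session ends, regardless of what finalResult says.
class ListeningTracker {
  ListeningTracker(this.onFinished);

  /// Called once per session with the last words that were recognized.
  final void Function(String words) onFinished;

  String _lastWords = '';
  bool _reported = false;

  /// Call right before starting a new listen session.
  void start() {
    _lastWords = '';
    _reported = false;
  }

  /// Call from the onResult callback.
  void handleResult(String recognizedWords, bool finalResult) {
    _lastWords = recognizedWords;
    if (finalResult) _finish(); // Android path: the flag is reliable there.
  }

  /// Call from the onStatus callback.
  void handleStatus(String status) {
    if (status == 'notListening') _finish();
  }

  /// Call from the onError callback.
  void handleError() => _finish();

  void _finish() {
    if (_reported) return; // Avoid reporting the same session twice.
    _reported = true;
    onFinished(_lastWords);
  }
}
```

To wire it up, pass `handleStatus` and `handleError` through the onStatus and onError listeners given to initialize, and call `handleResult(result.recognizedWords, result.finalResult)` from the onResult callback given to listen.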

ebeyabraham commented 3 years ago

@oleksandrtaran my current workaround is using SpeechRecognitionResult.confidence which returns a value > 0.0 only after listening is complete.
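
As a rough sketch, that check could look like the following. `looksFinal` is a hypothetical name, and the rule relies only on the observation above that confidence is greater than 0.0 once listening is complete; it is not documented plugin behaviour.

```dart
/// Hypothetical helper: treats a result as final when the platform says so
/// (Android) or when a non-zero confidence has been reported (the iOS
/// behaviour observed in this thread).
bool looksFinal({required bool finalResult, required double confidence}) {
  return finalResult || confidence > 0.0;
}
```

Inside an onResult callback this would be called as `looksFinal(finalResult: result.finalResult, confidence: result.confidence)`.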

sowens-csd commented 3 years ago

Thanks for explaining. I'll look into that.

FlutterClutter commented 3 years ago

I have the same issue in my production app. Do you think a fix within a week is realistic @sowens-csd ?

sowens-csd commented 3 years ago

I'm looking at it now. I'll try to provide another update today. Right now I don't understand why stop isn't causing the final result to be true. Once I do I'll have a better idea of how complicated it would be to fix it.

sowens-csd commented 3 years ago

Can anyone confirm if you have speech start / stop sounds defined for iOS? There appears to be an error in the case where you do not have any sounds.

sowens-csd commented 3 years ago

Currently it looks like it has something to do with the threading model. From the example app when I use the 'stop' button I get finalResult true. If I set a pauseFor of 3 seconds and let it elapse then finalResult is false. The only difference that I can think of so far, since both call the same channel method, is that the pauseFor invokes it on a background thread.

FlutterClutter commented 3 years ago

I have no start/stop sounds defined for iOS, if that helps

sowens-csd commented 3 years ago

Yes, definitely helpful, thanks.

Okay, I think I see what's happening. The issue seems to be that the speech recognizer only gives a final result when it is done processing. Even clicking the stop button manually doesn't give final results if there is a pause between the end of speech and when you click the stop button. The speech recognizer must think that there's indeterminate speech in that pause which means that it can't give a final result. If you click stop right after a recognized word then you get a final result.

Given that behaviour I think I'm going to create a synthetic final result. In other words I'll set finalResult to true whenever it is the last result STT is going to pass back to its client, whether or not the underlying speech subsystem thinks it is final. Let me know if you see any issues with that approach.
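
To illustrate the idea only, here is a small plain-Dart sketch of a synthetic final result. It is not the plugin's implementation: `Result`, `SyntheticFinalizer` and the method names are hypothetical stand-ins. Results are forwarded as they arrive, and when listening stops without a final result the last one is re-sent with the final flag set.

```dart
/// Hypothetical stand-in for a recognition result.
class Result {
  const Result(this.words, this.isFinal);
  final String words;
  final bool isFinal;
}

/// Hypothetical wrapper that guarantees the client sees a final result.
class SyntheticFinalizer {
  SyntheticFinalizer(this.emit);

  /// Delivers results to the client.
  final void Function(Result result) emit;

  Result? _last;

  /// Forward every result from the underlying recognizer unchanged.
  void onRecognizerResult(Result result) {
    _last = result;
    emit(result);
  }

  /// When listening stops, promote the last partial result to final
  /// if the recognizer never produced a final one itself.
  void onStopped() {
    final last = _last;
    _last = null;
    if (last != null && !last.isFinal) {
      emit(Result(last.words, true));
    }
  }
}
```

One consequence of this design is that the client can receive the same words twice, first as a partial result and then as the synthetic final one.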

sowens-csd commented 3 years ago

Okay, there is an experimental fix in main now. If anyone has time please give it a try. There should be no code changes required. The behaviour changes are these:

FlutterClutter commented 3 years ago

Thank you for providing this experimental fix. The statusListener now successfully outputs notListening (which it did not before), but unfortunately it still takes between 4 and 6 seconds to receive the notListening status after you stop talking.

sowens-csd commented 3 years ago

What value are you using in pauseFor to get that result?

sowens-csd commented 3 years ago

2.4.0 is now available with this fix.