csdcorp / speech_to_text

A Flutter plugin that exposes device specific text to speech recognition capability.
BSD 3-Clause "New" or "Revised" License

Timeout is resulting in "error_no_match" instead of "error_timeout" #114

Open benzai10 opened 3 years ago

benzai10 commented 3 years ago

This started to happen on Android OS 9 and 10 (the only ones I checked) since version 2.3.0. I don't know if it's related to #82, but I can reproduce it with the sample application code.

sowens-csd commented 3 years ago

One thing I've seen that causes that is having the volume up on the device. On some devices it tries to recognize the start of recognition sound and fails. Have you tried it with the volume down or off?

benzai10 commented 3 years ago

Thanks for the answer, but to be clear, the issue is not that the recognition sound interferes. The timeout itself behaves as expected: after 4 seconds (the default timeout duration) an error is triggered. However, the error message is "error_no_match", where I would expect "error_timeout". This is crucial for my application because I need to distinguish between speech input that was not recognized (error_no_match) and time passing without any speech input (error_timeout).

sowens-csd commented 3 years ago

This didn't happen before the 2.3.0 release?

benzai10 commented 3 years ago

> This didn't happen before the 2.3.0 release?

In the version prior to v2.3.0, I received separate error messages ("error_no_match" and "error_timeout").

benzai10 commented 3 years ago

Do you have any updates on this?

sowens-csd commented 3 years ago

Thanks for the reminder, looking into it today. I'll update again soon.

sowens-csd commented 3 years ago

I have reproduced the error on the example. Looking for cause now.

benzai10 commented 3 years ago

Great, really appreciate your efforts! Thanks a lot.

sowens-csd commented 3 years ago

I don't see any reason this should have changed. As far as I can tell the code is the same as it was in previous versions, but you're right, the behaviour has definitely changed. I never see a timeout error in my testing, even with headphones, which should definitely remove the chance that it's hearing its own beep. Currently I think that Android or the Google Speech services underlying it have slipstreamed in some change that has affected the behaviour. My current thought for a mitigation is to watch the decibels and, if they stay below a certain level, return a timeout rather than a no match. It's a bit of a hack but might help with your issue until there is a fix. I could also include the time in the calculation, so if it is at the expected timeout and the decibels are low then it was probably a timeout. Unfortunately I'm not sure that the timeout is consistent across devices: on the Samsung I'm currently testing with it is 5 seconds, but you said 4 in your testing.
Any thoughts on whether that would work for your case?
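
For readers following along, the proposed mitigation could be sketched roughly like this in Dart. This is only an illustration of the idea described above, not the plugin's actual code: `reclassifyError`, its parameters, and both default values are hypothetical, and the threshold and timeout are placeholders that likely differ per device.

```dart
// Error strings as they are quoted in this thread.
const errorNoMatch = 'error_no_match';
const errorTimeout = 'error_timeout';

/// Hypothetical heuristic: report a timeout instead of a no-match when the
/// session stayed quiet AND the error arrived at or after the expected
/// timeout. Any other error string is passed through unchanged.
String reclassifyError({
  required String errorMsg,
  required double maxRmsDb, // loudest level seen during the listen session
  required Duration elapsed, // time between start of listening and the error
  double quietThresholdDb = 9.0, // placeholder, assumed to vary by device
  Duration expectedTimeout = const Duration(seconds: 4), // placeholder
}) {
  if (errorMsg != errorNoMatch) return errorMsg;
  final wasQuiet = maxRmsDb < quietThresholdDb;
  final ranFullTimeout = elapsed >= expectedTimeout;
  return (wasQuiet && ranFullTimeout) ? errorTimeout : errorNoMatch;
}
```

Requiring both conditions keeps an early no-match (for example a hand clap after one second) from being relabelled, at the cost of depending on two device-specific values.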

benzai10 commented 3 years ago

Thanks for sharing the results of your investigation. At the moment, I have a workaround in place that uses a stopwatch to measure the elapsed time and decide whether an "error_no_match" is really a timeout. I guess we have to live with this hack for the moment. Is there any way to corroborate your hunch that the cause lies one layer further down, at the Android OS or Google Speech service level?
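
A stopwatch-based workaround along the lines described above might look something like the following. This is a guess at the general shape, not the code actually used in the project: `classifyByElapsed`, `ListenTimer`, the 4 second timeout, and the 250 ms tolerance are all assumptions made for illustration.

```dart
// Error strings as they are quoted in this thread.
const errorNoMatch = 'error_no_match';
const errorTimeout = 'error_timeout';

/// Hypothetical helper: treats a no-match as a timeout when it arrives
/// roughly at the expected timeout. Other errors are passed through.
String classifyByElapsed(
  String errorMsg,
  Duration elapsed, {
  Duration expectedTimeout = const Duration(seconds: 4), // assumed value
  Duration tolerance = const Duration(milliseconds: 250), // assumed value
}) {
  if (errorMsg != errorNoMatch) return errorMsg;
  return elapsed >= expectedTimeout - tolerance ? errorTimeout : errorNoMatch;
}

/// Hypothetical wrapper around dart:core's Stopwatch for one listen session.
class ListenTimer {
  final Stopwatch _stopwatch = Stopwatch();

  /// Call when listening starts.
  void start() {
    _stopwatch
      ..reset()
      ..start();
  }

  /// Call from the error callback with the reported error string.
  String classify(String errorMsg) {
    _stopwatch.stop();
    return classifyByElapsed(errorMsg, _stopwatch.elapsed);
  }
}
```

In an app, `start()` would be called right before listening begins and `classify(errorNotification.errorMsg)` from the error handler. Since the timeout appears to differ between devices, the expected timeout would probably need to be tuned per device.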

sowens-csd commented 3 years ago

I'll post something on the Android issue list but given how long our other confirmed issue has been open I'm not optimistic. The problem is that there's no actual public spec of what those services should do afaik, just an outline based on its behaviour. It makes it very hard to know if they've deviated from expectations.

What value are you using for your stopwatch? I think the decibel threshold could be a decent approach, but unfortunately I'm not sure if the decibels are consistent across different devices either.

sowens-csd commented 3 years ago

I just committed an experimental change that uses the decibels to try to detect a timeout and changes the returned error code. Could you try it out and let me know if it's working on your device?

benzai10 commented 3 years ago

Okay, understood. The value I currently use is 4 seconds, being aware that different devices behave differently. Unfortunately, I also have to learn on the fly what the differences among devices are... The behavior of the built-in earcons, for example, also varies across devices.

benzai10 commented 3 years ago

Okay, I will check your experimental change

benzai10 commented 3 years ago

I checked your experimental change but it didn't change anything (still get only the "error_no_match" error message). I use

print('speech error: ${errorNotification.errorMsg}');

to display the error message.

I did the following:

In pubspec.yaml:

speech_to_text:
  git:
    url: https://github.com/csdcorp/speech_to_text.git
    ref: main

In the console:

flutter clean
flutter pub get

Devices I checked with:

sowens-csd commented 3 years ago

Yes that pubspec looks right. If you have time could you try it with debugLogging set to true on the initialize and let me know what the rms values are in the log during what you think should be a time-out?

benzai10 commented 3 years ago

Interesting, when I activate debugLogging, I get the timeout error as expected. The rms values are as follows:

D/SpeechToTextPlugin(26637): rmsDB -2.0 / 6.5199995
D/SpeechToTextPlugin(26637): Stop listening
D/SpeechToTextPlugin(26637): Notify listening
D/SpeechToTextPlugin(26637): Notify listening done
D/SpeechToTextPlugin(26637): Stop listening done
D/SpeechToTextPlugin(26637): rmsDB -2.0 / 6.5199995
D/SpeechToTextPlugin(26637): rmsDB -2.0 / 6.5199995
D/SpeechToTextPlugin(26637): Error 7 after start at 4160 -2.0 / 6.5199995

As soon as I remove the debugLogging though, I don't get the timeout error anymore. Hope this helps.
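
To compare readings across devices, log lines like the ones above could be parsed with a small Dart helper. This is a sketch under assumptions: `RmsReading`, `parseRmsLine`, and `peakRms` are hypothetical names, and treating the two numbers as a low and a high level is my reading of the output, not something confirmed in this thread.

```dart
/// Hypothetical holder for the two numbers on an "rmsDB a / b" log line.
/// Interpreting them as low / high levels is an assumption.
class RmsReading {
  const RmsReading(this.low, this.high);
  final double low;
  final double high;
}

final RegExp _rmsPattern =
    RegExp(r'rmsDB\s+(-?\d+(?:\.\d+)?)\s*/\s*(-?\d+(?:\.\d+)?)');

/// Returns the reading from one log line, or null if the line has none.
RmsReading? parseRmsLine(String line) {
  final match = _rmsPattern.firstMatch(line);
  if (match == null) return null;
  return RmsReading(
    double.parse(match.group(1)!),
    double.parse(match.group(2)!),
  );
}

/// Returns the largest second value across all rmsDB lines, or null if
/// none of the lines contains a reading.
double? peakRms(Iterable<String> lines) {
  double? peak;
  for (final line in lines) {
    final reading = parseRmsLine(line);
    if (reading == null) continue;
    if (peak == null || reading.high > peak) peak = reading.high;
  }
  return peak;
}
```

Lines without an `rmsDB` entry, such as "Stop listening", are simply skipped, so a whole captured log can be passed to `peakRms` as a list of strings.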

sowens-csd commented 3 years ago

Definitely helpful. I must have done some of the calls inside the debug condition or something. I’ll check, thanks!

sowens-csd commented 3 years ago

I can't reproduce that result. It works for me whether in debug or not. I did notice that your rms level was very close to the threshold. I just committed a change to move the threshold to 9. Could you try that version and see if you're getting the same result?

benzai10 commented 3 years ago

I tried the latest version with the following result: I always get "error_no_match", regardless of whether debugLogging is on or off. The rmsDB values are as follows:

On a Pixel 3a:

D/SpeechToTextPlugin(31348): rmsDB -2.0 / 10.0

On a LG G6:

D/SpeechToTextPlugin(31348): rmsDB -2.0 / 8.5 ... 10.0 *)

*) On the Pixel 3a, the last value does not seem to change despite different surrounding noise levels; on the LG G6, it fluctuates between 8.5 and 10.0. (I don't know much about the rmsDB value, so I don't really know how to interpret these numbers.)

I couldn't get an "error_timeout" anymore.

sowens-csd commented 3 years ago

Did you try it with headphones in? The rmsDB value is the decibels seen during the listen session. The theory behind this change was to turn the error_no_match into an error_timeout if the sound level was too low to have been speech. If a 'low' sound reading varies too much across devices, that can't be made reliable, so it looks like this approach is not going to work. I'll look into trying to find the timeout value from the device; then I could use that as a more reliable signal.

As I've been thinking about it and trying to reproduce the behaviour I'm not sure how the underlying Android speech lib differentiates between the two cases. If it has sound coming in on the microphone then I'd assume it would have to give an error_no_match. The only way I can see a timeout being detected would be if there was basically no sound during the timeout period. I suppose there could be some pattern matching it can do to determine that despite hearing sound that the sound is not speech and therefore it should just report a timeout.

benzai10 commented 3 years ago

Just tried it with headphones in, didn't change anything.

From my perspective, the error_timeout shouldn't be coupled with sound level. As I understand it, the speech engine tries to recognize a word during the time period until the timeout occurs. So after this time period there are two possible outcomes in my opinion: either we get a recognized word array (with its probability scores), or there was a sound level change that the speech engine interpreted as speech input but couldn't recognize (for example a hand clap). The latter throws an error_no_match and, as far as I'm concerned, this works as expected. When neither of these two cases happens, an error_timeout should occur. I'm concerned that if the timeout detection is tied to sound level changes, the result will be more unpredictable behavior.

benzai10 commented 3 years ago

I stand corrected. On the Pixel 3a, I do get the error_timeout, but only with headphones plugged in. My application use case requires speech recognition without headphones, though...

2nd edit: It also works on my other phone, the LG G6 (again only with headphones in). I guess you have changed something?

sowens-csd commented 3 years ago

I still don't have a good solution to this and am still not sure why the Android behaviour has changed. I'm still trying to come up with a decent workaround.

benzai10 commented 3 years ago

No problem, let's see how the Android behavior evolves. At the moment, I'm fine with the above-mentioned stopwatch-based workaround I implemented in my project.

searleser97 commented 1 year ago

Is there any way I can catch this error and restart listening?