csdcorp / speech_to_text

A Flutter plugin that exposes device specific text to speech recognition capability.
BSD 3-Clause "New" or "Revised" License

Flutter Web, Android Chrome, speech to text returns spoken words multiple times #467

Closed: longtimedeveloper closed this issue 4 months ago

longtimedeveloper commented 4 months ago

Hi,

I have a Flutter web app that runs flawlessly on Windows Chrome.

When I run the same app on my Android Galaxy S20 FE, I can speak 5 words but speech_to_text returns 5, 10, 15, or 20 words. It repeats them. Some words are repeated 10 times.

Is there anything I can do to overcome this?

Thank you!

sowens-csd commented 4 months ago

hmm...that's distressing. Which version of STT are you using? Could you try 6.3.0 and see if it has the same behaviour? I made a change recently because I had misinterpreted the Web speech API. That change could easily lead to duplicates if the web speech API is implemented differently (incorrectly?) on different devices.

longtimedeveloper commented 4 months ago

@sowens-csd I have created an Android version of the app and ran it on the same Android device. STT works perfectly there. Well, except for the horrible Android support for the pause delay. The Android pause delay is why I chose to create a web app.

I will try 6.3.0 tonight and let you know.

longtimedeveloper commented 4 months ago

@sowens-csd I just tried 6.3.0. It did not work at all. No words recognized, ever. SpeechRecognitionResult result.recognizedWords is always an empty string after speaking.

sowens-csd commented 4 months ago

Thanks for trying it. Does the example app also show duplicates on the Galaxy Android Chrome? Do you have access to another android mobile device to test on?

longtimedeveloper commented 4 months ago

@sowens-csd no I only have this device. I am currently a volunteer in Ukraine near the frontlines. If you can upload the sample app to a website, and send me the link, I will try it for you, no problem.

I really hope you can sort out the duplicates issue. Sometimes I get 2 or 3 duplicates, then other times 20 or more duplicates for the same words spoken.

Thank you very much!

sowens-csd commented 4 months ago

Can you give me a couple of examples of what you say vs. what you receive? I'd like to understand what happens for a single word vs. a phrase. If you say one word, I assume you get multiple copies of that word? If you say a phrase, do you get each word in the phrase duplicated, the whole sentence duplicated, or something else? Also, are there alternates in the results or just a single value with duplicates in that value?

For example:

Say: hello Receive: ["hello hello hello"]

Say: where are you Receive: ["where are you where are you where are you"]

longtimedeveloper commented 4 months ago

@sowens-csd no problem, very happy to provide screen shots and the word or words I spoke to you tomorrow.

I have been up since 0200 and it's 2100 now. They keep firing artillery at us too. Another crazy night in Kherson.

First thing tomorrow morning. Thank you!!

sowens-csd commented 4 months ago

Not sure if this qualifies as good news but I have managed to reproduce the issue on a Galaxy S9. I'll have a look at it and see what's happening.

Hoping for a more peaceful time for you, no one should have to go through that.

sowens-csd commented 4 months ago

Started looking into this and found some useful information. It looks like there is a known issue with Chrome on Android duplicating words, though there isn't much info about it. Still, there are some things to explore: https://stackoverflow.com/questions/35112561/speech-recognition-api-duplicated-phrases-on-android

longtimedeveloper commented 4 months ago

@sowens-csd if I speak a single word, it does not get duplicated.

In this session I spoke "hello" twice.

[Image: spoke hello twice]

In this session, I spoke "glory" three times.

[Image: spoke glory 3 times two]

In this session, I spoke "glory" three times.

[Image: spoke glory three times]

In this session, I spoke "glory" three times.

[Image: went crazy]

sowens-csd commented 4 months ago

Thanks a lot for this. I'm seeing similar results, which is good as I can more easily troubleshoot. I originally made the change in response to PR #436. The issue there was that on the desktop when the user pauses speaking the recognition result splits the response into two phrases. So the plugin now accounts for that and concatenates multiple phrases into a single result.

On Android web the interaction is different. It doesn't seem to split into separate phrases on a pause and it seems to be returning multiple phrases with duplicates of the spoken content as it processes. The good news is that the correct complete result does seem to be in the set of phrases. I just need to find a clean way to throw out the junk.
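The difference between the two behaviours described above can be sketched in plain Dart. This is only an illustration, not the plugin's actual code: `aggregatePhrases` and `lastNonEmptyPhrase` are hypothetical helper names, and picking the last non-empty phrase is an assumed heuristic for "throwing out the junk", since the thread does not say which rule the plugin ended up using.

```dart
// Hypothetical helpers for illustration only; they are not part of
// speech_to_text.

/// Desktop-style handling: each entry is a separate phrase produced when
/// the speaker paused, so the phrases are joined into a single result.
String aggregatePhrases(List<String> phrases) =>
    phrases.map((p) => p.trim()).where((p) => p.isNotEmpty).join(' ');

/// Mobile-style handling (assumed heuristic): the entries are repeated,
/// growing attempts at the same utterance, so only the last non-empty
/// entry is kept instead of concatenating them.
String lastNonEmptyPhrase(List<String> phrases) {
  for (final phrase in phrases.reversed) {
    final trimmed = phrase.trim();
    if (trimmed.isNotEmpty) return trimmed;
  }
  return '';
}
```

For example, `aggregatePhrases(['where are', 'you'])` gives `where are you`, which is what the desktop case needs. Applied to an Android-style list such as `['glory', 'glory glory', 'glory glory glory']`, the same function gives six copies of `glory`, while `lastNonEmptyPhrase` gives `glory glory glory`.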

sowens-csd commented 4 months ago

I just pushed some changes to the repo that I think might resolve it. When you have a chance could you try the repo version? Here's the initialization change you can use for mobile browsers.

      var hasSpeech = await speech.initialize(
          onError: errorListener,
          onStatus: statusListener,
          options: [SpeechToText.webDoNotAggregate]);

Since the mobile and desktop browsers respond differently, and the plugin can't tell which one it is running on, you have to tell it. I think that using something like device_info_plus you could find out the browser type and optionally include this option. Let me know how it works on your device.
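One way to include the option only on mobile browsers is to inspect the user agent string. The sketch below is an assumption-heavy illustration: `looksLikeMobileBrowser` and `webSpeechOptions` are hypothetical helpers, and the substring check is a rough heuristic rather than reliable browser detection.

```dart
// Hypothetical helpers for illustration only; they are not part of
// speech_to_text or device_info_plus.

/// Rough guess at whether a user agent string belongs to a mobile browser.
/// Assumption: a simple substring check is good enough for this purpose.
bool looksLikeMobileBrowser(String userAgent) {
  final ua = userAgent.toLowerCase();
  return ua.contains('android') ||
      ua.contains('iphone') ||
      ua.contains('ipad') ||
      ua.contains('mobile');
}

/// Returns a list containing [mobileOnlyOption] for mobile browsers and an
/// empty list otherwise. Generic so the sketch does not depend on the
/// plugin's option type.
List<T> webSpeechOptions<T>(String userAgent, T mobileOnlyOption) =>
    looksLikeMobileBrowser(userAgent) ? [mobileOnlyOption] : <T>[];
```

In an app, the result of `webSpeechOptions(userAgent, SpeechToText.webDoNotAggregate)` could be passed as the `options:` argument of `speech.initialize`. The user agent string would have to come from the browser, for instance from the web browser info that device_info_plus reports; check that package's documentation for the exact field.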

longtimedeveloper commented 4 months ago

@sowens-csd OK. I can run the sample in the repo on my device.

longtimedeveloper commented 4 months ago

@sowens-csd I tried the example app. It works on Desktop Chrome.

However, on the Chrome Android, after pressing the Start button, nothing happens and it returns to its available state. In other words, it will not take input.

I have tried everything I can think of, but I can't get it to work.

Yes, for the website on Chrome I added the new setting, options: [SpeechToText.webDoNotAggregate].

sowens-csd commented 4 months ago

How odd. I tried it on Chrome Android and got good results. You're getting no errors? I don't think what I did would affect whether it takes results or not, just how it reports them.

What happens if you don't set that option?

longtimedeveloper commented 4 months ago

@sowens-csd with or without the option I'm getting the same results. Maybe something else caused the problem. There is no way for me to debug a website on a device. I have to run in release mode.

If it is working for you, maybe best to publish the package and I'll use the package and get the same results you are getting.

I did update the Flutter and Dart SDK constraints in the pubspec.yaml file to the latest version of Flutter that was just released. I noticed the example app has older versions in its pubspec.yaml. Not sure if this is the source of the problem.

I have no way of seeing errors on my Android device, so I am not sure what is happening.

I wish I could be more help.

longtimedeveloper commented 4 months ago

      name: speech_to_text
      description: A Flutter plugin that exposes device specific speech to text recognition capability.
      version: 6.4.1
      homepage: https://github.com/csdcorp/speech_to_text

      environment:
        sdk: '>=3.2.0 <4.0.0'
        flutter: '>=3.10.0'

      dependencies:
        flutter:
          sdk: flutter
        speech_to_text_platform_interface: ^2.1.0
        speech_to_text_macos: ^1.0.2
        json_annotation: ^4.0.0
        clock: ^1.0.1
        pedantic: ^1.9.2
        flutter_web_plugins:
          sdk: flutter
        meta: ^1.1.7
        js: ^0.6.3

      dev_dependencies:
        flutter_test:
          sdk: flutter
        build_runner: ^2.4.4
        json_serializable: ^6.7.0
        fake_async: ^1.3.1
        mockito: ^5.4.1
        plugin_platform_interface: ^2.1.4
        flutter_lints: ^3.0.0

      flutter:
        plugin:
          platforms:
            android:
              package: com.csdcorp.speech_to_text
              pluginClass: SpeechToTextPlugin
            ios:
              pluginClass: SpeechToTextPlugin
            web:
              pluginClass: SpeechToTextPlugin
              fileName: speech_to_text_web.dart
            macos:
              default_package: speech_to_text_macos

longtimedeveloper commented 4 months ago

      name: speech_to_text_example
      description: Demonstrates how to use the speech_to_text plugin.
      version: 1.1.0
      publish_to: 'none'

      environment:
        sdk: '>=3.2.0 <4.0.0'

      dependencies:
        flutter:
          sdk: flutter

        speech_to_text:
          path: ../
        provider: ^6.0.5

      dev_dependencies:
        flutter_test:
          sdk: flutter
        flutter_lints: ^3.0.0

      # The following section is specific to Flutter.
      flutter:
        uses-material-design: true

        assets:

longtimedeveloper commented 4 months ago

I opened up remote Chrome developer tools. Each time I press the Start button, I get the error that you see in the image.

It is probably something with the example app and not your package.

I have to go to a funeral for our teammate that was killed this week in an artillery attack.

[Image: Capture]

sowens-csd commented 4 months ago

I pushed 6.5.0 to pub.dev. Please let me know if you have a chance to try it.

My condolences on the death of your teammate.

longtimedeveloper commented 4 months ago

I will do this first thing in the morning. Thank you very much for your outstanding assistance. Much appreciated.

longtimedeveloper commented 4 months ago

@sowens-csd The update and adding the options: [SpeechToText.webDoNotAggregate] works perfectly on Android Chrome.

Thank you very much.