csdcorp / speech_to_text

A Flutter plugin that exposes device specific text to speech recognition capability.
BSD 3-Clause "New" or "Revised" License

[Bug] word results keep previous words even though "final: true" #127

Closed fenchai23 closed 3 years ago

fenchai23 commented 3 years ago

I have been playing around with the lib and at first everything worked as expected. After playing with it a bit more, there seems to be a bug that makes speech.listen() keep the previous words even when final: true is reported and speech.cancel() is run.

I'm not sure if there are simply a lot of bugs here or if I am missing something. I spoke 1, then 2, then 3, and each time after I spoke I called speech.stop().

take a look at the log here...

I/flutter (20658): listening
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: ,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: ,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: una,  confidence: -1.0], final: false
I/flutter (20658): notListening
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: una,  confidence: -1.0], final: true
I/flutter (20658): listening
I/flutter (20658): error_busy - true
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: ,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: ,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: 12,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: ,  confidence: -1.0], final: false
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: 123,  confidence: -1.0], final: false
I/flutter (20658): notListening
I/flutter (20658): SpeechRecognitionResult words: [SpeechRecognitionWords words: 123,  confidence: -1.0], final: true

(btw listenFor and pauseFor cause a lot of trouble, like always giving me error_busy and notListening, so I removed them. If you have a solution for that, please let me know. It seems a lot of people have been complaining about it in other issues too.)

here is the full code:

import 'package:auto_size_text/auto_size_text.dart';
import 'package:avatar_glow/avatar_glow.dart';
import 'package:flutter/material.dart';
import 'package:hyl/screens/pages/catalogPageTree/providers/catalogDataProvider.dart';
// import 'package:hyl/services/showOverlay.dart';
import 'package:permission_handler/permission_handler.dart';
import 'package:speech_to_text/speech_recognition_error.dart';
import 'package:speech_to_text/speech_to_text.dart';
import 'package:provider/provider.dart';

class VoiceInput extends StatefulWidget {
  @override
  _VoiceInputState createState() => _VoiceInputState();
}

class _VoiceInputState extends State<VoiceInput> {
  @override
  Widget build(BuildContext context) {
    return SpeechScreen();
  }
}

class SpeechScreen extends StatefulWidget {
  @override
  _SpeechScreenState createState() => _SpeechScreenState();
}

class _SpeechScreenState extends State<SpeechScreen> {
  final SpeechToText _speech = SpeechToText();
  bool _hasSpeech = false;
  List<LocaleName> _localeNames = [];
  String _currentLocaleId = "NA";
  String _preferredLocaleId = 'es_PA';
  bool _isListening = false;
  String _defaultText = 'Presiona el botón y comienza a hablar';
  String _listeningText = 'Escuchando...';
  String _text = 'Presiona el botón y comienza a hablar';
  double _confidence = 0;

  @override
  void initState() {
    super.initState();

    initSpeechState();
  }

  bool preferredlocaleIdExists(prefId) {
    for (final k in _localeNames) {
      if (k.localeId == prefId) return true;
    }
    return false;
  }

  Future<void> initSpeechState() async {
    _hasSpeech = await _speech.initialize(
      onStatus: (val) => statusListener(val),
      onError: (val) => errorListener(val),
    );

    if (_hasSpeech) {
      _localeNames = await _speech.locales();

      var systemLocale = await _speech.systemLocale();

      _currentLocaleId = preferredlocaleIdExists(_preferredLocaleId)
          ? _preferredLocaleId
          : systemLocale.localeId;
    }

    if (!mounted) return;

    setState(() {
      _hasSpeech = _hasSpeech;
    });
  }

  _switchLang(selectedVal) {
    setState(() {
      _currentLocaleId = selectedVal;
    });
    print(selectedVal);
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        centerTitle: true,
        titleSpacing: 0,
        automaticallyImplyLeading: false,
        title: AutoSizeText(
          'Rec de Voz | ${(_confidence * 100.0).toStringAsFixed(1)}%',
          style: Theme.of(context).textTheme.headline6,
        ),
        actions: [
          Visibility(
            visible: (_text != _defaultText && _text != _listeningText)
                ? true
                : false,
            child: IconButton(
              icon: Icon(
                Icons.check_circle,
                color: Theme.of(context).primaryColor,
              ),
              onPressed: () {
                Navigator.of(context).pop();
                context
                    .read<CatalogDataProvider>()
                    .setCatalogSearchQuery(_text);
              },
            ),
          )
        ],
      ),
      floatingActionButton: AvatarGlow(
        animate: _isListening ? true : false,
        glowColor: Theme.of(context).primaryColor,
        endRadius: 75.0,
        duration: const Duration(milliseconds: 800),
        repeat: true,
        child: FloatingActionButton(
          backgroundColor:
              _isListening ? Colors.blueGrey : Theme.of(context).primaryColor,
          onPressed: _listen,
          child: Icon(!_hasSpeech
              ? Icons.mic_off
              : _isListening ? Icons.mic : Icons.mic_none),
        ),
      ),
      floatingActionButtonLocation: FloatingActionButtonLocation.centerFloat,
      body: SingleChildScrollView(
        reverse: true,
        child: Column(
          children: [
            _hasSpeech
                ? Center(
                    child: DropdownButton(
                      onChanged: (selectedVal) => _switchLang(selectedVal),
                      value: _currentLocaleId,
                      items: _localeNames
                          .map(
                            (localeName) => DropdownMenuItem(
                              value: localeName.localeId,
                              child: Text(
                                localeName.name,
                                style: TextStyle(
                                    color: Theme.of(context)
                                        .colorScheme
                                        .onPrimary),
                              ),
                            ),
                          )
                          .toList(),
                      icon: Icon(
                        Icons.language,
                        color: Theme.of(context).primaryColor,
                      ),
                      dropdownColor: Theme.of(context).primaryColor,
                    ),
                  )
                : Container(),
            Container(
              padding: EdgeInsets.fromLTRB(30.0, 30.0, 30.0, 150.0),
              child: Text(
                _text,
                style: TextStyle(
                  fontSize: 32,
                  fontWeight: FontWeight.w400,
                ),
                textAlign: TextAlign.center,
              ),
            ),
          ],
        ),
      ),
    );
  }

  String parseListenedText(String text) {
    String modText = text;
    RegExp re = new RegExp(r'\d+ \d+');
    Iterable matches = re.allMatches(text);
    matches.forEach((match) {
      var ogMatch = match.group(0);
      modText =
          modText.replaceAll(ogMatch, match.group(0).replaceAll(' ', '-'));
    });
    return modText;
  }

  Future<void> _listen() async {
    if (await _hasMicPermissionGranted() == false) {
      Permission.microphone.request().then(
        (PermissionStatus perm) {
          if (perm == PermissionStatus.granted) {
            initSpeechState();
          }
        },
      );
    }
    if (!_isListening) {
      if (_hasSpeech) {
        setState(() {
          _isListening = true;
          _text = _listeningText;
        });

        _speech.listen(
          localeId: _currentLocaleId,
          // listenFor: Duration(seconds: 30),
          // pauseFor: Duration(seconds: 3),
          onResult: (val) => setState(() {
            print(val);
            _text = parseListenedText(val.recognizedWords);
            if (val.hasConfidenceRating && val.confidence > 0) {
              _confidence = val.confidence;
            }
          }),
        );
      }
    } else {
      setState(() {
        _isListening = false;
        _text = (_text == _listeningText) ? _defaultText : _text;
      });

      await _speech.stop();
    }
  }

  Future<bool> _hasMicPermissionGranted() async {
    PermissionStatus micStatus = await Permission.microphone.status;
    if (micStatus == PermissionStatus.granted) {
      return true;
    } else if (micStatus == PermissionStatus.permanentlyDenied ||
        micStatus == PermissionStatus.restricted) {
      openAppSettings();
      return false;
    } else {
      return false;
    }
  }

  void errorListener(SpeechRecognitionError error) {
    // _speech.cancel();

    setState(() {
      _isListening = false;

      if (_text == _defaultText || _text == _listeningText)
        _text = _defaultText;
    });

    print("${error.errorMsg} - ${error.permanent}");

    // ShowOverlay.snackbar(
    //   "${error.errorMsg} - ${error.permanent}",
    //   color: Colors.red,
    //   leadingType: 'negative',
    //   key: Key('error'),
    // );
  }

  void statusListener(String status) {
    if (status == 'notListening') {
      // _speech.cancel();

      setState(() {
        _isListening = false;

        if (_text == _defaultText || _text == _listeningText)
          _text = _defaultText;
      });
    }

    print(status);

    // ShowOverlay.snackbar(
    //   "$status",
    //   color: Colors.blue,
    //   leadingType: 'positive',
    //   key: Key('error'),
    // );
  }
}
sowens-csd commented 3 years ago

What device / OS are you using and which version of speech_to_text?

fenchai23 commented 3 years ago

OnePlus 7 Pro / Android 10 / As of this date the latest Version.

sowens-csd commented 3 years ago

How quickly are you speaking the '1', '2', '3'? I think you're likely running into the problem that on Android 10 stopping speech doesn't actually stop it. The Android team has it as a bug and is apparently working on it, but I haven't seen an update on the issue lately. I'm tracking it in #69, and the Android issue I'm watching is here: https://issuetracker.google.com/issues/158198432

It wouldn't hurt if you could add a like to that issue to give them more incentive to fix it.

fenchai23 commented 3 years ago

As I said previously, I tap on speech.listen() then say 1 then speech.stop() wait for a second or two then say 2... etc

It's just weird that the logs say final = true but the words keep parsing... I have to exit the page and come back in again for it to reset.

Maybe we could manually implement a restart?

About the bug report I am not seeing any like, only a star and pressing it says I'm no longer experiencing this issue so I have not starred.

sowens-csd commented 3 years ago

I'll continue to work on this issue but right now I don't have a workaround. I will explore whether it's possible to destroy the listener at a lower level than the stop method uses to see if I can find some way around the Android issue.

sowens-csd commented 3 years ago

If you have a chance, could you try the android_destroy branch? Our discussion yesterday gave me an idea, which is to destroy the speech recognizer intent instead of using the stopListening/cancel methods that don't appear to work. It would mean that the recognizer stops immediately. The difference between this and the Android stopListening method is that stopListening finishes processing outstanding speech and reports final results before terminating. Destroy terminates immediately, which means it won't finish processing. Because of a change I made for an iOS bug, where final results were sometimes not returned, the speech_to_text library will generate a final result if none appears after stop is called. That final will be the same as the last interim result returned from the OS. So it's not quite as good as having a working stop, but it would remove the delay. This version does both: it calls stopListening, delays for 50 ms, and then destroys the recognizer. On Android 9 that gives time for a final result; I'm not sure what it will do on 10.
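For readers following along, the sequence described above (stop, wait briefly, destroy, then synthesize a final result if the OS never sent one) can be sketched in plain Dart. This is only an illustration under stated assumptions: `FakeRecognizer`, `Result`, and `Session` are hypothetical names invented for this sketch and are not the plugin's actual classes or API. Only the 50 ms delay comes from the comment above.

```dart
import 'dart:async';

/// Hypothetical stand-in for the platform recognizer (not the plugin's API).
class FakeRecognizer {
  bool destroyed = false;

  /// On the affected Android version this call may have no effect.
  void stopListening() {}

  /// Tears the recognizer down immediately, without finishing processing.
  void destroy() => destroyed = true;
}

/// Hypothetical result type: recognized words plus a final/interim flag.
class Result {
  Result(this.words, this.isFinal);
  final String words;
  final bool isFinal;
}

/// Hypothetical session wrapper that applies the stop-then-destroy sequence.
class Session {
  Session(this.recognizer, this.onResult);

  final FakeRecognizer recognizer;
  final void Function(Result) onResult;
  Result? _last;
  bool _sawFinal = false;

  /// Called for every result the platform reports.
  void handlePlatformResult(Result r) {
    if (recognizer.destroyed) return; // ignore anything after teardown
    _last = r;
    if (r.isFinal) _sawFinal = true;
    onResult(r);
  }

  /// Stop politely, give the OS a short grace period, then destroy.
  Future<void> stop({
    Duration grace = const Duration(milliseconds: 50),
  }) async {
    recognizer.stopListening();
    await Future<void>.delayed(grace);
    recognizer.destroy();

    // If no final result arrived, promote the last interim result to final.
    final last = _last;
    if (!_sawFinal && last != null) {
      _sawFinal = true;
      onResult(Result(last.words, true));
    }
  }
}

Future<void> main() async {
  final results = <Result>[];
  final session = Session(FakeRecognizer(), results.add);
  session.handlePlatformResult(Result('una', false)); // interim only
  await session.stop();
  print('${results.last.words} final=${results.last.isFinal}');
}
```

Because the recognizer is destroyed after the grace period, any late results from the old session are dropped, which is the behaviour the "previous words" symptom in this issue needs.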

This version of the fix always destroys. A more complete implementation would probably either check the OS version or accept a flag on the initialize method so that it would not destroy the recognizer on OS versions that work.

fenchai23 commented 3 years ago

alright, sorry, how do I import this branch?

sowens-csd commented 3 years ago

Use a git reference in pubspec.yaml like so:

  speech_to_text:
    git: 
      url: https://github.com/csdcorp/speech_to_text.git
      ref: android_destroy
fenchai23 commented 3 years ago

alright I'll test this at once after work

fenchai23 commented 3 years ago

It seems it's giving me a compile error

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':speech_to_text:compileDebugKotlin'.
> Compilation error. See log for more details

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 32s
Launching lib\main.dart on GM1917 in debug mode...
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (37, 12): Redeclaration: SpeechToTextErrors
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (46, 12): Redeclaration: SpeechToTextCallbackMethods
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (53, 12): Redeclaration: SpeechToTextStatus
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (60, 12): Redeclaration: ListenMode
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (67, 11): Conflicting declarations: public const val pluginChannelName: String, public const val pluginChannelName: String
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (71, 14): Redeclaration: SpeechToTextPlugin
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (127, 44): Overload resolution ambiguity: 
public const val pluginChannelName: String defined in com.csdcorp.speech_to_text in file SpeechToTextPlugin.kt
public const val pluginChannelName: String defined in com.csdcorp.speech_to_text in file SpeechToTextPlugin.kt
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (563, 7): Redeclaration: LanguageDetailsChecker
e: C:\flutter\.pub-cache\git\speech_to_text-e2268385802386e2dba7bda89d784a777d6e970e\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (604, 15): Redeclaration: ChannelResultWrapper
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (37, 12): Redeclaration: SpeechToTextErrors
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (46, 12): Redeclaration: SpeechToTextCallbackMethods
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (53, 12): Redeclaration: SpeechToTextStatus
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (60, 12): Redeclaration: ListenMode
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (67, 11): Conflicting declarations: public const val pluginChannelName: String, public const val pluginChannelName: String
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (71, 14): Redeclaration: SpeechToTextPlugin
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (118, 26): Cannot access 'currentActivity': it is private in 'SpeechToTextPlugin'
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (120, 26): Cannot access 'onAttachedToEngine': it is private in 'SpeechToTextPlugin'
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (126, 44): Overload resolution ambiguity: 
public const val pluginChannelName: String defined in com.csdcorp.speech_to_text in file SpeechToTextPlugin.kt
public const val pluginChannelName: String defined in com.csdcorp.speech_to_text in file SpeechToTextPlugin.kt
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (155, 22): Cannot access 'ChannelResultWrapper': it is private in file
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (177, 32): Cannot access 'ChannelResultWrapper': it is private in file
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (186, 32): Cannot access 'ChannelResultWrapper': it is private in file
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (190, 20): Cannot access 'ChannelResultWrapper': it is private in file
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (538, 7): Redeclaration: LanguageDetailsChecker
e: C:\flutter\.pub-cache\hosted\pub.dartlang.org\speech_to_text-2.4.1\android\src\main\kotlin\com\csdcorp\speech_to_text\SpeechToTextPlugin.kt: (579, 15): Redeclaration: ChannelResultWrapper

Exception: Gradle task assembleDebug failed with exit code 1
sowens-csd commented 3 years ago

That happens when you switch from a release to a git reference. It's trying to include both copies of the package. Usually fine once you run a flutter clean in the project directory.

fenchai23 commented 3 years ago

ok I did test initially, I thought it was working well, but then after a few more tries it performs the same.

I'm supposed to use speech.cancel() right?

sowens-csd commented 3 years ago

Either cancel or stop should give the same result. I did notice in a quick look at the code that the cancel implementation has a mistake that stop doesn't. I'll fix that and try it out on an Android 10 device this weekend. I don't usually have one, but luckily there's one here this weekend.

sowens-csd commented 3 years ago

New version just committed on the same branch. There was a mistake in the previous version that meant it wasn't actually destroying the recognizer. Let me know if you have time to try it again.

fenchai23 commented 3 years ago

Nice, just tested it, it works! now the only thing not working for me are the listenFor and pauseFor options. But that is not as important as this fix. great job 👍

sowens-csd commented 3 years ago

The main branch has a version of this fix now. This version is SDK version sensitive, so the workaround is only triggered on SDK version 29. I'll publish it as 2.5.0 today or tomorrow, but if you had a chance to try it out before I publish, that would be helpful.
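As a rough illustration of what "SDK version sensitive" means here, the decision can be written as a small pure function. This is a sketch, not the plugin's code: `StopStrategy`, `chooseStopStrategy`, and the `alwaysDestroy` override are hypothetical names, with the override modelled on the flag idea floated earlier in this thread. The only facts taken as given are that the workaround targets SDK version 29, which is the API level of Android 10.

```dart
/// Android 10 reports API level 29; the workaround targets only this level.
const int affectedSdk = 29;

/// Hypothetical names for the two ways of ending a listen session.
enum StopStrategy { stopListening, stopThenDestroy }

/// Picks a strategy from the device SDK level.
/// [alwaysDestroy] is a hypothetical override, not a real plugin option.
StopStrategy chooseStopStrategy(int sdkInt, {bool? alwaysDestroy}) {
  if (alwaysDestroy != null) {
    return alwaysDestroy
        ? StopStrategy.stopThenDestroy
        : StopStrategy.stopListening;
  }
  return sdkInt == affectedSdk
      ? StopStrategy.stopThenDestroy
      : StopStrategy.stopListening;
}

void main() {
  for (final sdk in [28, 29, 30]) {
    print('SDK $sdk -> ${chooseStopStrategy(sdk).name}');
  }
}
```

Keeping the check as an exact match on 29 mirrors the comment above; whether later Android versions need the same treatment is not something this thread settles.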