React speech recognition does not work in chrome for mobile

ratnabh commented 3 years ago

Hi there, the code

import SpeechRecognition, {
  useSpeechRecognition
} from "react-speech-recognition";

SpeechRecognition.startListening({ continuous: true });

works very well in Chrome on macOS but doesn't work in Chrome on mobile devices. I have tried various devices, but it only listens the first time, and after I hit the API with my finalTranscript it automatically stops listening. Do I have to do something else on mobile devices to make it work?

Thanks

Montezi commented 3 years ago

Hi @ratnabh, did you manage to solve this problem? I have the same problem and still can't find a solution.

ratnabh commented 3 years ago

Hi @Montezi, unfortunately not. I have tried various devices, but it seems to be a problem with the library itself :(

ratnabh commented 3 years ago

@Montezi the core Web Speech API works perfectly fine for this purpose, so I suggest you try that. You can just call recognition.start() again to make it keep listening.
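
A minimal sketch of the restart approach described above, written against the standard Web Speech API (`onresult`, `onend`, `start()`, `stop()`). The names `keepListening` and `onFinal` are hypothetical, and the recognition object is passed in rather than created inside so the logic can also be exercised with a stub. Whether this gives a usable experience on Android Chrome is not guaranteed; it only shows the restart idea.

```javascript
// Restart recognition whenever the browser ends the session on its own.
// `recognition` is expected to be a SpeechRecognition instance, e.g.
//   new (window.SpeechRecognition || window.webkitSpeechRecognition)()
// `keepListening` and `onFinal` are illustrative names, not library API.
function keepListening(recognition, onFinal) {
  let stoppedByUser = false;

  recognition.onresult = (event) => {
    // Only results from resultIndex onwards are new in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) onFinal(result[0].transcript);
    }
  };

  recognition.onend = () => {
    // The browser may end the session after each utterance; start it
    // again unless the caller asked to stop.
    if (!stoppedByUser) recognition.start();
  };

  recognition.start();

  // Returns a function that stops listening for good.
  return () => {
    stoppedByUser = true;
    recognition.stop();
  };
}
```

The returned function is meant to be called when the user is done, for example from a "stop" button or a component cleanup.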

JamesBrill commented 3 years ago

Hi @ratnabh Thanks for raising this issue. Can you provide some details to help me reproduce this?

Version of the library used
Version of Chrome
iOS or Android?
The full code of your React component so I can see how you're using the library

My initial thoughts:

I'm assuming this isn't on iOS as, last time I checked, the Web Speech API isn't supported on any browser in iOS.
There used to be a bug in Google's implementation of the Web Speech API on Android. This library has a workaround for that. It's possible that in newer versions of Chrome, that bug has been fixed and the workaround is now causing this issue.
If the core API works but the library doesn't, a regression may have been introduced to the library or Chrome on mobile has changed (as web library authors, we are constantly playing catch-up with browser vendors). Either way, I'm keen to reproduce and address this.

ratnabh commented 3 years ago

Hi @JamesBrill, thanks for responding.

Version of library used: 3.7.0
Version of Chrome: 89.0.4389.105
Android 10

Here is the link to my GitHub repo: https://github.com/ratnabh/React-Redux-with-new-hooks-api-multireducer-

To run the project

npm install
npm start

Then, to run this project on mobile, you can use your system IP address followed by your port, like http://192.xxx.x.222:3000

To run this in Chrome on mobile you may need to follow the steps below, as Chrome doesn't allow microphone access over HTTP

Navigate to chrome://flags/#unsafely-treat-insecure-origin-as-secure in Chrome
Find the "Insecure origins treated as secure" flag
In the textarea, enter http://192.xxx.x.xxx:3000, then enable the flag and relaunch Chrome

Now, on pressing the button, you will see the mic turn on. Try saying something, and a mock API is called; once the response comes back from the server, the code to turn on the mic runs again. The issue comes up when you wait for 1-2 seconds: the mic keeps turning off and on again and again. I need to capture the next transcript spoken by the user (and will close the mic after 5 seconds of inactivity).
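
For reference, the "close the mic after 5 seconds of inactivity" requirement could be sketched roughly as follows. `createInactivityTimer` is a hypothetical helper, not part of react-speech-recognition; the assumption is that `touch` gets called whenever the transcript changes and that `onIdle` stops the microphone.

```javascript
// Calls `onIdle` once no activity has been reported for `timeoutMs`.
// `createInactivityTimer` is an illustrative helper, not library API.
function createInactivityTimer(onIdle, timeoutMs = 5000) {
  let handle = null;

  // Cancel any pending idle callback.
  const cancel = () => {
    if (handle !== null) clearTimeout(handle);
    handle = null;
  };

  // Report activity: restart the countdown from zero.
  const touch = () => {
    cancel();
    handle = setTimeout(() => {
      handle = null;
      onIdle();
    }, timeoutMs);
  };

  return { touch, cancel };
}
```

In a component this might be wired up as `createInactivityTimer(() => SpeechRecognition.stopListening())`, with `touch()` called from an effect that depends on `transcript` and `cancel()` called on unmount.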

Thanks for your time !

JamesBrill commented 3 years ago

I was able to reproduce some weird behaviour on Android Chrome with a simple dictaphone component that enabled continuous listening. On desktop, it behaved as expected. But on Android Chrome, I observed the following:

The microphone turns off when you stop speaking. This is the behaviour of continuous: false ("discontinuous listening")
The microphone automatically restarts after it turns off. This is react-speech-recognition trying to keep the microphone on when something other than the user tries to stop the microphone during continuous listening

In other words, the continuous setting doesn't seem to work any more on Android Chrome, which is disappointing. The constant restarting (and probably the annoying beeping noise that Android makes when it turns the microphone on) that you're seeing is the browser misbehaving and the library trying to compensate for it.

I probably added the logic for automatically restarting in that case to compensate for weird browsers like Android Chrome. And indeed, it does work to some degree - you get a continuous-ish experience. However, I don't remember the delay between restarts being so severe - previously, they were so short they were invisible whenever I tested on Android Chrome. Now Android's delay between restarts is long enough that speech can be missed completely. It's hard to tell if this is due to a change in the browser or in Android itself (the OS ultimately controls the microphone). Others seem to have encountered this issue recently.

You said that the vanilla Speech Recognition API (i.e. the API without this library) works for your purpose. Do you mean that you were able to get continuous listening working on Android Chrome? I tried using the raw API myself in this way and could not get this behaviour on Android Chrome. However, with continuous turned off, I was able to get react-speech-recognition to behave sensibly, despite the annoying beep every time you tap.

So I'm pretty sure there's a limitation in Android that we'll have to work around here. I'll be able to help you find alternatives more efficiently if I understand your use case better - a brief specification of what user experience you're trying to create will be useful. Some thoughts on what you could do:

I'll first describe a classic "push to talk" button. When the user presses the button, you want to start listening with a fresh transcript. When the user finishes speaking, you want to do something with what they just said. This is the most common use case for the Speech Recognition API and react-speech-recognition.
It looks like you're doing something really similar. When the user says something, you turn off the microphone until you get a response from your API. And then you start it up again. I'm not sure if you intended this, but the first time you listen it's "discontinuous" and subsequent times it's continuous listening.
Does it need to use continuous listening? This is generally only needed if you want the user to create a long piece of text - e.g. write notes using voice. Discontinuous listening is much more common and follows a "command" pattern where the user says a short phrase, which activates something. In that mode, the microphone will automatically get turned off after the user finishes speaking, and the transcript reset. If you can live without continuous listening, this will avoid the weird behaviour on Android Chrome.
If you do need continuous listening and you want it to work on Android Chrome, then you do have the option of using a polyfill, which should work on all browsers and platforms. There's only one available for now (Microsoft Azure), but you can read up on that here.

If Android Chrome really does not support continuous listening in a usable way, I will update the README to warn about this. Perhaps I can provide a better fallback behaviour or expose a supportsContinuousListening property.

P.S. A tip for testing your locally-hosted web app on mobile and with HTTPS. Use ngrok. It allows you to proxy localhost to a public URL with HTTPS. You have to sign up, but it's free. You can also use this to share your local build with people on other networks. I've been using this to test local builds on mobile and share with stakeholders for years.

ratnabh commented 3 years ago

@JamesBrill thanks for your response and for letting me know about ngrok. And yes, without 'continuous: true' it works fine on Android Chrome.

JamesBrill commented 3 years ago

Cool, I hope you're unblocked now. I've made a new release that gives you the option of disabling continuous listening on browsers that don't support it. See v3.8.0.
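
A rough sketch of how such a fallback might look in application code. It assumes the hook exposes a boolean named `browserSupportsContinuousListening`; that name is not stated in this thread, so check it against the v3.8.0 release notes. `chooseListeningOptions` is a hypothetical helper.

```javascript
// Pick startListening options based on whether the browser can listen
// continuously. `chooseListeningOptions` is an illustrative helper; the
// boolean is assumed to come from useSpeechRecognition().
function chooseListeningOptions(supportsContinuous) {
  // Fall back to one-utterance-at-a-time listening on browsers such as
  // Android Chrome, where continuous listening keeps restarting the mic.
  return { continuous: Boolean(supportsContinuous) };
}

// Inside a component (assumed property name, see the note above):
//   const { browserSupportsContinuousListening } = useSpeechRecognition()
//   SpeechRecognition.startListening(
//     chooseListeningOptions(browserSupportsContinuousListening)
//   )
```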

ratnabh commented 3 years ago

Thanks @JamesBrill. I also wanted to ask: does resetTranscript not work properly in a useEffect cleanup? I was trying to show the transcript to the user, but when the component unmounts the transcript doesn't get cleared, so the next time the user mounts that component the previous transcript is still shown. :(

useEffect(() => {
  if (SpeechRecognition.browserSupportsSpeechRecognition()) {
    setIsListening(true);
    SpeechRecognition.startListening({ continuous: true });
  } else {
    setIsListening(false);
  }
  // Cleanup on unmount: clear the transcript and stop the microphone.
  return () => {
    resetTranscript();
    SpeechRecognition.abortListening();
  };
}, []);

JamesBrill commented 3 years ago

@ratnabh I think the issue is actually in abortListening - I have a feeling there's a bug in that method that causes old transcripts to hang around. Your solution might be as simple as using stopListening instead. Let me know if that works for you.

Normally, the transcript should reset by itself when remounting a component that uses useSpeechRecognition. Here is a similar example component that will render an empty transcript again when unmounted and remounted (the effect isn't even needed here - I'm just using that to mirror your example):

import React, { useEffect } from 'react'
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const { transcript, browserSupportsSpeechRecognition } = useSpeechRecognition()

  useEffect(() => {
    SpeechRecognition.startListening({ continuous: true })
    return SpeechRecognition.stopListening
  }, [])

  if (!browserSupportsSpeechRecognition) {
    return <span>No browser support</span>
  }

  return <span>{transcript}</span>
}

export default Dictaphone

Calling resetTranscript on cleanup should be unnecessary. As mentioned above, I think you just need to use stopListening instead of abortListening. I'll raise a separate issue about abortListening.

A couple of unrelated points:

You shouldn't need to manage your own listening state - this is provided by useSpeechRecognition
I see you are starting the microphone on mount. Make sure that you are triggering this in response to a user action such as a button click. Some browsers, especially when a polyfill is in use, have restrictions that prevent you from starting the microphone without the user doing anything (the user will see an error in that case). So it's generally good practice to tie startListening (or the mounting of this component in your case) to a button click
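
Both points can be combined in a small sketch: the microphone is only toggled from a click handler, and the `listening` flag from useSpeechRecognition replaces separate component state. `toggleListening` is a hypothetical helper, and the library object is passed in as `speech` so the logic can be tried without a browser.

```javascript
// Start or stop the microphone from a click handler, using the `listening`
// flag from useSpeechRecognition() instead of separate component state.
// `toggleListening` is an illustrative helper; `speech` is expected to be
// the library's default export (SpeechRecognition).
function toggleListening(speech, listening) {
  return listening ? speech.stopListening() : speech.startListening();
}

// Inside a component:
//   const { transcript, listening } = useSpeechRecognition()
//   <button onClick={() => toggleListening(SpeechRecognition, listening)}>
//     {listening ? 'Stop' : 'Start'}
//   </button>
```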

JamesBrill commented 2 years ago

Closed due to inactivity.

JamesBrill / react-speech-recognition

React speech recognition does not work in chrome for mobile #89