brave / brave-browser

Brave browser for Android, iOS, Linux, macOS, Windows.
https://brave.com
Mozilla Public License 2.0

Unable to use microphone (affects Duolingo, Google Translate, and other sites) #3725

Open MGRussell opened 5 years ago

MGRussell commented 5 years ago

Latest status (this comment edited by @bsclifton)

This feature relies on a Speech API key that we would need to create and pay for on a per-use basis. As captured below, @tomlowenthal has looked at this.

At the moment, this is on hold. Some other features also require this speech-to-text API, but we haven't taken any action because creating and installing a token has a cost associated with it.

Original issue details by @MGRussell

This issue was previously opened in the browser-laptop repo here and was archived without resolution with numerous OS/versions chiming in. I am reopening it here, and reporting with Windows 7 with Brave Version 0.61.51 Chromium: 73.0.3683.75 (Official Build) (64-bit). This may be a related issue which mentions that Webspeech API is disabled within Brave.

Description

Brave does not work properly with Duolingo. The site asks for permission to use the microphone and everything appears to work, but no input ever seems to be sent from Brave to the site. The site officially supports only Chrome, so this would need to be remedied on Brave's end if the browser is to work with Duolingo.

Steps to Reproduce

The browser fails on every Duolingo test that asks the user to send voice data, 100% of the time.

Website problems only:

Disabling Brave Shields does not resolve the issue. The issue is not present on the latest version of Chrome.

honza-zidek commented 5 years ago

Hey guys, for me it's a show stopper for Brave! Please do something with it - the issue was first reported on 03.10.2016 in #4476, so 2.5 years ago!

rebron commented 5 years ago

@honza-zidek Feedback noted.

Brave-Matt commented 5 years ago

+1 -- a couple users reporting on community: https://community.brave.com/t/twitter-videos-duolingo-audio/61126/6

estebanhst commented 5 years ago

I have the same problem with Brave Version 0.64.77 Chromium: 74.0.3729.169 (Official Build) (64-bit) on Mac OS Mojave 10.14.4

Please, as some users have said, this is an issue that has persisted for years. There is no problem with the website while using Chrome.

gnarfle commented 5 years ago

I've also noticed this. On mac weirdly when I start a question on duolingo it instantly fails. It's not like it doesn't get the audio, it just somehow immediately marks it as wrong before I can even speak. Works fine in chrome.

bsclifton commented 5 years ago

Trying to reproduce here - I've created an account (bsclifton if anyone wants to add me) and am working through Spanish exercises 😄 Still haven't hit a microphone one, but when I do I have some thoughts on what to try.

bsclifton commented 5 years ago

I'm on macOS Mojave 10.14.5 and was shown this the first time a microphone prompt came up (after the web notification was shown)

Screen Shot 2019-05-29 at 11 08 25 PM

After clicking OK and then clicking the microphone icon as reported here, it fails instantly:

Screen Shot 2019-05-29 at 11 11 10 PM

honza-zidek commented 5 years ago

It's good that you reproduced it, several years after the error was first reported (the microphone not working in Brave was first reported on 03.10.2016 in #4476). Now there might be a chance someone will address the issue :) (Well, I admit that is sarcasm... but I somehow cannot understand how such a bug could have been ignored by the Brave team for such a long time).

bsclifton commented 5 years ago

@honza-zidek I'm sorry about that - there are many issues we personally find important and would like to fix (like this one), but it's all a matter of relative prioritization and having enough time ☹️

After reproducing, I tried toggling a few flags on brave://flags, but didn't notice any difference. @honza-zidek can you confirm: you're supposed to press (and hold) the microphone button and then let go after you finish saying the word?

That is how I tried it and it did not work. I verified that the site has proper permissions in settings.

Screen Shot 2019-05-29 at 11 29 13 PM

Just today, we accepted a pull request by @jumde to disable field trials (we were using a test config, which is less than ideal) https://github.com/brave/brave-browser/pull/4551. Our next Nightly build will include that patch and I'm interested if that has any impact. Will try that and report back

honza-zidek commented 5 years ago

you're supposed to press (and hold) the microphone button and then let go after you finish saying the word?

You are supposed to press the microphone, then release it, then say the phrase. If I remember correctly, I think the way you describe works in Mondly. You may also try creating an account in Mondly or Memrise or other language learning app - they also have some speaking exercises.

The easiest way to try it is using Chrome for Duolingo.

However, I think that it is a more general issue. As I understand from #4476, the microphone has never worked in Brave, has it? It does not seem to be a Duolingo-specific issue.

bsclifton commented 5 years ago

@honza-zidek good call - just tried and verified in Chrome... microphone

Clicking once is supposed to enable it - you say the words, then push stop when done.

Microphone definitely works (in Brave) in other situations. For example, Google Hangouts worked even in the old Electron-fork browser. Will report back any other findings 😄

bsclifton commented 5 years ago

Confirmed that with the new Nightly (0.68.2), this still does not work ☹️ Will need some more investigation

simonhong commented 4 years ago

The test script below throws a network error. I think this is related to the Google speech API.

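// Minimal repro: start the prefixed speech recognizer and log any error.
var recognizer = new webkitSpeechRecognition();
recognizer.lang = "en-US";
recognizer.onerror = function(event) {
  // In Brave this logs "network".
  console.log(event.error);
};
recognizer.start();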
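
For anyone who wants a more readable result than the bare error code, here is a slightly longer sketch of the same test. It assumes only the standard error event shape (`event.error`). `explainSpeechError` and `startDiagnostic` are hypothetical helper names, and the descriptions in the table are informal summaries in my own words, not text from the spec or from Brave.

```javascript
// Informal descriptions of Web Speech API error codes (my wording, not the spec's).
const SPEECH_ERROR_HINTS = new Map([
  ["network", "the browser could not reach its speech recognition service"],
  ["not-allowed", "microphone or speech permission was denied"],
  ["service-not-allowed", "the browser refuses to offer a recognition service"],
  ["audio-capture", "no microphone audio could be captured"],
  ["no-speech", "no speech was detected"],
  ["aborted", "recognition was aborted"],
]);

// Hypothetical helper: turn an error code into a readable line.
function explainSpeechError(code) {
  const hint = SPEECH_ERROR_HINTS.get(code) || "unrecognized error code";
  return `${code}: ${hint}`;
}

// Hypothetical helper: start recognition and report errors through `log`.
// `scope` is the global object (window in a browser). Returns null when
// neither constructor is exposed.
function startDiagnostic(scope, log) {
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) {
    log("SpeechRecognition is not exposed by this browser");
    return null;
  }
  const recognizer = new Recognition();
  recognizer.lang = "en-US";
  recognizer.onerror = (event) => log(explainSpeechError(event.error));
  recognizer.start();
  return recognizer;
}
```

In a browser console this would be called as `startDiagnostic(window, console.log)`.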

simonhong commented 4 years ago

Recording in Duolingo and voice search on google.com give the same log, shown below. Both abort with error SpeechRecognitionErrorCode::kNetwork

[25606:40711:0626/155300.097507:VERBOSE1:speech_recognizer_impl.cc(542)] Device parameters: format: 1, channel_layout: 2, channels: 1, sample_rate: 48000, frames_per_buffer: 128, effects: 64, mic_positions: , hw_cap.min_frames_per_buffer: 128, hw_cap.max_frames_per_buffer: 4096
[25606:40711:0626/155300.097612:VERBOSE1:speech_recognizer_impl.cc(566)] SpeechRecognizerImpl starting audio capture.
[25606:40711:0626/155300.097642:VERBOSE1:speech_recognizer_impl.cc(586)] SRI::output_parameters: format: 1, channel_layout: 2, channels: 1, sample_rate: 16000, frames_per_buffer: 1600, effects: 0, mic_positions:
[25606:40711:0626/155300.097676:VERBOSE1:speech_recognizer_impl.cc(615)] SRI::input_parameters: format: 1, channel_layout: 2, channels: 1, sample_rate: 48000, frames_per_buffer: 4800, effects: 64, mic_positions: , hw_cap.min_frames_per_buffer: 128, hw_cap.max_frames_per_buffer: 4096
[25606:40711:0626/155300.548512:VERBOSE1:speech_recognition_engine.cc(380)] Downstream complete success: 0 response_code: 403
[25606:40711:0626/155300.548606:VERBOSE1:speech_recognition_engine.cc(841)] Aborting with error SpeechRecognitionErrorCode::kNetwork
[25606:40711:0626/155300.550244:VERBOSE1:speech_recognizer_impl.cc(835)] SpeechRecognizerImpl closing audio capturer source.
[25606:40711:0626/155300.551116:VERBOSE1:speech_recognizer_impl.cc(729)] SpeechRecognizerImpl canceling recognition.

SpeechRecognitionEngine interacts with a Google service, https://www.google.com/speech-api/full-duplex/v1, for speech recognition. See https://cs.chromium.org/chromium/src/content/browser/speech/speech_recognition_engine.cc?sq=package:chromium&dr=C&g=0&l=34

cc @rebron I think fixing this issue requires access to Google's speech recognition service.
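
The two lines that matter in that log are the `response_code: 403` (HTTP Forbidden) on the downstream request and the `kNetwork` abort that follows it. Below is a small sketch for pulling those two fields out of a verbose log. `summarizeSpeechLog` is a hypothetical helper, and the regular expressions assume the message wording seen in the log above rather than any documented Chromium log format.

```javascript
// Hypothetical helper: scan Chromium VERBOSE1 speech log lines and extract
// the HTTP status of the downstream request and the final error code.
// The patterns assume the message wording seen in the log above.
function summarizeSpeechLog(text) {
  const summary = { responseCode: null, errorCode: null };
  for (const line of text.split("\n")) {
    const status = line.match(/response_code: (\d+)/);
    if (status) summary.responseCode = Number(status[1]);
    const error = line.match(/SpeechRecognitionErrorCode::(\w+)/);
    if (error) summary.errorCode = error[1];
  }
  return summary;
}

// Two lines copied from the log above.
const sample = [
  "[25606:40711:0626/155300.548512:VERBOSE1:speech_recognition_engine.cc(380)] Downstream complete success: 0 response_code: 403",
  "[25606:40711:0626/155300.548606:VERBOSE1:speech_recognition_engine.cc(841)] Aborting with error SpeechRecognitionErrorCode::kNetwork",
].join("\n");

console.log(summarizeSpeechLog(sample));
```

For the sample lines this reports a response code of 403 and an error code of `kNetwork`, which is the pairing the log shows.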

bsclifton commented 4 years ago

Per discussion with @simonhong, this is also failing in Chromium

To use this, you need to request keys: https://www.chromium.org/developers/how-tos/api-keys

And we'd want to proxy the call on our side, of course

rebron commented 4 years ago

cc: @tomlowenthal for api-key

simonhong commented 4 years ago

@jumde Whenever a user tries to use a voice recognition service, such as by clicking the mic button on google.com, Brave will send a request to a Google service - https://cs.chromium.org/chromium/src/content/browser/speech/speech_recognition_engine.cc?sq=package:chromium&dr=C&g=0&l=34

tomocrafter commented 4 years ago

same problem here, I cannot use microphone with Duolingo, Google Translate on Brave Browser.

tildelowengrimm commented 4 years ago

This Google Speech API topic spans multiple issues on this repo. I'm working on it in brave/internal#608. When I resolve things there, I'll come back here with the next technical steps.

Viking8 commented 4 years ago

FYI, this is still not working as of Brave version 0.67.124 Chromium: 76.0.3809.100 (Official Build) (64-bit)

GrangeBeach commented 4 years ago

this feature is the only reason i am still using chrome. are we sure it will become available in brave ?

adnanmakda commented 4 years ago

Even Google Docs voice recognition is not working

tildelowengrimm commented 4 years ago

@GrangeBeach I am not sure that this will be available in Brave.

hexcowboy commented 4 years ago

Still have this issue

Brave 0.69.135 Chromium 77.0.3865.120

ryanbr commented 4 years ago

Could be related to issues on https://speechnotes.co/ (reported here; https://community.brave.com/t/very-famous-site-not-work-in-brave-but-in-chrome-work/92765)

jeffguillaume commented 4 years ago

FYI, still having this issue on Brave 1.0.0 Chromium: 78.0.3904.97 (Official Build) (64-bit).

I can't use Brave as my browser until Hangouts works; it's my primary communication method for all calls & texts.

(I refrained from downloading & evaluating Brave until your 1.0 release in the hopes that these kinds of things would already be sorted out. How long until this is working? I can't use Brave until then.)

jumde commented 4 years ago

@jeffguillaume - This issue is about Duolingo. Hangout works for me in 1.0.0 Chromium: 78.0.3904.97 (Official Build) (64-bit) on MacOS. Are you using a different OS?

jeffguillaume commented 4 years ago

This issue is about Duolingo. Hangout works for me in 1.0.0 Chromium: 78.0.3904.97 (Official Build) (64-bit) on MacOS. Are you using a different OS?

@jumde Yes, Windows here (10 Pro / Version 1909 / OS Build 18363.476).

But this isn't restricted to Duolingo, per @tomlowenthal :

This Google Speech API topic spans multiple issues on this repo. I'm working on it in brave/internal#608. When I resolve things there, I'll come back here with the next technical steps.

That was almost 6 months ago.

honza-zidek commented 4 years ago

FYI, still having this issue on Brave 1.0.0 Chromium: 78.0.3904.97 (Official Build) (64-bit).

I can't use Brave as my browser until Hangouts works; it's my primary communication method for all calls & texts.

(I refrained from downloading & evaluating Brave until your 1.0 release in the hopes that these kinds of things would already be sorted out. How long until this is working? I can't use Brave until then.)

Actually, I don't understand how they could have increased the version number to 1.0.0 for a browser in which the microphone does not work... Hey, Brave guys, we are in 2019! The microphone in a browser is not a nice-to-have feature anymore.

bsclifton commented 4 years ago

+1 from https://github.com/brave/brave-browser/issues/4763

Description

Cannot turn on voice input in Google Translate.

Steps to Reproduce

  1. https://translate.google.com
  2. select an input language
  3. click the mic button

Actual result:

The mic button turns off a second later.

Expected result:

The mic button stays enabled and starts recording.

bsclifton commented 4 years ago

+1 from https://github.com/brave/brave-browser/issues/2690

Description

Any time you attempt to use voice search on Brave desktop, it displays a "no internet connection" message. This is after allowing access to your machine's microphone for both the site and Brave itself.

Steps to Reproduce

  1. Go to https://google.com/
  2. Click the microphone icon
  3. Attempt to search by voice.

Actual result:

"No internet connection" is displayed.

Expected result:

Successful voice search

jeffguillaume commented 4 years ago

I presume this is a very high priority issue as it spans multiple highly trafficked sites, and you just released Brave as production ready (1.0.0).

What's the effort currently going into this?

This internal Github issue is a black hole for us normies, so the only progress updates we can glean are from here. Thanks for working on a privacy-focused browser! Can't wait to be able to make the switch when this issue is resolved.

tildelowengrimm commented 4 years ago

Hi folks, sorry for the lack of update here. It's a bit of a gnarly issue, and it gets me hopping mad.

First of all: this isn't a problem with the microphone - it's a speech recognition API. Chrome ships with a non-standard API used for speech recognition. Websites which call the API are asking the browser to transcribe audio on behalf of the website and send the site the transcribed text (not the audio). When a site calls this API in Chrome, Chrome sends the raw audio to a Google server for transcription. The Google server parses the raw audio, and sends the transcribed text back to Chrome. Chrome then passes the text to the website.

There are two problems with this. The simple straightforward problem is that Brave doesn't have access to that Google transcription service. It's a paid service from Google which Chrome gets to use for free. If Brave wanted to use it, we'd have to pay Google for the privilege. The second and much more substantial issue is that I don't think that anyone reasonably expects clicking a microphone icon on Duolingo to result in Brave sending their audio to Google.

Honestly, I think that this design in Chrome is absolutely ridiculous and I was completely flabbergasted when I learned how it worked. We've had some conversations with Google about it, and the outcome is more or less (1) that they don't see what the issue is, and (2) they won't give us access to this online transcription service unless we pay for it.

So the upshot here is that Google has built speech recognition on Google sites like Google.com and Google Translate so that they depend on a non-standard API in Google's web browser which actually just calls back to a Google transcription service. You may be wondering why Google.com and Google Translate don't just take the audio and use that Google transcription service behind the scenes rather than doing this elaborate approach involving the browser. Reader, I have exactly the same question. But the outcome is that speech recognition on these Google sites only works in Google's browser, which might be all the answer you need.

Unfortunately, I don't think there's any way forward here. Our current plan is to disable things so that the microphone icon doesn't show up. Obviously, that's not a great solution: it's more like hiding the problem. But I don't know what else to do. And this is all Google's fault.
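
To make the round trip described above concrete (page calls the API, the browser ships audio to a remote service, only text or an error comes back to the page), here is a toy model of it. Nothing in it is Chromium or Brave code: `fakeRemoteService`, `ToyBrowserRecognizer`, `runPage`, and the `"accepted-key"` value are all hypothetical stand-ins. Mapping a non-200 reply to a `network` error mirrors the 403 followed by `kNetwork` seen in the log earlier in this thread.

```javascript
// Toy model of the flow: page -> browser API -> remote service -> page.
// All names here are hypothetical; this is not Chromium code.

// Stand-in for the remote transcription service. It only answers callers
// that present an accepted key, and otherwise replies 403.
function fakeRemoteService(request) {
  if (request.apiKey !== "accepted-key") {
    return { status: 403, transcript: null };
  }
  return { status: 200, transcript: "hola" };
}

// Stand-in for the browser side of the speech API. The page hands over
// control of the audio and only ever receives text or an error code.
class ToyBrowserRecognizer {
  constructor(apiKey, service) {
    this.apiKey = apiKey;
    this.service = service;
    this.onresult = null;
    this.onerror = null;
  }

  // `audio` represents captured microphone samples.
  recognize(audio) {
    const response = this.service({ apiKey: this.apiKey, audio });
    if (response.status !== 200) {
      if (this.onerror) this.onerror({ error: "network" });
      return;
    }
    if (this.onresult) this.onresult({ transcript: response.transcript });
  }
}

// Hypothetical page code: it only wires up callbacks and starts recognition.
function runPage(recognizer) {
  const events = [];
  recognizer.onresult = (e) => events.push(`result: ${e.transcript}`);
  recognizer.onerror = (e) => events.push(`error: ${e.error}`);
  recognizer.recognize(new Float32Array(1600));
  return events;
}

console.log(runPage(new ToyBrowserRecognizer("accepted-key", fakeRemoteService)));
console.log(runPage(new ToyBrowserRecognizer(null, fakeRemoteService)));
```

The first call models a browser with access to the service and yields a result event. The second models a browser without access, and the page sees only a `network` error, which is why the failure looks like a broken microphone from the user's side.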

honza-zidek commented 4 years ago

@tomlowenthal Thanks for the explanation, now the issue is more clear, more understandable and more pardonable :)

Following your explanation, I checked the behaviour of Duolingo in other browsers and found a stunning fact I had not been aware of, not having used these browsers for Duolingo: Duolingo does not offer the voice exercises in them, and does not even show the "Microphone" option in the Profile configuration at all:

Duolingo Profile Configuration

This is probably done on the server side in Duolingo as I didn't find any traces of the <label for="enableMicrophone">Micrófono</label> in the page source code when displayed in Edge or in Firefox.
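
Whether Duolingo decides this on the server or in the page is not established here. For comparison, a purely client-side version of the same idea would be a feature check like the sketch below. `hasSpeechRecognition` and `buildSettings` are hypothetical names, and the settings list is made up for illustration.

```javascript
// Hypothetical check a page could run before rendering microphone options.
// `scope` is the global object (window in a browser).
function hasSpeechRecognition(scope) {
  return typeof scope.SpeechRecognition === "function" ||
         typeof scope.webkitSpeechRecognition === "function";
}

// Hypothetical settings builder: include the microphone toggle only when
// a recognition constructor exists. The other entries are placeholders.
function buildSettings(scope) {
  const settings = ["sound effects", "speaker"];
  if (hasSpeechRecognition(scope)) settings.push("microphone");
  return settings;
}
```

Note that a check like this would probably not catch the Brave case on its own: going by the test script earlier in the thread, the constructor appears to exist in Brave and the failure only surfaces at runtime as a `network` error.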

And I agree with you that

I don't think that anyone reasonably expects clicking a microphone icon on Duolingo to result in Brave sending their audio to Google

I simply do not know what to say and how to react :(

honza-zidek commented 4 years ago

@tomlowenthal Maybe one idea: talk to the Duolingo team about the problem:

honza-zidek commented 4 years ago

@tomlowenthal And one more thing: although now I fully understand the depth of the problem, still I cannot agree with

this isn't a problem with the microphone

because from the user's point of view it simply is a problem with the microphone. For me (and probably for some other users, too) it's a showstopper that keeps me from making Brave my default browser. But at least now I will not blame you for ignoring the problem, as I had suspected before you explained it.

jeffguillaume commented 4 years ago

@tomlowenthal Thanks for your in-depth response, it's greatly appreciated.

& I agree with you, I wouldn't expect nor want Brave to send any data to Google, Inc. That's the reason we're here — privacy & transparency.

This seems like a no-win scenario, at least for Google Hangouts/Voice users like me. We're beholden to Big G if we want to use those services (I do). Nothing is "free" and my apparent cost for using it is that it requires Google® Chrome. (Firefox is out since the Hangouts extension is only for Chrome, hence why I was excited about Brave in the first place.)

Voice Search, and other transcription-based services in general, are going to increase in use, though. Brave should figure out a way to support it if you want to be competitive.

ethanbb commented 4 years ago

This makes sense, but I still have a question: why does speech recognition work in Brave for Android? For example, this page works on Android but not on Windows. Is this because the same API call redirects to the native (Google) phone speech recognition functionality in the Android version, which all apps have access to unlike on desktop?

The MDN page about this feature says, "Generally, the default speech recognition system available on the device will be used for the speech recognition — most modern OSes have a speech recognition system for issuing voice commands. Think about Dictation on macOS, Siri on iOS, Cortana on Windows 10, Android Speech, etc." Is this a potential way forward - for example, for the Windows version, could we utilize the Cortana API? Seems like this might be a starting point: https://docs.microsoft.com/en-us/windows/uwp/design/input/speech-interactions

clay53 commented 4 years ago

Could Brave redirect to some other speech recognition API or do client-side speech recognition? I'm not sure how this would work but I can't find a reason why it wouldn't as long as you can get a redirect in. If it requires sending audio data to some other "free" service, it could be made a separate microphone permission.

Websites which call the API are asking the browser to transcribe audio on behalf of the website and send the site the transcribed text (not the audio). When a site calls this API in Chrome, Chrome sends the raw audio to a Google server for transcription. The Google server parses the raw audio, and sends the transcribed text back to Chrome. Chrome then passes the text to the website.

From this description, it seems that all Brave really needs to do is take an API call and respond with the transcribed text. I don't know how fully-fledged other speech recognition services are but I'd be much happier with partial support than no support at all. I simply cannot go back to Chrome (especially on my Chromebook, ironically, Chrome just runs too slowly when using Linux).
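
Roughly what I have in mind, as a sketch only: a recognizer whose transcription step is a pluggable function, so the engine behind it could be swapped. `PluggableRecognizer` and its `transcribe` method are hypothetical names, the backend is assumed to be an async function from audio to text, and reporting every backend failure as `network` is a simplification. Real microphone capture and the permission UI are left out entirely.

```javascript
// Sketch of a recognizer with a pluggable transcription backend.
// `backend` is assumed to be an async function (audio, lang) -> string.
// Which engine sits behind it (local model, OS API, other service) is open.
class PluggableRecognizer {
  constructor(backend) {
    this.backend = backend;
    this.lang = "en-US";
    this.onresult = null;
    this.onerror = null;
  }

  // Takes already-captured audio; real capture is out of scope here.
  async transcribe(audio) {
    let transcript;
    try {
      transcript = await this.backend(audio, this.lang);
    } catch (err) {
      // Simplification: every backend failure is reported as "network".
      if (this.onerror) this.onerror({ error: "network", message: String(err) });
      return;
    }
    // Shape loosely follows the usual results[0][0].transcript access pattern.
    if (this.onresult) this.onresult({ results: [[{ transcript }]] });
  }
}

// Usage with a stand-in backend that always returns the same text.
const demo = new PluggableRecognizer(async () => "bonjour");
demo.onresult = (event) => console.log(event.results[0][0].transcript);
demo.transcribe(new Float32Array(1600));
```

The page-facing side stays the same whichever backend is plugged in, which is the part that seems tractable. The backend itself is the hard and possibly costly part.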

tildelowengrimm commented 4 years ago

Turning audio into transcribed text isn't a trivial thing to do. Most services of this sort are priced based on usage, and building the UI to indicate what's happening with your audio isn't trivial either.

ethanbb commented 4 years ago

@tomlowenthal What about the Windows Speech Recognition API? It doesn't say anything about pricing, I'm pretty sure it can be called from any desktop program.

It uses a grammar format called SRGS instead of the JSpeech Grammar Format (JSGF) that the Web Speech API uses, but it looks like one is probably convertible to the other.

ethanbb commented 4 years ago

Although to follow up, to keep it really FOSS, it would be really great if there were a nonprofit service offering a cross-platform speech recognition API with fair prices and a good privacy policy.

hexcowboy commented 4 years ago

What if Brave caught audio going to Google Speech Recognition and handled it in the browser with something like https://github.com/TalAter/annyang? Realistically the sites having these issues should be using FOSS speech recognition in the first place.

ethanbb commented 4 years ago

@jackno I don't think that particular library will work. In the FAQ (docs/FAQ.md) it says:

annyang works with all browsers that implement the Speech Recognition interface of the Web Speech API (such as Google Chrome, and Samsung Internet).

which is the issue here to begin with.

bsclifton commented 4 years ago

@jackno annyang uses the Google Speech API unfortunately (which is what we need to license): https://cloud.google.com/speech-to-text/

The software itself (Chromium and Brave) is open source and cost free- but service implementations are black boxed and have a cost associated ☹️

Brave-Matt commented 4 years ago

+1 from Community for Babbel.com: https://community.brave.com/t/my-microphone-doesnt-work-with-brave/115876/14

ethanbb commented 4 years ago

I am going to try to create an extension that relays Web Speech requests to Windows's Speech Recognition API. I believe it should be possible. I'll update this thread with any progress.

ethanbb commented 4 years ago

To update, apparently Microsoft speech recognition is available for 17 languages: Catalan, Chinese, Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Russian, Spanish, and Swedish.

At first I thought it was much more limited based on the language packs available to download for Cortana etc., but there are "Microsoft Speech Platform" speech recognizers available for all these languages here. Although it's a bit of an older technology, so I'm not sure how accurate all the engines are.

Best-case scenario, that should easily cover most use cases for e.g. Duolingo.

Edit: since online SR is probably much better, I might add an option to enable that if possible for the 9 languages it's available for (may require a recent Windows update and turning on online speech recognition in privacy settings).

mathias636 commented 4 years ago

Tip for Windows 10 users

You can use the shortcut Win + H to use the speech-to-text. Win + Space bar to switch the language.

I've been using this on Duolingo to improve my speaking skills in English and it works very well. However, it doesn't work in the Duolingo speech lessons.

mckatoo commented 4 years ago

The same problem on Linux Mint 19.3 64-bit, Version 1.8.86 Chromium: 81.0.4044.129 (Official Build) (64-bit). On WhatsApp Web it reports that no microphone was found.