googleapis / google-cloud-dotnet

Google Cloud Client Libraries for .NET
https://cloud.google.com/dotnet/docs/reference/
Apache License 2.0

WebRTC to .NetCore API, GoogleSpeechApi return nothing #2135

Closed ranouf closed 6 years ago

ranouf commented 6 years ago

Hi,

I send my WebRTC audio to my .NET Core API.

Angular TypeScript

record() {
    this.isRecording = true;

    navigator.mediaDevices.getUserMedia({
      audio: true
    })
      .then((stream) => {
        var options = {
          audioBitsPerSecond: 128000,
          mimeType: 'audio/webm;codecs=opus'
        }

        if (!MediaRecorder.isTypeSupported(options.mimeType)) options.mimeType = "audio/ogg;codecs=opus";

        this.stream = stream;
        this.mediaRecorder = new MediaRecorder(stream, options);
        this.mediaRecorder.onstop = e => {
          console.log("stop recording => OK");
          const blob = new Blob(this.chunks); 
          this.chunks.length = 0;

          console.log("blob => OK");

          var reader = new FileReader();
          reader.readAsDataURL(blob)
          reader.onloadend = () => {
            this.sampleService.sendFile(<FileDto>{
              name: 'record.webm',
              content: (reader.result as string).split(',')[1], // keep only the base64 payload, not the "data:...;base64," prefix
              contentLength: 0,
              contentType: options.mimeType
            })
              .subscribe(result => {
                console.log("sendFile=> OK");
              }, error => {
                console.error(error);
              });
          }
        }
        this.mediaRecorder.ondataavailable = e => {
          this.chunks.push(e.data);
        }
        this.mediaRecorder.start();
      });
  }

  stop() {
    this.isRecording = false;

    this.stream.getAudioTracks().forEach(track => track.stop());
    this.stream.getVideoTracks().forEach(track => track.stop());

    this.mediaRecorder.stop();
  }

C#

        [HttpPost]
        [ProducesResponseType(200)]
        public async Task<IActionResult> SendFile([FromBody] FileDto file)
        {
            var byteContent = Convert.FromBase64String(file.Content);

            var speech = await SpeechClient.CreateAsync();
            var response = await speech.RecognizeAsync(new RecognitionConfig()
            {
                Encoding = RecognitionConfig.Types.AudioEncoding.OggOpus,
                SampleRateHertz = 48000,
                LanguageCode = "en-CA"
            }, RecognitionAudio.FromBytes(byteContent));
            foreach (var result in response.Results)
            {
                foreach (var alternative in result.Alternatives)
                {
                    Console.WriteLine(alternative.Transcript);
                }
            }

            return Ok();
        }

How did I find the right SampleRateHertz? I copied byteContent to a local file:

            var path = "C:\\Users\\Cedric\\Downloads\\Temp\\" + Guid.NewGuid().ToString();
            if (file.ContentType.Contains("webm"))
            {
                path += ".webm";
            }
            else
            {
                path += ".ogg";
            }
            System.IO.File.WriteAllBytes(path, byteContent);

First I played it to verify that I could hear my voice, and it works! Then I opened the Details tab in the file properties and saw a sample rate of 48 kHz (that is, 48,000 Hz).

How did I decide that Encoding is OggOpus? When I created the MediaRecorder, I specified this in the options: mimeType: 'audio/webm;codecs=opus'

It was a pain in the ass to figure out because I had no knowledge of audio in general, so this could be helpful for others who don't know it either.

Of course, the Google environment variable is set in launchSettings.json; I added this to the environmentVariables section: "GOOGLE_APPLICATION_CREDENTIALS": "C:\MYPATH\XXX-202418-645374f15cb1.json"

Now, the issue I have: in the audio recorded from WebRTC, I can clearly hear "This is a test". But I never get an answer from Google Speech. The API doesn't fail; it just returns nothing.

Did I miss something? How can I make it work?

jskeet commented 6 years ago

Are you able to publish the test file somewhere so that I can try to reproduce the problem? How the file is produced shouldn't be an issue, unless it's not actually an Ogg Opus file for some reason. (I'll check that as best I can when I've got the file.)

ranouf commented 6 years ago

Hi,

I uploaded a file to my Google Drive account; let me know if you can download it: https://drive.google.com/open?id=176Xt_rd3ub8oyUed5mSo3ySJ4I-0pMsC

I installed MediaInfo (https://mediaarea.net/en/MediaInfo) to know which codec is used for my AudioFile. Result: 48kHz, 16bits, 1 channel, Opus

jskeet commented 6 years ago

Okay, I've requested access to download it - you should have an email for that now.

ranouf commented 6 years ago

Done

jskeet commented 6 years ago

Downloaded now, thanks.

jskeet commented 6 years ago

Okay, I think I see the problem. You have an Opus audio stream, but it's within a WebM container. (MediaInfo shows that.) As per the documentation, the Cloud Speech API supports Opus within an Ogg container.

I've managed to get the sample file from https://people.xiph.org/~giles/2012/opus/ to work, for example. (I'm having trouble with a longer sample which needs a long-running operation, as tracked in #2140 but that's a different matter.)

Your JavaScript code looks like it should be able to support Ogg, so I suggest you try using that instead.
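
The container/codec distinction above can be checked without MediaInfo by looking at the first few bytes of the recording. The following is a minimal sketch, not part of the original report: `sniffContainer` and the `Container` type are hypothetical names. It relies on two well-known signatures: Ogg files begin with the ASCII bytes "OggS", and WebM files begin with the EBML header ID 0x1A 0x45 0xDF 0xA3 (shared with Matroska, so "webm" here really means "Matroska family").

```typescript
// Hypothetical helper: identify the container from the leading bytes.
type Container = "ogg" | "webm" | "unknown";

function sniffContainer(bytes: Uint8Array): Container {
  // True when the buffer starts with exactly the given byte sequence.
  const startsWith = (magic: number[]): boolean =>
    bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b);

  if (startsWith([0x4f, 0x67, 0x67, 0x53])) return "ogg";  // "OggS" capture pattern
  if (startsWith([0x1a, 0x45, 0xdf, 0xa3])) return "webm"; // EBML header (Matroska/WebM)
  return "unknown";
}
```

In the browser, the bytes could come from something like `new Uint8Array(await blob.slice(0, 4).arrayBuffer())` before upload. Note that this only identifies the container; it says nothing about whether the audio inside is Opus or Vorbis.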

ranouf commented 6 years ago

Hi,

I tried to use Ogg like this:

    var options = {
      audioBitsPerSecond: 128000,
      mimeType: 'audio/ogg;codecs=opus' // BEFORE: mimeType: 'audio/webm;codecs=opus'
    }

but I have this error:

ERROR Error: Uncaught (in promise): NotSupportedError: Failed to construct 'MediaRecorder': Failed to initialize native MediaRecorder the type provided (audio/ogg;codecs=opus) is not supported.

From what I read here: https://github.com/muaz-khan/RecordRTC/issues/58

Chrome doesn't support recording audio to Ogg.

Do you know what I can do?

jskeet commented 6 years ago

@ranouf: I don't know, I'm afraid. I would look through all the supported formats at https://cloud.google.com/speech-to-text/docs/encoding and see what's supported in Chrome.
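
One way to see what a given browser supports is to probe a list of candidate MIME types rather than hard-coding one. This is only a sketch: `CANDIDATE_TYPES` and `pickSupportedType` are hypothetical names, and the candidate list is illustrative rather than a statement of what Cloud Speech accepts. The support check is passed in as a function so the logic can be run outside a browser; in a page it would be `(t) => MediaRecorder.isTypeSupported(t)`.

```typescript
// Illustrative candidates, in order of preference (an assumption, not exhaustive).
const CANDIDATE_TYPES: string[] = [
  "audio/ogg;codecs=opus",
  "audio/webm;codecs=opus",
  "audio/webm"
];

// Return the first candidate accepted by the predicate, or undefined if none is.
function pickSupportedType(
  candidates: string[],
  isSupported: (mimeType: string) => boolean
): string | undefined {
  for (const candidate of candidates) {
    if (isSupported(candidate)) return candidate;
  }
  return undefined;
}
```

If nothing the browser can record matches a format the API accepts, the recording has to be converted on the server.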

I'm going to close this issue now as it's not a problem with the client libraries as such. That doesn't mean I'm not sympathetic to your plight - it just means there's no change I could really make to the client libraries themselves to help you :(

ranouf commented 6 years ago

Hi,

No problem, I understand.

Here is an update for the next developer who might be interested in the same thing.

From what I understand, the sound in the WebM file is Ogg. But because the audio is inside a WebM video container, Google Speech can't access the Ogg audio part. To resolve this, I use ffmpeg to convert my WebM file to Ogg.

Install-Package MediaToolkit

then

Download ffmpeg.exe from https://www.ffmpeg.org/download.html and add it to your project with "Copy to Output Directory" set to "Copy always"

then you can convert like this:

        if (file.ContentType.Contains("webm"))
        {
            using (var engine = new Engine("~/ffmpeg.exe"))
            {
                // Convert webm to ogg
                string command = $"-i \"{pathWebm}\" -vn -y \"{pathOgg}\"";

                engine.CustomCommand(command);
            }
        }

As I can see in MediaInfo: [screenshot of MediaInfo output]

The file: https://drive.google.com/file/d/1iOVbSvp1vuxzNsb0Vzi1YGeogmIAUeus/view?usp=sharing (already shared with you, @jskeet). But I still don't get any result from Google Speech. @jskeet, what is the problem now?

jskeet commented 6 years ago

As per your screenshot, that's now Ogg Vorbis, not Ogg Opus. Ogg isn't the audio format - that's the container format. Opus is the audio format, and Google Cloud Speech supports Opus within an Ogg container.

ranouf commented 6 years ago

I updated the ffmpeg command like this:

        if (file.ContentType.Contains("webm"))
        {
            using (var engine = new Engine("~/ffmpeg.exe"))
            {
                // Convert WebM to Ogg, encoding the audio as Opus
                string command = $"-i \"{pathWebm}\" -vn -acodec libopus \"{pathOgg}\"";

                engine.CustomCommand(command);
            }
        }

Result: [screenshot of MediaInfo output]

And now it works!!

Thanks a lot!
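
For anyone scripting the same conversion outside .NET, the arguments from the working command can be assembled as an array, which avoids the manual quoting of paths. This is a minimal sketch with a hypothetical helper name (`buildOggOpusArgs`); the flags are the ones from the command above, and it assumes an ffmpeg build that includes libopus.

```typescript
// Hypothetical helper: ffmpeg arguments for a WebM -> Ogg Opus conversion.
function buildOggOpusArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-i", inputPath,      // input recording (Opus audio in a WebM container)
    "-vn",                // ignore any video stream
    "-acodec", "libopus", // encode audio as Opus; the earlier command omitted this and produced Vorbis
    outputPath            // output path, e.g. ending in .ogg for an Ogg container
  ];
}
```

The resulting array could be handed to a process launcher such as Node's `child_process.execFile("ffmpeg", args, callback)`.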

jskeet commented 6 years ago

Hooray! Very glad to hear that :)