mmig / speech-to-flac

Example for client-side encoding of microphone audio into FLAC data

Speech API not working #2

Closed devd92 closed 7 years ago

devd92 commented 7 years ago

With the update to Google's Cloud Speech, I am not able to get the API credentials to work properly. The request keeps returning a 400 error. Any help?

russaa commented 7 years ago

Could you give some more specifics?

E.g. did you set up the Google Cloud Speech account and obtain an OAuth token (https://cloud.google.com/speech/docs/getting-started) for running the test?

And I am guessing you made some modifications to the code (for switching to Google Cloud Speech), right?

Could you make the modified code (excerpts) available, so I can have a look at where the problem may stem from?

devd92 commented 7 years ago

Hi, thanks for replying. I got an API key from my cloud speech account, and changed the path reference. The last bit of the code currently looks like this -

    var oAjaxReq = new XMLHttpRequest();
    oAjaxReq.onload = ajaxSuccess;

    // oAjaxReq.open("post", "https://speech.googleapis.com/v1/speech:recognize?lang=en-IN&maxAlternatives="+alternatives+"&output=json&key="+key, true);
    oAjaxReq.open("post", "https://speech.googleapis.com/v1/speech:recognize?key="+key, true);
    oAjaxReq.setRequestHeader("Content-Type", "audio/x-flac; rate=" + sample_rate + ";");
    // oAjaxReq.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36;");
    oAjaxReq.withCredentials = false;
    oAjaxReq.send(data);

I had to set 'withCredentials' to false because otherwise the request was not allowed when running from localhost.


devd92 commented 7 years ago

Is it a problem if I used an API key and not an OAuth 2 token?

russaa commented 7 years ago

I am not an expert on OAuth, but I think you'd need the API key to generate a token, and then use this token for authorization of the speech-recognition-request, see e.g. Google help page for OAuth

The OAuth token would then be transferred via the request header (instead of query-parameter in the URL), e.g. something like

    oAjaxReq.setRequestHeader("Authorization", "Bearer "+oauth_token);

and, I think, withCredentials needs to be enabled -- as for the CORS problem in relation to localhost: if you use Chrome, you could disable web-security when testing the page, see the notes at the bottom on the demo page.

Also:
the Google Cloud Speech API is quite different from the "normal"/older Google Speech API.

For example, the request data is a JSON object that contains the audio-metadata (audio codec, samplerate etc.) as well as the audio-data itself (e.g. encoded as base64 string), e.g. something like

    $scope.sendASRRequest = function(blob) {

      function ajaxSuccess() {

        var result = this.responseText;
        console.log("AJAXSubmit - Success!"); //DEBUG
        console.log(result);

        try {
          result = JSON.parse(result);
          //format the result
          result = JSON.stringify(result, null, 2);
        } catch (exc) {
          console.warn('Could not parse result into JSON object: "' + result + '"');
        }

        $scope.$apply(function() {
          $scope.asr_result.text = result;
        });
      }

      var data;
      var sample_rate = $scope.samplerate;
      // NOTE: sent as a Bearer token below, so this must hold a valid OAuth token
      var key = $scope._google_api_key;
      var alternatives = $scope._asr_alternatives;

      // use FileReader to convert Blob to base64 encoded data-URL
      var reader = new window.FileReader();
      reader.onloadend = function() {
        // reader.result is a data-URL ("data:<mime>;base64,<data>");
        // only the base64 part after the comma belongs in audio.content
        var base64Audio = reader.result.substring(reader.result.indexOf(",") + 1);

        data = {
          config: {
            encoding: "FLAC",
            sampleRateHertz: sample_rate,
            languageCode: "en-US",
            maxAlternatives: alternatives
          },
          audio: {
            content: base64Audio
          }
        };

        var oAjaxReq = new XMLHttpRequest();

        oAjaxReq.onload = ajaxSuccess;
        oAjaxReq.open("post", "https://speech.googleapis.com/v1/speech:recognize", true);
        oAjaxReq.setRequestHeader("Authorization", "Bearer "+key);
        oAjaxReq.withCredentials = true;
        oAjaxReq.setRequestHeader("Content-Type", "application/json");
        // send the JSON object as a string (not the binary FLAC data)
        oAjaxReq.send(JSON.stringify(data));
      };
      // start reading only after the handler is attached
      reader.readAsDataURL(blob);

      $scope.$apply(function() {
        $scope.asr_result.text = "Waiting for Recognition Result...";
      });

    };

(please note, that I did not actually test the code above; it's only an example)

devd92 commented 7 years ago

Okay, will go through the JSON encoding again. Currently it's giving me this error -

"error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unexpected token.\nfLaC\u0000\u0000\u0000\"\u0010\u0000\u0010\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u000b\n^",
    "status": "INVALID_ARGUMENT"

Would it be possible for you to edit the original code to conform with the updated cloud speech requirements? Or is there any other sample application that sends requests after recording FLAC the way yours does? I tried but was unable to find anything remotely close. :(

russaa commented 7 years ago

The error message occurs because the code currently sends the (binary) FLAC data itself instead of the JSON (the data object in the code sample above).
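To illustrate the difference between sending the binary FLAC data and sending JSON, here is a minimal sketch that wraps raw FLAC bytes in a JSON request body. The helper names `bytesToBase64` and `buildRecognizeBody` are hypothetical and not part of the project; the field names follow the code sample above. It is untested against the live API.

```javascript
// Hypothetical helper: encode raw bytes (e.g. a Uint8Array of FLAC data) as base64.
function bytesToBase64(bytes) {
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // btoa exists in browsers; fall back to Buffer when run under Node.js
  if (typeof btoa === "function") {
    return btoa(binary);
  }
  return Buffer.from(binary, "binary").toString("base64");
}

// Hypothetical helper: build the JSON string to pass to oAjaxReq.send(),
// instead of passing the binary FLAC data directly.
function buildRecognizeBody(flacBytes, sampleRate, languageCode) {
  return JSON.stringify({
    config: {
      encoding: "FLAC",
      sampleRateHertz: sampleRate,
      languageCode: languageCode
    },
    audio: {
      content: bytesToBase64(flacBytes)
    }
  });
}
```

The resulting string starts with `{`, whereas the rejected payload in the error message starts with the FLAC stream marker `fLaC`.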

As for changing the original code:
I need to think about it, as it would break compatibility with the old Google Speech API.
If it is changed, I think it should be changed so that both APIs can be used, but that would require some major refactoring of the code, which would take some time.

For the time being: I expanded my previous code sample above into a more complete example, as a replacement for the $scope.sendASRRequest function in the original code (it assumes that $scope._google_api_key holds a valid OAuth token).
Let me know if that works for you.

devd92 commented 7 years ago

Thanks for the updated sample. I replaced the requisite code, but I am now getting this error -

{
  "error": {
    "code": 401,
    "message": "Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.",
    "status": "UNAUTHENTICATED"
  }
}

I have two OAuth 2.0 variables, one is the Client ID and the other is Client Secret. Not sure how both are to be used in this scenario. API keys and service account keys are not working.
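For telling apart the two error responses seen in this thread, a small helper along the following lines might be useful while debugging. This is only a sketch: the function name `describeSpeechApiError` is hypothetical, and the hints merely restate what the 400 and 401 responses quoted above say.

```javascript
// Hypothetical debugging helper: turn an error response body into a short hint.
function describeSpeechApiError(responseText) {
  var parsed;
  try {
    parsed = JSON.parse(responseText);
  } catch (exc) {
    return "Response is not valid JSON";
  }
  var err = parsed && parsed.error;
  if (!err) {
    return "No error reported";
  }
  if (err.code === 400 && /Invalid JSON payload/.test(err.message || "")) {
    // seen earlier in this thread when raw FLAC bytes were sent
    return "400: the request body was not JSON";
  }
  if (err.code === 401) {
    // seen in this thread when the Bearer value was not a valid access token
    return "401: the credential was not accepted as an OAuth 2 access token";
  }
  return err.code + ": " + err.message;
}
```

It could be called with `this.responseText` inside `ajaxSuccess`, before the result is displayed.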

devd92 commented 7 years ago

Update: Used my regular API key and made some edits to the data sent by blob. Works now. Setting 'withCredentials' to false did not affect the program.
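The final working code was not posted in this thread. As a rough sketch of what a request with an API key might look like, assuming the key is passed via the `key` query parameter (as in the earlier attempt) and that the edit to the blob data consisted of stripping the data-URL prefix: the helper names `dataUrlToBase64`, `buildApiKeyRequest` and `sendWithApiKey` are hypothetical, and the code has not been tested against the live API.

```javascript
// Hypothetical helper: FileReader.readAsDataURL() yields "data:<mime>;base64,<data>";
// keep only the base64 part after the comma.
function dataUrlToBase64(dataUrl) {
  var comma = dataUrl.indexOf(",");
  return comma === -1 ? dataUrl : dataUrl.substring(comma + 1);
}

// Hypothetical helper: describe the request without sending it.
function buildApiKeyRequest(apiKey, dataUrl, sampleRate) {
  return {
    // API key goes into the URL; no Authorization header is set
    url: "https://speech.googleapis.com/v1/speech:recognize?key=" +
      encodeURIComponent(apiKey),
    contentType: "application/json",
    body: JSON.stringify({
      config: {
        encoding: "FLAC",
        sampleRateHertz: sampleRate,
        languageCode: "en-US"
      },
      audio: {
        content: dataUrlToBase64(dataUrl)
      }
    })
  };
}

// Browser-only part: send the prepared request via XMLHttpRequest.
function sendWithApiKey(request, onload) {
  var oAjaxReq = new XMLHttpRequest();
  oAjaxReq.onload = onload;
  oAjaxReq.open("post", request.url, true);
  oAjaxReq.setRequestHeader("Content-Type", request.contentType);
  oAjaxReq.send(request.body);
}
```

Inside `reader.onloadend`, this would be used as `sendWithApiKey(buildApiKeyRequest(key, reader.result, sample_rate), ajaxSuccess);`.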

dansoloway commented 6 years ago

Hi, I am also trying to get this working with the Cloud API, using an API Key. Any tips? Not sure if the code included here has been added to the project, etc. Thanks!