watson-html5-speech-recognition

A library that provides speech recognition in browsers.

Support

The library enables speech recognition support for any browser that includes support for either:

Web Speech API or
Web Audio API + getUserMedia support

If the browser does not support either of the above, then currently you're out of luck.

watson-html5-speech-recognition uses the Web Speech API when present and the Watson Speech To Text service for all other (supported) cases.
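The selection described above can be sketched as a plain feature check. This is only an illustration, not the library's actual code: `detectSpeechEngine` and its return values are hypothetical names, and `env` stands in for the browser's `window` object.

```javascript
// Hypothetical helper, not part of the library's API.
// `env` stands in for the browser's `window` object.
function detectSpeechEngine(env) {
  // Web Speech API: standard or webkit-prefixed constructor.
  if (env.SpeechRecognition || env.webkitSpeechRecognition) {
    return 'web-speech';
  }

  var nav = env.navigator || {};
  var hasAudioContext = Boolean(env.AudioContext || env.webkitAudioContext);
  var hasGetUserMedia = Boolean(
    (nav.mediaDevices && nav.mediaDevices.getUserMedia) ||
    nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia
  );

  // Web Audio API + getUserMedia: audio can be captured and sent to Watson.
  if (hasAudioContext && hasGetUserMedia) {
    return 'watson';
  }

  // Neither capability is available.
  return 'none';
}
```

In a page, `detectSpeechEngine(window)` could be called before showing a microphone button, so that unsupported browsers get a message instead of a control that does nothing.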

Currently, the following are supported:

Webkit speech recognition
- Chrome (33)
- Firefox (>=44)
Watson Speech to Text
- Microsoft Edge
- Firefox (<44)
- Opera

Prerequisites

An instance of Watson Speech To Text Service (requires a Bluemix account)
Watson Speech to Text WebSocket server (provided; see the Example section below)

Install

npm install watson-html5-speech-recognition

Usage

var Speech = require('watson-html5-speech-recognition');
var speech = new Speech.SpeechToText();

speech.listen({
    onStart: function() {
        console.log('starting');
    },
    onResult: function(e) {
        console.log(e.text);
    },
    onError: function(e) {
        console.log('error', e);
    },
    onEnd: function(e) {
        console.log('end', e);
    }
});
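As a variation on the callbacks above, the handlers can be built by a small factory that collects recognized phrases and reports them when listening ends. This is a sketch under assumptions: `createTranscriptCollector` is a hypothetical helper, and it assumes `onResult` may fire more than once per session with the phrase in `e.text`, as in the usage example.

```javascript
// Hypothetical helper: builds a handlers object in the shape passed to
// speech.listen() and collects every recognized phrase.
function createTranscriptCollector(onDone) {
  var phrases = [];
  return {
    onStart: function () {
      phrases.length = 0; // start each session with an empty transcript
    },
    onResult: function (e) {
      // e.text is the field used in the usage example
      if (e && e.text) {
        phrases.push(e.text);
      }
    },
    onError: function (e) {
      console.error('speech error', e);
    },
    onEnd: function () {
      // hand the joined transcript to the caller
      onDone(phrases.join(' '));
    }
  };
}
```

In the browser this could be used as `speech.listen(createTranscriptCollector(function (text) { console.log(text); }));`.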

Customized Usage

If Watson speech services are engaged, watson-html5-speech-recognition requests a token from the server, then communicates via WebSocket.

By default, watson-html5-speech-recognition assumes the token endpoint exists at /api/speech-to-text/token. If you change the location of that endpoint, you must supply the new location via a configuration parameter upon instantiation, for example:

var Speech = require('watson-html5-speech-recognition');
var speech = new Speech.SpeechToText({
  watsonTokenUrl: '/path/to/my/speech-to-text/token'
});

NOTE: The example server uses the watson-developer-cloud npm package to configure the token endpoint (see example/server/stt-token.js).
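If you relocate the token endpoint, the server must serve a token at the new path. The following is a rough sketch of such a route handler, not the code in stt-token.js. `createTokenHandler` and `getToken` are hypothetical names: `getToken(callback)` is assumed to obtain a Watson authorization token and call back with `(err, token)`. Returning the token as plain text is also an assumption; check the example server for the response format the library actually expects.

```javascript
// Hypothetical sketch of a token route handler; not the code in stt-token.js.
// `getToken(callback)` is an assumed function that obtains a Watson
// authorization token and calls back with (err, token).
function createTokenHandler(getToken) {
  return function (req, res) {
    getToken(function (err, token) {
      if (err) {
        // Do not leak credential details to the client.
        res.statusCode = 500;
        res.end('Unable to obtain token');
        return;
      }
      // Assumption: the client accepts the token as a plain-text body.
      res.setHeader('Content-Type', 'text/plain');
      res.end(token);
    });
  };
}
```

With Express, a handler like this could be mounted with `app.get('/path/to/my/speech-to-text/token', createTokenHandler(getToken))`, matching the `watsonTokenUrl` passed to the client.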

Example

The example contains a simple web front end, along with a backend WebSocket server that communicates with the Watson Speech To Text service.

Setup the example

Clone the example:

git clone https://github.com/cdimascio/watson-html5-speech-recognition

Navigate to the example root:

cd watson-html5-speech-recognition/example/server

Install dependencies:

npm install

Build the example:

npm run compile

Run the example:

First, be sure to complete all steps in the section above, "Setup the example".

Then:

Open stt-token.js at line 10.

Set '<your-username>' and '<your-password>' to match your Watson Speech To Text Service credentials.

Start the server:

npm start

Try it:

Navigate to http://localhost:3000
Click the 'mic' button
Speak

About the example

The watson-html5-speech-recognition library is exposed as a node module. It can therefore be used seamlessly with build tools like webpack, Browserify, jspm, etc.

For the purpose of this example, we use Browserify to generate speech.js from main.js. Once generated, speech.js can be included in your webpage via a script tag. See index.html.

If you want to further customize main.js, you must regenerate speech.js. To do so:

cd example/server
npm run compile

All example files live in example/server.

UI client files:

public/main.js sets up watson-html5-speech-recognition and adds an instance to the global scope.
public/speech.js is generated from main.js by Browserify (npm run compile).
public/index.html contains the application that listens for user input and outputs it to the screen.

Server files:

app.js creates an Express server and exposes a base route for speech to text endpoints.
stt-token.js instantiates Watson Speech to Text and provides an endpoint to request speech to text authorization tokens.

License

Apache 2.0
