A polyfill for the experimental browser Speech Recognition API which falls back to AWS Transcribe.
Note: this is not a polyfill for MediaDevices.getUserMedia(); check that API's browser support table before relying on it.
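Because microphone capture is not polyfilled, it can help to check for it before creating a recognizer. The following is a minimal sketch, assuming the standard `navigator.mediaDevices.getUserMedia` API; `hasGetUserMedia` is a hypothetical helper name and is not part of this library.

```javascript
// Hypothetical helper (not part of this library): reports whether the given
// navigator-like object exposes the standard MediaDevices.getUserMedia() method.
function hasGetUserMedia(nav) {
  return Boolean(
    nav &&
    nav.mediaDevices &&
    typeof nav.mediaDevices.getUserMedia === 'function'
  );
}

// In a browser you would pass the real navigator object, for example:
// if (!hasGetUserMedia(navigator)) { /* hide the microphone button */ }
```

Taking the navigator object as a parameter keeps the check easy to exercise outside a browser.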
This library is a good fit if you already use AWS services (or would simply prefer to use AWS).
Another polyfill exists at /antelow/speech-polyfill, which uses Azure Cognitive Services as a fallback. However, it seems to have gone stale, with no updates for ~2 years.
Your role must have the TranscribeStreaming permission, i.e. a TranscribeStreaming policy attached. To attach this to your role, search for IAM -> Roles, find your role, click "Attach policies" and search for the TranscribeStreaming policy.
Install with npm i --save speech-recognition-aws-polyfill
Import into your application:
import SpeechRecognitionPolyfill from 'speech-recognition-aws-polyfill'
Or use from the unpkg CDN:
<script src="https://unpkg.com/speech-recognition-aws-polyfill"></script>
Create a new instance of the polyfill:
const recognition = new SpeechRecognitionPolyfill({
  IdentityPoolId: 'eu-west-1:11111111-1111-1111-1111-1111111111', // your Identity Pool ID
  region: 'eu-west-1' // your AWS region
})
Alternatively, use the create method, which returns a class you can instantiate:
const SpeechRecognition = SpeechRecognitionPolyfill.create({
  IdentityPoolId: 'eu-west-1:11111111-1111-1111-1111-1111111111', // your Identity Pool ID
  region: 'eu-west-1' // your AWS region
})
const recognition = new SpeechRecognition()
You can then interact with recognition the same as you would with an instance of window.SpeechRecognition.
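If you want to make the choice between a native implementation and the AWS-backed class explicit in your own code, one possible pattern is sketched here. `pickSpeechRecognition` is a hypothetical helper, not part of this library, and the sketch assumes native constructors are exposed as `SpeechRecognition` or `webkitSpeechRecognition` on the global object.

```javascript
// Hypothetical helper (not part of this library): prefer a native
// SpeechRecognition constructor and only build the fallback when none exists.
// `globalObj` is normally `window`; `createFallback` is any function that
// returns a SpeechRecognition-compatible constructor.
function pickSpeechRecognition(globalObj, createFallback) {
  const native = globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition;
  return native || createFallback();
}

// Usage in a browser, assuming the polyfill has been imported as shown earlier:
// const SpeechRecognition = pickSpeechRecognition(window, () =>
//   SpeechRecognitionPolyfill.create({
//     IdentityPoolId: 'eu-west-1:11111111-1111-1111-1111-1111111111',
//     region: 'eu-west-1'
//   }))
// const recognition = new SpeechRecognition()
```

Passing the fallback as a function means the AWS-backed class is only created when it is actually needed.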
The recognizer will stop capturing if it doesn't detect speech for a period. You can also stop manually with the stop() method.
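Since capture can end either on its own or through stop(), a start/stop button needs to track both cases. The following is a minimal sketch of such a toggle; `createToggle` is a hypothetical helper, and it assumes the recognizer accepts an `onend` handler property in the same way the standard API does.

```javascript
// Hypothetical helper (not part of this library): returns a function that
// starts the recognizer when idle and stops it when listening.
// Note: this sketch overwrites any existing `onend` handler.
function createToggle(recognition) {
  let listening = false;

  // The recognizer may stop by itself after a period of silence,
  // so reset the flag whenever the `end` event fires.
  recognition.onend = function () {
    listening = false;
  };

  return function toggle() {
    if (listening) {
      recognition.stop();
      listening = false;
    } else {
      recognition.start();
      listening = true;
    }
    return listening; // true if we are now listening
  };
}

// Usage in a browser:
// const toggle = createToggle(recognition)
// button.onclick = () => { button.textContent = toggle() ? 'Stop' : 'Start' }
```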
Property | Supported |
---|---|
lang | Yes |
grammars | No |
continuous | Yes |
interimResults | No |
maxAlternatives | No |
serviceURI | No |
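With continuous supported and interimResults not, a result handler in continuous mode presumably only needs to pick up newly added results. The following is a minimal sketch, assuming the polyfill's result event follows the standard shape (`event.results` as a list of alternative lists, plus an optional `event.resultIndex`); `newTranscripts` is a hypothetical helper and not part of this library.

```javascript
// Hypothetical helper (not part of this library): returns the top transcript
// of every result from `event.resultIndex` onwards. If `resultIndex` is
// missing, it falls back to reading all results.
function newTranscripts(event) {
  const transcripts = [];
  const first = event.resultIndex || 0;
  for (let i = first; i < event.results.length; i++) {
    // Index 0 is the first (and, without maxAlternatives, only) alternative.
    transcripts.push(event.results[i][0].transcript);
  }
  return transcripts;
}

// Usage in a browser:
// recognition.continuous = true
// recognition.onresult = (event) => console.log(newTranscripts(event).join(' '))
```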
Method | Supported |
---|---|
abort | Yes |
start | Yes |
stop | Yes |
Event | Supported |
---|---|
audiostart | Yes |
audioend | Yes |
start | Yes |
end | Yes |
error | Yes |
nomatch | Yes |
result | Yes |
soundstart | Partial |
soundend | Partial |
speechstart | Partial |
speechend | Partial |
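To see which events actually fire in your setup, it can be handy to log the fully supported ones while debugging. The following is a rough sketch, assuming each event can be handled through an `on<event>` property (as `onresult` and `onerror` are elsewhere in this README); `traceEvents` is a hypothetical helper and not part of this library.

```javascript
// Events listed as fully supported in the table above.
const FULLY_SUPPORTED_EVENTS = [
  'audiostart', 'audioend', 'start', 'end', 'error', 'nomatch', 'result'
];

// Hypothetical debugging helper (not part of this library): logs each fully
// supported event by name, then calls whatever handler was already assigned.
function traceEvents(recognition, log = console.log) {
  for (const name of FULLY_SUPPORTED_EVENTS) {
    const prop = 'on' + name;
    const previous = recognition[prop];
    recognition[prop] = function (event) {
      log(name);
      if (typeof previous === 'function') {
        previous.call(this, event);
      }
    };
  }
}

// Usage in a browser: call this after assigning your own handlers.
// traceEvents(recognition)
```

The partially supported sound and speech events are left out on purpose, so application logic does not come to depend on them.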
import SpeechRecognitionPolyfill from 'speech-recognition-aws-polyfill'

const recognition = new SpeechRecognitionPolyfill({
  IdentityPoolId: 'eu-west-1:11111111-1111-1111-1111-1111111111', // your Identity Pool ID
  region: 'eu-west-1' // your AWS region
})
recognition.lang = 'en-US'; // add this to the config above instead if you want

document.body.onclick = function() {
  recognition.start();
  console.log('Listening');
}

recognition.onresult = function(event) {
  const { transcript } = event.results[0][0]
  console.log('Heard: ', transcript)
}

recognition.onerror = console.error
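Instead of passing errors straight to console.error, you may prefer to turn them into readable messages. The following is a sketch only: it assumes the error event carries an `error` code string as in the standard Speech Recognition API, which this README does not confirm for the polyfill, and `describeError` is a hypothetical helper.

```javascript
// A few error codes from the standard Speech Recognition API.
// Whether the polyfill emits these exact codes is an assumption.
const ERROR_HINTS = {
  'no-speech': 'No speech was detected.',
  'audio-capture': 'No microphone could be used.',
  'not-allowed': 'Microphone permission was denied.',
  'network': 'A network error occurred.'
};

// Hypothetical helper (not part of this library): maps an error event to a
// human-readable message, with a generic fallback for unknown codes.
function describeError(event) {
  const code = event && event.error;
  if (typeof code === 'string' &&
      Object.prototype.hasOwnProperty.call(ERROR_HINTS, code)) {
    return ERROR_HINTS[code];
  }
  return 'Speech recognition failed' + (code ? ': ' + String(code) : '.');
}

// Usage in a browser:
// recognition.onerror = (event) => console.error(describeError(event))
```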
Questions, comments and contributions are very welcome. Just raise an Issue or PR (or check out the fancy new GitHub Discussions feature).
MIT