anteloe / speech-polyfill

A polyfill for the HTML5 Speech Recognition API that uses Microsoft's Cognitive Services as a backend. It works in every browser that supports WebRTC.
MIT License

Pluggable backends? #4

Open avaer opened 5 years ago

avaer commented 5 years ago

I'm interested in using this polyfill in the core of Exokit browser engine for mixed reality.

However I don't want to waste your MSDN subscription. Is there any chance of a pluggable backend so we could choose some other provider besides Microsoft?
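For illustration, a pluggable backend could be sketched roughly as below. Every name here (`BackendRegistry`, `PluggableRecognizer`, `echoBackend`, the `transcribe()` contract) is hypothetical and not part of speech-polyfill; the only assumption is that a backend is an object that turns captured audio chunks into a transcript string.

```javascript
// Hypothetical sketch: none of these names exist in speech-polyfill.
// A backend is assumed to be any object with an async
// `transcribe(audioChunks)` method that resolves to a transcript string.

class BackendRegistry {
  constructor() {
    this.backends = new Map();
  }

  // Register a backend under a provider name such as "azure" or "amazon".
  register(name, backend) {
    if (!backend || typeof backend.transcribe !== 'function') {
      throw new TypeError(`Backend "${name}" must implement transcribe()`);
    }
    this.backends.set(name, backend);
  }

  // Look up a backend, failing loudly on unknown provider names.
  get(name) {
    const backend = this.backends.get(name);
    if (!backend) {
      throw new Error(`Unknown speech backend: ${name}`);
    }
    return backend;
  }
}

// Minimal recognizer that delegates to whichever backend was selected.
class PluggableRecognizer {
  constructor(registry, backendName) {
    this.backend = registry.get(backendName);
    this.onresult = null;
  }

  async recognize(audioChunks) {
    const transcript = await this.backend.transcribe(audioChunks);
    if (typeof this.onresult === 'function') {
      this.onresult({ transcript });
    }
    return transcript;
  }
}

// Stand-in backend for illustration; a real one would call a cloud service.
const echoBackend = {
  async transcribe(audioChunks) {
    return `received ${audioChunks.length} chunk(s)`;
  },
};
```

With something like this, switching providers would come down to registering another object with the same `transcribe()` method and passing its name to the recognizer.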

anteloe commented 5 years ago

Hi @modulesio :) Yes, I was thinking about creating plugins (e.g. for Google). I just wasn't aware that people are actually interested in this polyfill :)

avaer commented 5 years ago

Definitely comes in handy when you're writing a web engine in JS and can't lean on another browser for the functionality :P.

But I realize it's an outlier use case.

anteloe commented 5 years ago

I'll probably need some time to create an MVP for that, but it's definitely worth the time :)

How critical is this feature for you? Which backend(s) would you be interested in plugging in?

avaer commented 5 years ago

It would be good to have a choice of Amazon, Google, Azure. Amazon is my first choice since I'm crunching through a pile of credits there.

anteloe commented 5 years ago

Ok, I'll have a look at Amazon's APIs then :)

I'll keep you posted here ;-)

anteloe commented 5 years ago

Hey @modulesio :)

Just a quick update: I have been reading a bit about the AWS Transcribe service, but I have not yet found out how to stream live audio data to it. Do you have any good resources or tips on AWS?

Many thanks :)

avaer commented 5 years ago

I think the docs are here: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/TranscribeService.html#startTranscriptionJob-property

Seems the way to get back the transcription is to watch the file that it outputs to, and read that back.

I don't actually have experience with doing AWS transcriptions (though I've used Polly, which is the reverse). I just have credits to spend there, and it's what I use for all of my infrastructure currently.
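The "watch the output and read it back" flow for a batch job could be sketched roughly like this. It is a sketch under assumptions, not tested against a live account: `client` is assumed to behave like an AWS SDK for JavaScript (v2) `TranscribeService` instance, `waitForTranscript` and its options are hypothetical names, and the job is assumed to have already been started with `startTranscriptionJob`. Since this is a batch API, it would add noticeable latency compared with live recognition.

```javascript
// Sketch only. `client` is assumed to look like an AWS SDK v2
// TranscribeService instance, i.e. getTranscriptionJob(params).promise().
// `waitForTranscript` is a hypothetical helper, not part of any library.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForTranscript(
  client,
  jobName,
  { intervalMs = 5000, maxAttempts = 60 } = {}
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Ask the service for the current state of the job.
    const { TranscriptionJob: job } = await client
      .getTranscriptionJob({ TranscriptionJobName: jobName })
      .promise();

    if (job.TranscriptionJobStatus === 'COMPLETED') {
      // The transcript itself lives in a JSON file at this URI.
      return job.Transcript.TranscriptFileUri;
    }
    if (job.TranscriptionJobStatus === 'FAILED') {
      throw new Error(`Transcription job failed: ${job.FailureReason}`);
    }

    // Still QUEUED or IN_PROGRESS: wait before polling again.
    await sleep(intervalMs);
  }
  throw new Error(`Transcription job "${jobName}" did not finish in time`);
}
```

The returned URI would then be fetched and parsed to get the recognized text. Passing the client in as a parameter keeps the helper easy to exercise with a stub.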

avaer commented 5 years ago

I also found this, three days ago... https://aws.amazon.com/blogs/machine-learning/amazon-transcribe-now-supports-real-time-transcriptions/

anteloe commented 5 years ago

Thanks for that :) I'll take a closer look at it over the weekend :)