Closed (55Cancri closed 11 months ago)
Thanks for your question. This package is meant to be used on the web, in a browser, backed by native browser APIs. The microsoft-cognitiveservices-speech-sdk clearly does not fall into this category.
Can you provide some more detail on how exactly you would want to merge this package with an external service and custom API? From the outset, I can tell it would require a chunk of work to adapt the custom API from MS to the native Browser Web Speech API. There is also the security issue of dealing with tokens/keys that you probably don't want to expose to the browser, so a backend would be required (this is most likely a deal breaker for integration with this package).
Check out the Web SDK example: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/js/browser
The Microsoft AI voices are the best in the industry. They are natural sounding and have better cadence than the native Web Speech API. However, I am not able to highlight the currently playing word with microsoft-cognitiveservices-speech-sdk. How can I merge microsoft-cognitiveservices-speech-sdk with your package?
@morganney
I'm also looking for this. I have a backend server that gets the voice from microsoft-cognitiveservices-speech-sdk, saves the speech file, uploads it to S3, and sends the link back to the client. What I want is for each word to be highlighted as the audio plays. I looked at the storybook but didn't find it very informative, since a couple of the things mentioned there were confusing. I also searched for AWS Polly, because the storybook says AWS Polly can be used to get the audio data, but I couldn't find anything informative there either. I hope that makes the point clear.
@naeem-hassan
If the backend using microsoft-cognitiveservices-speech-sdk can return data matching the TTSAudioData interface then you should be able to use the fetchAudioData prop to get what you want. To highlight the word spoken, the backend needs to return the marks in the same format used by AWS Polly Speech Marks. There is an example story; check the source code in the repo.
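To make the format point concrete, here is a rough sketch of how a backend might reshape word-level timing data into Polly-style word marks, plus a helper that finds the word active at a given playback time. All names here (WordBoundary, PollyWordMark, toPollyMarks, markAt) are hypothetical and not part of either library. It assumes the Speech SDK reports, per word, an audio offset in 100 ns ticks, a text offset, and a word length; check the SDK docs for the real event shape before relying on this.

```typescript
// Sketch only: every name below is hypothetical, not taken from tts-react
// or from the Speech SDK typings.

/** Fields assumed to be available from a Speech SDK word boundary event. */
interface WordBoundary {
  audioOffset: number; // assumed unit: ticks of 100 ns
  text: string;        // the word being spoken
  textOffset: number;  // character offset of the word in the input text
  wordLength: number;  // length of the word in characters
}

/** One "word" entry in the AWS Polly Speech Marks JSON shape. */
interface PollyWordMark {
  time: number;  // milliseconds from the start of the audio
  type: "word";
  start: number; // offset of the word's first character
  end: number;   // offset just past the word's last character
  value: string;
}

const TICKS_PER_MS = 10000; // 1 ms = 10,000 ticks of 100 ns

/**
 * Convert word boundaries into Polly-style word marks, sorted by time.
 * Note: Polly documents start/end as byte offsets; the character offsets
 * used here only match for single-byte (ASCII) text.
 */
function toPollyMarks(boundaries: WordBoundary[]): PollyWordMark[] {
  return boundaries
    .map((b): PollyWordMark => ({
      time: Math.round(b.audioOffset / TICKS_PER_MS),
      type: "word",
      start: b.textOffset,
      end: b.textOffset + b.wordLength,
      value: b.text,
    }))
    .sort((a, b) => a.time - b.time);
}

/** Return the last mark whose time is <= ms, or undefined before the first word. */
function markAt(marks: PollyWordMark[], ms: number): PollyWordMark | undefined {
  let current: PollyWordMark | undefined;
  for (const mark of marks) {
    if (mark.time > ms) break;
    current = mark;
  }
  return current;
}
```

The backend would return the audio location together with the converted marks, and the client could call something like markAt(marks, audio.currentTime * 1000) to decide which word to highlight.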
@morganney I don't think microsoft-cognitiveservices-speech-sdk returns the marks. Or is there an API or parameter to get them? Is there any free API to get the marks? AWS Polly is pretty expensive for me right now.
A little bit of googling shows word boundaries should be supported: