Migushthe2nd / MsEdgeTTS

A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
https://migushthe2nd.github.io/MsEdgeTTS/
MIT License
289 stars, 41 forks

Not able to fine-tune anymore #3

Closed: tanchekwei closed this issue 2 years ago

tanchekwei commented 2 years ago

The API returns no audio when the input contains SSML tags that fine-tune pitch, pronunciation, speaking rate, etc.

Text without any adjustment will still work fine.

tanchekwei commented 2 years ago

I have switched to using the actual Azure API.

There is a free tier, fortunately.

https://github.com/Microsoft/cognitive-services-speech-sdk-js

Migushthe2nd commented 2 years ago

Could you give an example call with an input string?

tanchekwei commented 2 years ago

Try:

Sometimes somebody will bring something that you <prosody rate="-51.00%">really </prosody>like. <- Not ok

Sometimes somebody will bring something that you like. <- Ok
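For reference, here is a minimal sketch of how the two inputs above would look as complete SSML documents. It assumes the standard Azure SSML envelope; `buildSsml`, the voice name `en-US-AriaNeural`, and the `en-US` locale are illustrative choices, not something stated in this thread, and the module may well build this envelope itself.

```typescript
// Hypothetical helper (not part of MsEdgeTTS): wraps a text fragment in the
// usual Azure SSML envelope. Voice name and locale are placeholder assumptions.
function buildSsml(
  fragment: string,
  voice: string = "en-US-AriaNeural",
  lang: string = "en-US"
): string {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ` +
    `xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="${lang}">` +
    `<voice name="${voice}">${fragment}</voice></speak>`
  );
}

// The input reported as failing: one word slowed down with a prosody element.
const adjustedInput = buildSsml(
  'Sometimes somebody will bring something that you <prosody rate="-51.00%">really </prosody>like.'
);

// The input reported as working: plain text only.
const plainInput = buildSsml("Sometimes somebody will bring something that you like.");

console.log(adjustedInput);
console.log(plainInput);
```

Printing both documents makes it easy to see that the only difference between the failing and the working request is the single prosody element.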

Migushthe2nd commented 2 years ago

Sorry for the long delay, but I ran into this myself. The two examples you gave seem to work; however, any fine-tuning that uses the mstts tags indeed does not work:

// does not work
Hi how are you<mstts:silence type="Sentenceboundary" value="150ms"/>
// works fine
Hi how are you

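I have tried to find a way around this, but there seems to be a new whitelist in place that only allows the structure speak (1) > voice (1) > prosody (>=0), i.e. exactly one speak element containing exactly one voice element, which in turn may contain any number of prosody elements.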
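A rough way to check an input against that observed structure before sending it is sketched below. This is only an approximation of the behaviour described above, not the service's actual rule: `matchesObservedWhitelist` and `wrapInSsml` are hypothetical names, the voice name is a placeholder, the tag scanner is a simple regular expression rather than a real XML parser, and it assumes that prosody elements may only appear directly inside voice (nested prosody is rejected, which is a guess).

```typescript
// Hypothetical pre-flight check for the structure
// speak (1) > voice (1) > prosody (>= 0). Not a full XML parser.
function matchesObservedWhitelist(ssml: string): boolean {
  // Captures: optional "/" of a closing tag, the tag name, optional "/" of a self-closing tag.
  const tagPattern = /<(\/?)([A-Za-z][\w:-]*)[^>]*?(\/?)>/g;
  // The single element type each parent may contain; "" stands for the document root.
  const allowedChild: Record<string, string | undefined> = {
    "": "speak",
    speak: "voice",
    voice: "prosody", // assumption: prosody only directly under voice
  };
  const stack: string[] = [];
  let speakCount = 0;
  let voiceCount = 0;

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(ssml)) !== null) {
    const closing = match[1];
    const name = match[2];
    const selfClosing = match[3];
    if (closing) {
      if (stack.pop() !== name) return false; // mismatched closing tag
      continue;
    }
    const parent = stack.length > 0 ? stack[stack.length - 1] : "";
    if (allowedChild[parent] !== name) return false; // element not allowed here
    if (name === "speak") speakCount++;
    if (name === "voice") voiceCount++;
    if (!selfClosing) stack.push(name);
  }
  return stack.length === 0 && speakCount === 1 && voiceCount === 1;
}

// Hypothetical wrapper producing the usual Azure SSML envelope (placeholder voice).
function wrapInSsml(fragment: string): string {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ` +
    `xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">` +
    `<voice name="en-US-AriaNeural">${fragment}</voice></speak>`
  );
}

console.log(matchesObservedWhitelist(wrapInSsml("Hi how are you"))); // true
console.log(
  matchesObservedWhitelist(
    wrapInSsml('Hi how are you<mstts:silence type="Sentenceboundary" value="150ms"/>')
  )
); // false
```

A check like this only flags inputs that fall outside the observed structure; it cannot confirm that the service will accept everything it lets through.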