rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
https://pypi.org/project/edge-tts/
GNU General Public License v3.0
5.38k stars 549 forks

pauses, voices #83

Closed by jjj001 1 year ago

jjj001 commented 1 year ago
  1. Thank you for this amazing piece of code. It works perfectly, but when generating an audio file, MS Azure ignores paragraphs and new lines and reads the text as if it were a single paragraph. Since we can't use SSML, is there a way to add a pause, e.g. for every new line or empty line in the text? Or is there some workaround, such as characters I could add on new lines that wouldn't be read aloud but would add a pause?

  2. Idea 1: if the voice could be changed within a text using voice tags, that would be very useful for multilingual texts. Idea 2: a tag that inserts external audio into the final generated audio would also be useful.

  3. Where in the code could I insert a pause between requests to the Azure server, so that my IP won't get blocked?

Thank you again.

rany2 commented 1 year ago

I think you could just add ... to the text. For example: "Hello world.... this is a pause." This is odd behavior, though; Microsoft previously did not ignore paragraphs/new lines.

jjj001 commented 1 year ago

Well, there is a pause, but only a short one. There are usually several empty lines after titles/headers, and these lines are ignored as if they weren't there, so chapter titles and section subtitles are treated as part of the regular text. Basically, the more empty lines in the text, the longer the pause should be. Unfortunately, when I add "..." on each empty line, the behaviour doesn't change.
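
The "more empty lines, longer pause" idea can be sketched as a preprocessing step that runs before any text is sent to the service. This is only an illustration and not part of edge-tts: the function name `split_with_pauses` and the `seconds_per_blank_line` parameter are hypothetical, and the half-second default is an arbitrary assumption.

```python
import re


def split_with_pauses(text, seconds_per_blank_line=0.5):
    """Return (paragraph, pause_seconds) pairs.

    The pause is the silence that should follow the paragraph, scaled by the
    number of empty lines after it. Hypothetical helper, not an edge-tts API.
    """
    text = text.replace("\r\n", "\n").strip()
    if not text:
        return []
    # The capturing group keeps each run of blank lines in the result,
    # so parts alternates: paragraph, separator, paragraph, ...
    parts = re.split(r"(\n(?:[ \t]*\n)+)", text)
    result = []
    for i in range(0, len(parts), 2):
        # Join wrapped lines inside one paragraph with single spaces.
        paragraph = " ".join(parts[i].split())
        separator = parts[i + 1] if i + 1 < len(parts) else ""
        # The first newline only ends the paragraph's last line;
        # every further newline is one empty line.
        blank_lines = max(separator.count("\n") - 1, 0)
        result.append((paragraph, blank_lines * seconds_per_blank_line))
    return result
```

Each paragraph could then be synthesized on its own, with the matching amount of silence added between the resulting audio segments using an audio tool of your choice.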

rany2 commented 1 year ago

Unfortunately, there is no solution I could implement on my end. I suggest that you generate the TTS one paragraph at a time.
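
A minimal sketch of the paragraph-at-a-time approach, assuming the `edge_tts.Communicate(text, voice)` class and its `save()` coroutine from the edge-tts Python package. The helper names, the output file naming, the example voice and the one-second delay between requests are all assumptions made for illustration; nothing in this thread says what delay, if any, avoids an IP block.

```python
import asyncio
import re
from pathlib import Path

VOICE = "en-US-AriaNeural"  # assumption: pick any voice from `edge-tts --list-voices`


def split_paragraphs(text):
    """Split on blank lines, drop empty chunks, unwrap lines inside a paragraph."""
    chunks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


async def edge_tts_synth(paragraph, voice, path):
    """Send one paragraph to the service and save the returned audio."""
    import edge_tts  # imported lazily so the helpers work without the package

    await edge_tts.Communicate(paragraph, voice).save(str(path))


async def synthesize_paragraphs(text, out_dir, voice=VOICE,
                                delay_seconds=1.0, synth=edge_tts_synth):
    """Write one audio file per paragraph and return the file paths in order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, paragraph in enumerate(split_paragraphs(text)):
        if index > 0:
            # Hypothetical throttle: wait between requests, not before the first.
            await asyncio.sleep(delay_seconds)
        path = out_dir / f"part_{index:04d}.mp3"
        await synth(paragraph, voice, path)
        paths.append(path)
    return paths
```

It could be run with something like `asyncio.run(synthesize_paragraphs(Path("book.txt").read_text(encoding="utf-8"), "out"))`, where `book.txt` and `out` are placeholder names. The per-paragraph files then need to be joined, with whatever silence is wanted between them, in a separate step.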

jjj001 commented 1 year ago

Never mind, I'll try to find a workaround. Thank you.