alekssamos / msspeech

Unofficial API for Microsoft speech synthesis, as used by Read Aloud in the Microsoft Edge web browser
https://pypi.org/project/msspeech/

the voices_list_plus.json is outdated, how can I update it myself? #1

Closed pendave closed 2 years ago

pendave commented 2 years ago

Hello, I find that voices_list_plus.json is outdated. For example, I can't find this voice in it: zh-CN-YunjianNeural

https://raw.githubusercontent.com/alekssamos/msspeech/da5904e14e7e8b383f4230c57585eaa92271a4bb/msspeech/voices_list_plus.json

:) How can I update it myself?

And where can I add custom SSML?
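One way to check whether a voice such as zh-CN-YunjianNeural is present in a downloaded copy of the file is sketched below. The schema of voices_list_plus.json is not described in this thread, so the helper (`contains_voice`, a hypothetical name) walks whatever JSON structure it is given and looks for the voice name in any key or string value. The sample data is invented for illustration.

```python
import json
from typing import Any


def contains_voice(data: Any, voice_name: str) -> bool:
    """Return True if voice_name occurs in any key or string value of parsed JSON."""
    if isinstance(data, str):
        # Substring match, so longer display names that embed the ID also count.
        return voice_name in data
    if isinstance(data, dict):
        return any(
            contains_voice(key, voice_name) or contains_voice(value, voice_name)
            for key, value in data.items()
        )
    if isinstance(data, list):
        return any(contains_voice(item, voice_name) for item in data)
    return False  # numbers, booleans and null cannot hold a voice name


# Hypothetical sample; the real file's layout may differ.
sample = json.loads(
    '[{"ShortName": "zh-CN-XiaoxiaoNeural"}, {"ShortName": "en-US-AriaNeural"}]'
)
print(contains_voice(sample, "zh-CN-YunjianNeural"))
```

For a real check, load the file with `json.load(open(path, encoding="utf-8"))` and pass the result to the helper.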

alekssamos commented 2 years ago

I will update this file now. As for SSML, it currently doesn't work and causes an error, although it used to work. I'm talking about the key from the Edge browser that I use.

alekssamos commented 2 years ago

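In a console / shell:

pip install --upgrade msspeech
python3

Then, at the Python prompt, delete the old voices file from the installed package:

from msspeech import __path__ as p
from os import remove
from os.path import join
# p[0] is the directory where the msspeech package is installed
remove(join(p[0], "voices_list_plus.json"))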

I did not expect the voices to change so often, so I have not yet implemented automatic updates.
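The manual steps above can also be wrapped in a small script. This is only a sketch: the helper names `find_package_dir` and `remove_cached_file` are made up for illustration and are not part of msspeech. It locates the installed package without importing it and deletes the voices file only if it exists, so running it twice does not raise an error.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(package: str = "msspeech") -> Optional[Path]:
    """Return the install directory of a top-level package, or None if not installed."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def remove_cached_file(package_dir: Path, filename: str = "voices_list_plus.json") -> bool:
    """Delete filename from package_dir; return True if a file was removed."""
    target = package_dir / filename
    if target.is_file():
        target.unlink()
        return True
    return False


if __name__ == "__main__":
    pkg_dir = find_package_dir()
    if pkg_dir is None:
        print("msspeech is not installed")
    elif remove_cached_file(pkg_dir):
        print("Removed the old voices file from", pkg_dir)
    else:
        print("No voices file found in", pkg_dir)
```

Run it after `pip install --upgrade msspeech`, as in the steps above.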

alekssamos commented 2 years ago

As for SSML: I use the Edge key, which is free and unlimited, for more than 16,000 hours a day, and everything is fine with no problems. Requests that specified several voices or styles used to be voiced correctly, and I was about to add that feature to my library, but later they started returning the standard error 1007 and nothing else. Use this page: https://azure.microsoft.com/ru-ru/services/cognitive-services/text-to-speech/ Authorization there is not permanent and the key is refreshed after a while, so I won't use it in my library yet. I get the voices from there by analyzing the traffic. Try entering your SSML in the demo form there and, to record the result, use a screen recording program, for example.

alekssamos commented 2 years ago

I have already updated the voices in the Telegram bot.

pendave commented 2 years ago

Thanks, I have reinstalled msspeech for Python. I downloaded an .exe package based on the free Edge API that has SSML functions built in. I hope you can find a way to work out how it does this from its traffic and use it.

From https://www.52pojie.cn/forum.php?mod=viewthread&tid=1649323

will automatically play (first 40 words of text) the voice effect according to your selected language, voice, speaking style, character, speech speed and pitch.

Download, which will generate (all text) audio for a maximum of 10 minutes (roughly 2600 words) at a time based on the language, voice, speaking style, character, speed and pitch you selected.

The audio file will be stored in the same directory as the tool and named "voice.mp3"

You can click play to try the audio again, or stop playing immediately.

Supports Windows 10 and Windows 11. You can preview the result (automatically plays the first 40 words of the text) or download it (all of the text, up to 10 minutes or about 2600 words of audio at a time).

Fixed the problem of freezing on longer texts; added three common languages, speaking styles and characters.

Keep the network connection open and there will be no lag. Any problems that come up during use can be reported in the comments below.

It calls Microsoft's free TTS. I hope you like it.

Video tutorial: https://www.bilibili.com/video/BV1ct4y1H72L
Download address: https://wwt.lanzout.com/b02p9lkud
Password: 4h9h

I guess it's packaged from a Python script, because someone reported the error "Error loading Python DLL 'C:\Users\oc\AppData\Local\Temp_MEI44762\python39.dll'."

pendave commented 2 years ago

There's also another, better tool that lets you choose between the free API and the Edge API.

https://www.52pojie.cn/forum.php?mod=viewthread&tid=1638928

and a webpage version:

https://toolb.cn/textspeech

alekssamos commented 2 years ago

Added SSML support. Just pass the SSML as usual to the synthesize function or to the console command. Note that it differs from the documentation and is very limited.
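As a rough illustration of the kind of SSML string that could be passed in, the sketch below builds a minimal document with the standard library. The helper `build_ssml` is hypothetical and not part of msspeech, and which elements and attributes the Edge endpoint actually accepts is not stated in this thread. The example sticks to the basic speak, voice and prosody elements from the W3C SSML specification and escapes the text so that characters such as & do not break the XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str, rate: str = "+0%",
               pitch: str = "+0Hz", lang: str = "en-US") -> str:
    """Build a minimal SSML document; the set of tags accepted by the service is an assumption."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"  # escape &, < and > in the spoken text
        "</prosody></voice></speak>"
    )


ssml = build_ssml("Tom & Jerry", voice="zh-CN-YunjianNeural", lang="zh-CN")
ET.fromstring(ssml)  # raises xml.etree.ElementTree.ParseError if the XML is malformed
print(ssml)
```

The resulting string would then be passed to the synthesize function or the console command in place of plain text, as described above.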