rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
https://pypi.org/project/edge-tts/
GNU General Public License v3.0
4.21k stars 444 forks

Trying to understand EDGE FULLY - Where is its SOURCE? #208

Closed GithubStable closed 3 months ago

GithubStable commented 3 months ago

Hello. The only way I have heard about EDGE is through this repository, but I believe EDGE exists outside of it, right? I tried to search for ITS SOURCES, and all I was able to find was something about text-to-speech for HTML pages. I also found something about the voices being available in Azure, and the service maybe being paid?

I would like someone to explain to me how to set up EDGE without using this repository. I just want to understand it better and have better control of what's going on. Can someone give me some directions on where to start?

rany2 commented 3 months ago

Here is the edge-tts source: https://github.com/rany2/edge-tts/tree/master/src/edge_tts

GithubStable commented 3 months ago

Is this code written by you or provided by Microsoft? Thanks

rany2 commented 3 months ago

By me.

GithubStable commented 3 months ago

OK, one last question: if I were to try to replicate it (without copying your code), where would the starting point be? Does Microsoft have instructions somewhere? That's what I meant at first when I wrote "Source". Thanks^^ That was my last question.

rany2 commented 3 months ago

They definitely do not have instructions. It's the result of checking the network traffic from the Edge browser.

GithubStable commented 3 months ago

> They definitely do not have instructions. It's the result of checking the network traffic from the Edge browser.

Wow! You have no idea how useful your work is, by the way. I thank you sincerely, especially since it is used in another repo that is dear to me.

GithubStable commented 3 months ago

@rany2, I am curious though (sorry, I said the other was my last question, but I have one final, very important one). Do you know "rvc-tts-webui"? It uses Edge! When using it, you can apply another voice model, BUT it is "tailored" to be close to the voices from Edge (Ryan EN, etc.). My question is: do you know how that works? And do you know if it is possible to do the same with other technologies, the one that lets you change the emotions (and tone etc.)? There is another TTS tech, "Style-Bert-VITS2", that lets you do that, but I have no idea how to associate it with EDGE. I hope you could be of help if you checked how "rvc-tts-webui" works.

What do you think? Is that doable?

GithubStable commented 2 months ago

Please?

rany2 commented 2 months ago

> the one that lets you change the emotions (and tone etc.)? There is another TTS tech, "Style-Bert-VITS2", that lets you do that, but I have no idea how to associate it with EDGE. I hope you could be of help if you checked how "rvc-tts-webui" works.

It worked because at the time edge-tts had support for this, but Microsoft has since filtered out anything that uses "custom SSML." Only features the Edge browser supports can be used.
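As a rough illustration of what remains available, here is a minimal sketch assuming a recent edge-tts release in which `edge_tts.Communicate` accepts `rate`, `volume` and `pitch` as signed strings such as `"+10%"` or `"-5Hz"`. The helper names (`signed_percent`, `signed_hz`, `prosody_options`, `speak`) are made up for this example and are not part of edge-tts.

```python
import asyncio


def signed_percent(value: int) -> str:
    """Format an integer as a signed percentage string, e.g. 10 -> '+10%'."""
    return f"{value:+d}%"


def signed_hz(value: int) -> str:
    """Format an integer as a signed frequency offset string, e.g. -5 -> '-5Hz'."""
    return f"{value:+d}Hz"


def prosody_options(rate_pct: int = 0, volume_pct: int = 0, pitch_hz: int = 0) -> dict:
    """Build keyword arguments for the basic prosody settings (hypothetical helper)."""
    return {
        "rate": signed_percent(rate_pct),
        "volume": signed_percent(volume_pct),
        "pitch": signed_hz(pitch_hz),
    }


async def speak(text: str, voice: str, out_file: str, **prosody: int) -> None:
    """Synthesize text with edge-tts using only rate/volume/pitch adjustments.

    Assumption: Communicate takes rate, volume and pitch keyword arguments.
    Requires network access and the edge-tts package.
    """
    import edge_tts  # imported lazily so the helpers above work without it

    options = prosody_options(**prosody)
    await edge_tts.Communicate(text, voice, **options).save(out_file)
```

A call might look like `asyncio.run(speak("Hello there", "en-GB-RyanNeural", "hello.mp3", rate_pct=10, pitch_hz=-5))`. Emotion or style tags are not covered here, since those fall under the custom SSML that is described above as filtered out.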

rany2 commented 2 months ago

See this https://github.com/rany2/edge-tts/issues/58

GithubStable commented 2 months ago

> the one that lets you change the emotions (and tone etc.)? There is another TTS tech, "Style-Bert-VITS2", that lets you do that, but I have no idea how to associate it with EDGE. I hope you could be of help if you checked how "rvc-tts-webui" works.

> It worked because at the time edge-tts had support for this, but Microsoft has since filtered out anything that uses "custom SSML." Only features the Edge browser supports can be used.

Oh really? When did it stop? I am still able to "change" the "voice form" generated with "rvc-tts-webui" while relying on Edge, so I am surprised. You should check their code and see what's up; maybe you will find something interesting (the repo is archived but still available). Do you know of other text-to-speech services as powerful and GOOD as Edge? (Not paid ones like Eleven.., of course.)

rany2 commented 2 months ago

Wait, it works again? Are you sure of that?

GithubStable commented 2 months ago

Well, I am not sure it is SSML, the thing you mentioned, but yes, I can make voices related to Edge (with my own .pth). Are we sure we are talking about the same thing? I don't know a lot about SSML, and I don't know the Edge-related code in that repo. (Also, when did the thing you are talking about stop working?)

rany2 commented 2 months ago

Wait, I think I figured out how that works: it's not custom SSML. It seems to generate the audio as normal with edge-tts and then use some kind of RVC library to change the voice to something else. I think that's how it works.
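The two-step idea described above could be sketched roughly like this. Only `edge_tts.Communicate(...).save(...)` is a real edge-tts call; `convert_voice_placeholder` is a hypothetical stand-in that just copies the file, since the actual RVC call used by rvc-tts-webui is not shown in this thread. The function and file names are assumptions made for the example.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable

# Type aliases for the two pluggable steps of the pipeline.
Synth = Callable[[str, str, Path], Awaitable[None]]
Convert = Callable[[Path, Path], None]


async def synthesize_with_edge_tts(text: str, voice: str, out_path: Path) -> None:
    """Step 1: ordinary edge-tts synthesis (needs network access)."""
    import edge_tts  # imported lazily so the rest of the sketch runs without it

    await edge_tts.Communicate(text, voice).save(str(out_path))


def convert_voice_placeholder(src: Path, dst: Path) -> None:
    """Step 2 stand-in (hypothetical): a real voice-conversion model would go here.

    This placeholder only copies the audio bytes unchanged.
    """
    dst.write_bytes(src.read_bytes())


async def tts_then_convert(
    text: str,
    voice: str,
    workdir: Path,
    synth: Synth = synthesize_with_edge_tts,
    convert: Convert = convert_voice_placeholder,
) -> Path:
    """Generate speech first, then pass the resulting file to a converter."""
    workdir.mkdir(parents=True, exist_ok=True)
    base = workdir / "edge_tts_output.mp3"
    final = workdir / "converted_output.mp3"
    await synth(text, voice, base)  # plain TTS output
    convert(base, final)            # post-processing changes the voice
    return final
```

With network access, `asyncio.run(tts_then_convert("Hello", "en-GB-RyanNeural", Path("out")))` would write both files into `out/`. Keeping the converter as a plain callable means a different post-processing library could be swapped in without touching the edge-tts step.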

GithubStable commented 2 months ago

I see. Since you code with ease, do you think you could do the same adaptation but with the libraries of Style-Bert-VITS2, or similar ones (the ones with emotions, you know? I think there is another repo that does emotions as well, but I don't remember its name)?

PLEASE please please. :) (At least do you think it's possible? / doable?)

GithubStable commented 2 months ago

At least show me the part where you think a library is used to change the voice. Maybe I can teach myself to do what I am seeking (I tried asking on the Style repo but have not gotten a response yet). Seriously, do you think I can do it with other libraries? I would LOVE to change the emotions. Please tell me.

rany2 commented 2 months ago

This is obviously beyond the scope of this project, but someone could always create a new project that uses edge-tts and some library to do the same thing. I might research it a bit more to see how complicated it is to do this myself; it seems interesting.

GithubStable commented 2 months ago

Thank you. Will you keep me informed of your progress, please?