crow-translate / crow-translate

A simple and lightweight translator that allows you to translate and speak text using Google, Yandex, Bing, LibreTranslate and Lingva.
https://crow-translate.github.io/
GNU General Public License v3.0
1.81k stars 160 forks

Use windows local tts api #470

Open dentistfrankchen opened 2 years ago

dentistfrankchen commented 2 years ago

Describe the problem or limitation you are having

Now the program only uses Google as the TTS engine.

Describe the solution you'd like

Give the user an option to use the local Microsoft TTS.

Describe alternatives you've considered

Additional context

VolkMilit commented 2 years ago

> Now the program only uses Google as the TTS engine.

I don't understand what you mean by that. Crow uses Yandex TTS and will use Lingva TTS in the future. Bing and LibreTranslate don't support TTS at all.

If I understand correctly, you mean implementing this one? It is proprietary and paid (well, Google, Yandex and Bing are also proprietary, but at least their APIs are somewhat free), requires a full SDK to be installed, is not cross-platform, and requires Azure (if I understand correctly). I don't think Microsoft would allow us to use Cortana, and as this Wikipedia article states, Microsoft TTS is not Cortana.

WolvenSpectre commented 1 year ago

Not OP, but I think what OP was asking for is support for local OS TTS on Windows rather than cloud TTS: the generic SAPI 4 and 5 voices installed locally.

At least I think that is what they are asking.

VolkMilit commented 1 year ago

@WolvenSpectre, wait, what? So a Windows installation has all those voice models? If not, then that's a bad TTS, because it will spell out all unknown words letter by letter. Or it's still the cloud.

Microsoft's documentation is still as bad as I remember it from 2 years ago, but I can tinker with examples when I have some free time. Found an example, guess where. Not in the M$ documentation.

WolvenSpectre commented 1 year ago

That is what the custom dictionaries and pronunciations are for. I think the person doesn't just mean the accessibility TTS built into Windows, but the standard voice models it uses, which other software can utilize in its own way. For example, I install SAPI 5 voices and use a TTS program called Balabolka (I believe it is Russian for "Yada-Yada-Yada" but mostly means gossiping) that is quite good for a free program and has pronunciation and word dictionaries for several languages. In short, I think the user was asking for Crow to support SAPI 4 and SAPI 5 voices so they can use them if they have them installed. The dictionaries may not be complete, but if they are similar to those used in Crow, and if later development includes custom dictionaries that add words and pronunciations to them, that would be amazing. But for now, as I understand it, they just want to use their preferred voices, especially if they paid for higher quality SAPI 5 ones.

VolkMilit commented 1 year ago

@WolvenSpectre, but SAPI is the Microsoft Speech API. Am I missing something? You still need to use its own SDK and WinAPI to get it to work.

WolvenSpectre commented 1 year ago

SAPI, as I understand it, is like DirectShow, and voices created to work with it are named after the version of the SAPI standard they were made for. For example, the more natural-sounding human voices are made for SAPI 5 and are thus called SAPI 5 voices. I haven't worked on the API end, but from using TTS software, as I understand it, the dictionaries used with the voices under SAPI 4 and 5 are included, can be updated, and can be checked against multiple dictionaries, including user ones.

If I am understanding it, you are looking at it from the API level and we are talking from the voices for the API level down. At least if I am understanding you and I am still understanding OP's position.

But then again, I only know this stuff tangentially, from working with voices like the now out-of-date AT&T Natural Voices for SAPI 5, not from building on it at the application level like you would.

VolkMilit commented 1 year ago

> If I am understanding it, you are looking at it from the API level and we are talking from the voices for the API level down.

I just got confused, because apparently SAPI and Microsoft TTS are not the same thing. But the Microsoft TTS API header is actually called sapi.h. Microsoft at its finest.

Never mind, they actually are the same thing. I just got confused again because this thing exists.

Also, for potential contributors: I tried to compile the example from the M$ docs, and it requires ATL and MFC.
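For anyone who wants to experiment without ATL or MFC: as far as I can tell, the core SAPI 5 voice interface can be reached through plain COM by including only sapi.h, since the ATL dependency comes from the helper headers rather than from ISpVoice itself. The following is a minimal sketch under that assumption. `speakWithSapi` and `kHasSapi` are hypothetical names made up for illustration and are not part of Crow Translate; on non-Windows builds the function simply returns false.

```cpp
// Minimal sketch: speak a string through the default locally installed
// SAPI 5 voice using plain COM only (no ATL, no MFC, no sphelper.h).
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <sapi.h> // ISpVoice, CLSID_SpVoice; link with ole32 (and sapi, if your toolchain needs it)
constexpr bool kHasSapi = true;
#else
constexpr bool kHasSapi = false; // SAPI exists only on Windows
#endif

// Hypothetical helper, not part of Crow Translate.
// Returns true if the text was handed to SAPI and spoken successfully.
bool speakWithSapi(const std::wstring &text)
{
#ifdef _WIN32
    // COM must be initialized on the calling thread before creating the voice.
    if (FAILED(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED)))
        return false;

    ISpVoice *voice = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SpVoice, nullptr, CLSCTX_ALL,
                                  IID_ISpVoice, reinterpret_cast<void **>(&voice));
    if (SUCCEEDED(hr)) {
        // SPF_DEFAULT speaks synchronously: the call blocks until playback ends.
        hr = voice->Speak(text.c_str(), SPF_DEFAULT, nullptr);
        voice->Release();
    }

    CoUninitialize();
    return SUCCEEDED(hr);
#else
    (void)text; // unused on platforms without SAPI
    return false;
#endif
}
```

Calling `speakWithSapi(L"Hello from Crow")` on Windows should read the text with whichever voice is set as the system default. Since Crow is a Qt application, Qt's own QTextToSpeech class (which ships a SAPI backend on Windows) may be a more natural fit than calling COM directly; that is a suggestion, not something tested against Crow's code base.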

WolvenSpectre commented 1 year ago

No, it's understandable. Developers, especially ones like M$, try to make things clear and simple with EUX but practically go out of their way to complicate everything on the back end for development and deployment. They don't even standardize some of the terms for frameworks, and it has only gotten worse in the 2 decades since I studied IT. I halfway think they are going to start selling Azure Keyboards and Azure Screenwipes, given how many things they are remaking and labeling Azure instead of cloud.

Anyway, I am glad I could clear things up. Also, good on you for putting in the time and work to frame your code for contributors at a time when a lot of people groan about doing documentation in code.