respecting default speech preferences on macOS

aaronr7734 commented 1 year ago

Given how little we can configure TTS on the Mac (based on the examples provided), there should be an option that just pulls the default TTS settings for the macOS system voice. Is this possible? I'm using another project that uses this crate, and the TTS defaults to Samantha no matter what we do. Thanks!

ndarilek commented 1 year ago

I'd be grateful if you could figure it out and submit a PR. I know I have a couple of those to get to, but macOS is blind-developer-hostile enough that I likely won't have the bandwidth to investigate. I do generally want the Tts::default() return value to respect platform defaults where possible, but it's not always easy and not my main use case.

Thanks.

Enyium commented 11 months ago

I did some digging on Windows with RegistryChangesView to find out where the settings are stored. There seems to be an old and a new place. I don't know whether there are also APIs to retrieve this information.

"Default" below means the setting that applies when the registry value doesn't exist.

New Place (Associated With Microsoft Narrator)

(Before you change anything: if you have never changed these settings, please check whether the SpeechVoice value exists in the registry and, if so, what value it has.)

Run ms-settings:easeofaccess-narrator with Win+R to open the dialog. TTS settings there are reflected in these registry values:

HKEY_CURRENT_USER\SOFTWARE\Microsoft\Narrator\NoRoam\SpeechVoice — string, unknown default — tts::Voice::name() plus dash and human-readable language and region. The registry key path that tts::Voice::id() returns (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\<voice subkey>) contains mentions of this string. This seems to be the VoiceInformation.Description property. This crate reads other properties here (impl TryInto<Voice> for VoiceInformation).
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Narrator\SpeechSpeed — DWORD, default 10, range 0 to 20
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Narrator\SpeechPitch — DWORD, default 10, range 0 to 20
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Narrator\NoRoam\SpeechVolume — DWORD, default 100, range 0 to 100
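The defaults and ranges listed above could be applied roughly as follows. This is only a sketch: `NarratorSettings`, `from_raw` and `speed_fraction` are hypothetical names, not part of this crate, and the actual registry reads are left out (each `Option` stands for "the value did or did not exist"). Scaling the speed to a 0.0 to 1.0 fraction is an assumption for illustration.

```rust
/// Narrator TTS settings as observed in the registry (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct NarratorSettings {
    /// SpeechVoice; its default is unknown, so it stays optional.
    pub voice_description: Option<String>,
    /// SpeechSpeed, observed range 0 to 20.
    pub speed: u32,
    /// SpeechPitch, observed range 0 to 20.
    pub pitch: u32,
    /// SpeechVolume, observed range 0 to 100.
    pub volume: u32,
}

impl NarratorSettings {
    /// Builds settings from raw reads. `None` means the registry value is
    /// missing, in which case the observed default is used. Out-of-range
    /// values are clamped to the observed maximum.
    pub fn from_raw(
        voice: Option<String>,
        speed: Option<u32>,
        pitch: Option<u32>,
        volume: Option<u32>,
    ) -> Self {
        Self {
            voice_description: voice,
            speed: speed.unwrap_or(10).min(20),
            pitch: pitch.unwrap_or(10).min(20),
            volume: volume.unwrap_or(100).min(100),
        }
    }

    /// Speed scaled to 0.0..=1.0 (assumed mapping; 0.5 is the default).
    pub fn speed_fraction(&self) -> f32 {
        self.speed as f32 / 20.0
    }
}
```

With every value missing, `NarratorSettings::from_raw(None, None, None, None)` yields speed 10, pitch 10 and volume 100, matching the defaults listed above.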

Old Place (Smaller Amount of "Desktop" Voices)

(Before you change anything: if you have never changed these settings, including in third-party software, please check whether the two DefaultTokenId values exist in the registry and, if so, what values they have.)

Run "%WINDIR%\system32\rundll32.exe" shell32.dll,Control_RunDLL "%WINDIR%\system32\speech\speechux\sapi.cpl" with Win+R to open the dialog from the old control panel. The settings button for each voice was grayed out for me. Available TTS settings there are reflected in these registry values:

HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\Voices\DefaultTokenId — string, unknown default — The voice description behind the registry key that this value points to doesn't match the one from the new place, but you could find the matching voice by mapping from HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\<voice subkey> to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\<voice subkey> via the values LangDataPath and VoicePath, which are identical for matching voices.
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\Voices\DefaultTTSRate — i32 transmuted to DWORD, default 0, range -10 to 10
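The two entries above could be handled along these lines. This is a sketch under assumptions: `decode_sapi_rate`, `VoiceToken` and `find_onecore_match` are hypothetical names, the registry reads themselves are omitted, and the paths are compared as exact strings (a real implementation might need case-insensitive comparison).

```rust
/// Reinterprets the DWORD stored in DefaultTTSRate as a signed rate.
/// `None` means the value is missing, which maps to the default of 0.
pub fn decode_sapi_rate(raw: Option<u32>) -> i32 {
    // Casting u32 to i32 with `as` keeps the bit pattern,
    // so 0xFFFF_FFFB is read back as -5.
    let rate = raw.map(|dword| dword as i32).unwrap_or(0);
    // Keep the result inside the observed range of -10 to 10.
    rate.clamp(-10, 10)
}

/// The values of one voice token subkey that are needed for matching
/// (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct VoiceToken {
    pub subkey: String,
    pub lang_data_path: String,
    pub voice_path: String,
}

/// Finds the Speech_OneCore token whose LangDataPath and VoicePath are
/// identical to those of the given "desktop" (Speech) token.
pub fn find_onecore_match<'a>(
    desktop: &VoiceToken,
    onecore: &'a [VoiceToken],
) -> Option<&'a VoiceToken> {
    onecore.iter().find(|token| {
        token.lang_data_path == desktop.lang_data_path
            && token.voice_path == desktop.voice_path
    })
}
```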

I don't know whether pitch and volume are adjustable by any other means.

There's also HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech_OneCore\Voices\DefaultTokenId, but I don't know what makes it change.

We can't implement the Default trait if we want to pass on errors from getting the default values, because Default::default() returns Self and not a Result. You can't implement a trait from another crate for a type from another crate. Maybe provide an inherent, fallible Tts::default() factory method instead, if that's not frowned upon.
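To illustrate the idea of an inherent, fallible `default()`, here is a minimal sketch. `Speaker`, `DefaultsError`, `with_rate_source` and `read_platform_rate` are hypothetical stand-ins and not this crate's types; the platform lookup is a stub that always fails.

```rust
use std::fmt;

/// Error returned when platform defaults can't be read (hypothetical).
#[derive(Debug)]
pub struct DefaultsError(pub String);

impl fmt::Display for DefaultsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not read default speech settings: {}", self.0)
    }
}

impl std::error::Error for DefaultsError {}

/// Stand-in for a TTS handle (hypothetical).
#[derive(Debug)]
pub struct Speaker {
    pub rate: i32,
}

impl Speaker {
    /// Inherent method, not the `Default` trait: it may return a `Result`,
    /// which `Default::default()` cannot.
    pub fn default() -> Result<Self, DefaultsError> {
        Self::with_rate_source(read_platform_rate)
    }

    /// Same construction, but with an injectable source for the rate.
    pub fn with_rate_source(
        read_rate: impl Fn() -> Result<i32, DefaultsError>,
    ) -> Result<Self, DefaultsError> {
        Ok(Self { rate: read_rate()? })
    }
}

/// Stub for the platform-specific lookup; always fails in this sketch.
fn read_platform_rate() -> Result<i32, DefaultsError> {
    Err(DefaultsError("platform lookup not implemented".into()))
}
```

One trade-off of this design: Clippy's `should_implement_trait` lint flags inherent methods named `default`, so a different name may be preferable.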

You should definitely be able to get the default Voice without the whole Tts instantiation failing, because one may want to check whether the default voice speaks a certain language and potentially fall back on another voice. So, there should be a getter Tts::default_voice().

Then, there should also be:

Tts::default_rate()
Tts::default_pitch()
Tts::default_volume()

You could argue that these three functions, because of the simplicity of their return values, shouldn't be fallible, but fall back on a static value if reading the respective registry value fails.
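The split proposed above (a fallible voice getter, infallible numeric getters with a static fallback) might look like this. All names here (`PlatformDefaults`, `VoiceInfo`, `FALLBACK_RATE`, `pick_voice`) are hypothetical, and the fallback value of 1.0 is an arbitrary placeholder; only the rate getter is shown, since pitch and volume would follow the same pattern.

```rust
/// Minimal voice description (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct VoiceInfo {
    pub name: String,
    pub language: String,
}

/// Results of reading the platform defaults; each read may have failed.
pub struct PlatformDefaults {
    pub voice: Result<VoiceInfo, String>,
    pub rate: Result<f32, String>,
}

/// Static value used when the platform rate can't be read (placeholder).
pub const FALLBACK_RATE: f32 = 1.0;

impl PlatformDefaults {
    /// Fallible: the caller decides what to do without a default voice.
    pub fn default_voice(&self) -> Result<&VoiceInfo, &str> {
        self.voice.as_ref().map_err(|e| e.as_str())
    }

    /// Infallible: falls back on a static value if the read failed.
    pub fn default_rate(&self) -> f32 {
        match &self.rate {
            Ok(rate) => *rate,
            Err(_) => FALLBACK_RATE,
        }
    }
}

/// Uses the default voice if it speaks `language`; otherwise falls back on
/// the first installed voice that does.
pub fn pick_voice<'a>(
    defaults: &'a PlatformDefaults,
    installed: &'a [VoiceInfo],
    language: &str,
) -> Option<&'a VoiceInfo> {
    match defaults.default_voice() {
        Ok(voice) if voice.language == language => Some(voice),
        _ => installed.iter().find(|voice| voice.language == language),
    }
}
```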

BTW: Getters in Rust shouldn't start with get_, but omit the prefix altogether.

Enyium commented 11 months ago

I wasn't aware of Tts::default(), Tts::normal_rate(), Tts::normal_pitch() and Tts::normal_volume(). 🙄 I don't know how they would relate to the default_...() functions I talked about. But would it actually be best to retrieve the defaults in Tts::new(), which conveniently is already fallible? That would make it impossible to get a Tts instance if getting the defaults fails, though.

ndarilek / tts-rs

respecting default speech preferences on macOS #46
