edrlab / thorium-reader

A cross-platform desktop reading app, based on the Readium Desktop toolkit
https://www.edrlab.org/software/thorium-reader/
BSD 3-Clause "New" or "Revised" License

Choice of a custom voice for TTS #1151

Closed llemeurfr closed 3 years ago

llemeurfr commented 4 years ago

Some users have installed specific voices on their Windows system and would like to use them when using Thorium in TTS mode. But there is no UX allowing such a choice.

"I cannot change the narrator voice it uses to the one default on my system (Microsoft Hazel). The read out loud voice in Thorium is very grating and robotic. I would appreciate it if your program can utilize more TTS voices similar to the new Microsoft Edge browser and Adobe Reader."

"I have bought a couple of IVONA and Nuance voices (more natural sounding) These voices can be used directly in windows, however I cant change them to be used in your Thorium client. "

This issue is different from #1130 but still related. The 1130 issue is about the language selection of the system voice vs the language indicated in the publication, and the possible language override (e.g. from fr-FR to fr-CA).

danielweck commented 4 years ago

Choosing "preferred" TTS voices is definitely a very desirable feature.

The other issue (#1130) highlights the fact that synthetic speech voices support specific languages, so the user-selected voice(s) should not be naively applied to the whole publication during read aloud. The fr-FR vs. fr-CA scenario is a case in point.

Granted, multilingual publications are less common than single-language ones, but Thorium's current implementation delegates the responsibility of selecting voices suitable for specific languages to the underlying TTS engine. For example, with HTML documents whose markup combines different languages (i.e. different xml:lang values on various paragraphs / sentences / words nested deep inside the root html element, which usually specifies the global / default locale), the TTS implementation in Thorium passes the language information to the underlying TTS engine, without forcing a particular voice.

If / when support for user-selected voice(s) is implemented in Thorium, we must make sure not to tell the TTS engine to use a voice that lacks support for the actual text locale, otherwise the output will sound like garbage, or have the incorrect accent / locale (e.g. Canadian French vs. mainland France).
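To make the constraint above concrete, here is a minimal sketch of the kind of check I have in mind. It is not Thorium's actual code: `VoiceLike`, `voiceSupportsLocale` and `pickPreferredVoice` are hypothetical names, and the matching rule (exact locale match, or any regional voice when the text only declares a primary language such as "fr") is an assumption that could well be refined. The idea is simply to return a user-preferred voice only when it supports the text locale, and otherwise return nothing so that the TTS engine keeps choosing by language as it does today.

```typescript
// Hypothetical minimal shape: the Web Speech API's SpeechSynthesisVoice
// exposes `name` and `lang`, so such objects would satisfy this interface.
interface VoiceLike {
  name: string;
  lang: string;
}

// Normalize a language tag for comparison ("fr_CA" -> "fr-ca").
function normalizeTag(tag: string): string {
  return tag.trim().replace(/_/g, "-").toLowerCase();
}

// Assumed rule: exact locale match, or (when the text only declares a
// primary language such as "fr") any voice of that primary language.
function voiceSupportsLocale(voice: VoiceLike, textLang: string): boolean {
  const v = normalizeTag(voice.lang);
  const t = normalizeTag(textLang);
  if (t === "") {
    return false; // no declared language: do not force any voice
  }
  if (v === t) {
    return true;
  }
  return t.indexOf("-") === -1 && v.split("-")[0] === t;
}

// Returns the first user-preferred voice that supports the text locale,
// or undefined, meaning "do not force a voice, let the engine decide".
function pickPreferredVoice(
  preferredVoiceNames: string[],
  installedVoices: VoiceLike[],
  textLang: string,
): VoiceLike | undefined {
  for (const name of preferredVoiceNames) {
    const voice = installedVoices.find((v) => v.name === name);
    if (voice && voiceSupportsLocale(voice, textLang)) {
      return voice;
    }
  }
  return undefined;
}
```

In an Electron renderer, the installed list could come from `window.speechSynthesis.getVoices()`. The utterance's `lang` would still always be set from the markup, and `voice` would only be set when `pickPreferredVoice` returns a match. That way a preferred fr-FR voice is never forced onto an fr-CA passage.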

danielweck commented 3 years ago

Fixed via https://github.com/edrlab/thorium-reader/commit/ba01cb6bd74b2dadf37667bc1eb27c30f4cb13d1 (and https://github.com/readium/r2-navigator-js/commit/946265d4adb91126cd2937205453a756cb5b9962 in the r2-navigator-js component)

Please read some explanations here: https://github.com/edrlab/thorium-reader/issues/1130#issuecomment-817189211