Open Minnemann-dzblesen opened 1 year ago
Hello, Thorium doesn't inject additional prompts; the TTS read-aloud experience is based purely on authored alternate text / accessible descriptions. If Thorium injected speech prompts at runtime, would they need to match the language of the user, of the publication metadata, or of the text content itself? I assume the user locale, so there could be a discrepancy with the authored language.
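To make the language question concrete, here is a minimal sketch of what a runtime-injected image prompt could look like. It is not Thorium code: every name in it (`SpeechContext`, `describeImage`, `IMAGE_PROMPTS`, the `source` option) is hypothetical, and the tiny prompt table only stands in for real localization. The idea is that the prompt and the authored alt text are kept as separate segments, each with its own language tag, so a discrepancy between user locale and authored language does not force one voice to read both.

```typescript
// Hypothetical sketch only: none of these names exist in Thorium.
type PromptLanguageSource = "user" | "publication" | "content";

interface SpeechContext {
  userLocale: string;          // assumed: the reader's UI locale, e.g. "de-DE"
  publicationLanguage: string; // assumed: language from the publication metadata
  contentLanguage?: string;    // assumed: language declared on the element itself
}

interface SpeechSegment {
  text: string;
  lang: string;
}

// Illustrative prompt table; a real one would need proper localization.
const IMAGE_PROMPTS: Record<string, string> = { en: "Image", de: "Bild", fr: "Image" };
const FALLBACK_LANG = "en";
const FALLBACK_PROMPT = "Image";

// "de-DE" -> "de"
function primarySubtag(tag: string): string {
  const dash = tag.indexOf("-");
  return (dash === -1 ? tag : tag.slice(0, dash)).toLowerCase();
}

function describeImage(
  alt: string,
  ctx: SpeechContext,
  source: PromptLanguageSource = "user",
): SpeechSegment[] {
  const authoredLang = ctx.contentLanguage ?? ctx.publicationLanguage;
  const candidates: Record<PromptLanguageSource, string> = {
    user: ctx.userLocale,
    publication: ctx.publicationLanguage,
    content: authoredLang,
  };
  const wanted = candidates[source];
  const localized = IMAGE_PROMPTS[primarySubtag(wanted)];

  // If no prompt exists for the wanted language, fall back to English and
  // tag the segment as English so the voice matches the words being spoken.
  const promptSegment: SpeechSegment =
    localized !== undefined
      ? { text: localized, lang: wanted }
      : { text: FALLBACK_PROMPT, lang: FALLBACK_LANG };

  // The authored alt text always keeps its authored language.
  return [promptSegment, { text: alt, lang: authoredLang }];
}
```

With a German user locale and an English publication, `describeImage("A red fox", { userLocale: "de-DE", publicationLanguage: "en" })` yields a "Bild" segment tagged `de-DE` followed by the alt text tagged `en`. Whether switching voices mid-announcement is actually a good experience is exactly the kind of open question raised above.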
Team discussions so far on the subject:
The unanswered underlying question is: "should we consider the Thorium TTS feature an AT?"
On the positive side, it would allow for strong highlighting of semantics (meaning that it makes sense for producers to add them) and would certainly serve a lot of users.
Still, the effort to design, add and maintain this has to be considered, and dedicated funding would have to be found.
Page numbers should also be considered :) See related issue #1974 and discussions #1951 and #1799.
I remember from narrating books as a volunteer that we had a whole set of rules, including how to handle special elements (like authors, page numbers, images, footnotes). A user who relies on audio alone will easily get confused without those prompts. Depending on the type of user, the desirable UX may differ. I suggest drafting user requirements for several user types and deciding how to address them without introducing too much complexity.
Many thanks for the numerous replies, the good discussion and the thoughts on this, and, even if there is no immediate solution, for the momentum that is building here. We as a German institution, along with many other European institutions, often recommend Thorium to publishers as the primary tool (to check their EPUBs in view of the approaching EAA), so a comprehensible, high-quality read-aloud experience is important to us, and I am confident that will come out of this process.
During our manual accessibility testing of an EPUB, we encountered the following problem: the alt text of an image is announced correctly by Thorium's integrated read-aloud feature. However, it does NOT announce that it is an image. This might be quite confusing for readers with impairments. A blind reader should be able to recognize that the text being read describes an image.
When NVDA or JAWS is used together with Thorium (which is certainly not the intended setup), "image" (or "image end") is announced before or after the alt text in each case. The screen reader built into Windows, and VoiceOver with Apple Books, also announce that it is an image. We tested with Thorium 2.2.0.
Or are we doing something wrong? Thanks.