edrlab / thorium-reader

A cross platform desktop reading app, based on the Readium Desktop toolkit
https://www.edrlab.org/software/thorium-reader/
BSD 3-Clause "New" or "Revised" License

Speech synthesis does not read the content of pop-ups #2099

Open CLnordcompo opened 2 months ago

CLnordcompo commented 2 months ago

Speech synthesis does not read the content of pop-ups (notes and description texts placed in pop-ups).

https://github.com/edrlab/thorium-reader/assets/163836608/f03e4e74-09ee-49c4-8ef5-d8e19e0392ef

danielweck commented 2 months ago

Thank you for reporting this problem. This is both a feature and a bug. It is a feature because our code explicitly excludes the contents of the popup footnote dialog during TTS readaloud (the dialog is considered UI, not publication content). It is also a bug, because the footnote contents should be announced after the user activates the hyperlink, and TTS readaloud should resume at the calling site once the popup dialog is closed.
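To make the expected behaviour concrete, here is a minimal TypeScript sketch. It is not Thorium's code: the `Utterance` and `ReadAloudState` shapes and the `nextUtterance` / `insertNote` helpers are hypothetical names, and the sketch assumes readaloud can be modelled as an ordered list of utterances with a cursor. When the user activates a note reference, the note's text is spliced in at the cursor, so it is spoken next and playback then continues from the calling site.

```typescript
// Hypothetical model of a readaloud session (assumption, not Thorium's API).
interface Utterance {
  text: string;
  fromNote: boolean; // true when the text comes from a footnote body
}

interface ReadAloudState {
  utterances: Utterance[];
  cursor: number; // index of the next utterance to speak
}

// Returns the next utterance to speak and advances the cursor,
// or returns undefined once the end of the list is reached.
function nextUtterance(state: ReadAloudState): Utterance | undefined {
  const utterance = state.utterances[state.cursor];
  if (utterance !== undefined) {
    state.cursor += 1;
  }
  return utterance;
}

// Called when the user activates a note reference: the note's text is
// inserted at the cursor, i.e. right after the utterance that was just
// returned, so the original sequence resumes once the note has been spoken.
function insertNote(state: ReadAloudState, noteTexts: string[]): void {
  const noteUtterances: Utterance[] = noteTexts.map((text) => ({
    text,
    fromNote: true,
  }));
  state.utterances.splice(state.cursor, 0, ...noteUtterances);
}
```

This leaves open how the popup dialog itself interacts with the speech engine, which is the harder part of the problem.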

The current status is that this feature is quite simply not supported: we have not started implementing it yet. Off the top of my head, I can see some obvious technical challenges but also some UX questions. For example, Thorium offers a "simple view" for TTS utterances that is displayed on top of the publication's content and offers no interaction with footnote references at all (in other words, the user cannot even navigate to linked footnotes).

Final thought: one temporary solution could be to disable popup footnotes during TTS readaloud, as this would cause hyperlinks to behave normally (i.e. jump to the note contents in the current document or in another "chapter"). The user would then be able to resume playback at the location of the note reference by using Thorium's navigation history or authored backlinks. Still, even in this case there is a UX problem: TTS readaloud cannot identify the scope / end of the targeted note content (typically, speech utterances are streamed automatically from the beginning to the end of the document unless the user interrupts the process).
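The temporary workaround amounts to a small decision rule, sketched below in TypeScript. The `LinkContext` fields and the `resolveLinkAction` function are hypothetical names for illustration only, and the sketch assumes the reader already knows whether the activated link is a note reference. It does not address the open problem of where the note content ends.

```typescript
// What happens when the user activates a hyperlink in the publication.
type LinkAction = "open-popup" | "follow-link";

// Hypothetical inputs to the decision (assumption, not Thorium's settings model).
interface LinkContext {
  isNoteReference: boolean; // the link points at a footnote or endnote
  popupFootnotesEnabled: boolean; // user preference for popup footnotes
  ttsActive: boolean; // TTS readaloud is currently running
}

function resolveLinkAction(ctx: LinkContext): LinkAction {
  // Ordinary links are never shown in a popup.
  if (!ctx.isNoteReference) {
    return "follow-link";
  }
  // Workaround: while TTS is active, popups are bypassed so that the
  // link jumps to the note content like any other hyperlink.
  if (ctx.ttsActive) {
    return "follow-link";
  }
  return ctx.popupFootnotesEnabled ? "open-popup" : "follow-link";
}
```

With this rule, a note reference opens a popup only when popups are enabled and readaloud is not running.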

danielweck commented 1 month ago

Related issue by @LaZay

https://github.com/edrlab/thorium-reader/issues/1923

This is only a nice-to-have.

So far, footnotes are not read aloud when TTS is activated. It would be nice to let people choose their own TTS preferences for this (as assistive technologies do for blind users).

- epub:type=noteref, footnotes, footnote, endnotes, endnote
- role=doc-noteref, doc-footnote, doc-endnotes, doc-endnote
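
As a rough illustration of how the attribute values listed above could drive such a preference, here is a TypeScript sketch. `classifyNote`, `shouldSpeak` and `TtsNotePreferences` are hypothetical names, not Thorium APIs. The sketch relies on the fact that both `epub:type` and `role` hold space-separated tokens, and it takes the raw attribute strings as input so that it does not depend on a DOM.

```typescript
// Tokens taken from the list above, split into references and note bodies.
const NOTE_REF_TOKENS = new Set(["noteref", "doc-noteref"]);
const NOTE_BODY_TOKENS = new Set([
  "footnotes", "footnote", "endnotes", "endnote",
  "doc-footnote", "doc-endnotes", "doc-endnote",
]);

type NoteKind = "reference" | "body" | "none";

// Classifies an element from its epub:type and role attribute values
// (null when the attribute is absent).
function classifyNote(epubType: string | null, role: string | null): NoteKind {
  const tokens = `${epubType ?? ""} ${role ?? ""}`
    .split(/\s+/)
    .filter((token) => token.length > 0);
  if (tokens.some((token) => NOTE_REF_TOKENS.has(token))) {
    return "reference";
  }
  if (tokens.some((token) => NOTE_BODY_TOKENS.has(token))) {
    return "body";
  }
  return "none";
}

// Hypothetical user preference: whether note bodies are read aloud.
interface TtsNotePreferences {
  readNoteBodies: boolean;
}

// Note bodies are spoken only when the user opted in; everything else
// (including the note reference itself) is always spoken.
function shouldSpeak(kind: NoteKind, prefs: TtsNotePreferences): boolean {
  return kind !== "body" || prefs.readNoteBodies;
}
```

For example, `classifyNote("noteref", null)` yields `"reference"`, and `shouldSpeak("body", { readNoteBodies: false })` yields `false`.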