Closed spudthebud closed 2 years ago
Hello, thank you for your detailed description.
The key problem with screen reader / web browser interactions (remember, Thorium's graphical user interface is all HTML, CSS, JavaScript, etc.) is that the screen reader's internal state, crucially the reading cursor, is not reflected in any Web API that an application like Thorium could leverage to track reading progress, create bookmarks, insert annotation highlights, etc. We therefore have to rely on basic keyboard tab navigation (i.e. actual focus events) as well as other pointing device events (e.g. mouse click) in order to infer the user's reading location. This set of techniques is applicable to sighted and AT users, but it is suboptimal because we are missing granular interaction information.
The problem is particularly noticeable with screen readers because text selections and the character-level cursor are not in sync with the corresponding Web API (i.e. HTML DOM Ranges). Instead, modern screen readers operate in a world of "virtual screen buffers" (I use these terms loosely as I realise they have a precise meaning in some screen reader implementations) which is highly optimised for the task at hand, but which unfortunately leaves JavaScript / HTML programmers in the dark (pun intended). The modern DOM accessibility tree exposed by webviews solved many authoring problems, but mostly to the benefit of the end user / screen reader / AT user. As reading system implementors, we still have to do a lot of guesswork in order to figure out what the screen reader is doing / where it is reading.
Our tests show that there are more visual inconsistencies with the reading cursor in paginated content (CSS columns) than in typical scrolling web pages. So generally speaking we recommend the latter presentation style to screen reader users. We also recommend emulating mouse clicks on the HTML text that users wish to mark as their "current reading location". This is necessary so that Thorium can report the correct DOM element within the page spread / scrolling viewport, which in turn allows us to determine the current page number (or to be precise, the nearest preceding element authored as an EPUB page break). This also allows us to build the path of headings that lead to the current location, which in turn can be correlated with the authored EPUB table of contents.
To conclude, I note your point about losing the reference in the HTML publication document when moving in and out of some GUI control. In Thorium we solved this problem by injecting a custom hyperlink labeled "underscore" at the very beginning of rendered DOMs. This is tabbable directly from the main landmark in the application GUI. This special link points to the current reading location Thorium is aware of (note: not the screen reader's own cursor which we know little about, but the document position last accessed by the user via TOC or bookmark navigation, mouse click on document, text selection, etc.).
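To make the "nearest preceding page break" and "path of headings" ideas concrete, here is a minimal sketch. It is not Thorium's actual code: it assumes the rendered document has already been flattened into an array in document order, and the `DocNode` type and both function names are hypothetical.

```typescript
// Hypothetical flattened view of a rendered document, in document order.
type DocNode =
  | { kind: "pagebreak"; label: string }
  | { kind: "heading"; level: number; text: string }
  | { kind: "text"; text: string };

// Walk backwards from the reading location to the nearest authored page break.
function nearestPrecedingPageBreak(nodes: DocNode[], index: number): string | undefined {
  for (let i = Math.min(index, nodes.length - 1); i >= 0; i--) {
    const node = nodes[i];
    if (node && node.kind === "pagebreak") {
      return node.label;
    }
  }
  return undefined; // no page break authored before this location
}

// Build the trail of headings leading to the reading location: walking
// backwards, keep each heading that is shallower than the last one kept.
function headingTrail(nodes: DocNode[], index: number): string[] {
  const trail: string[] = [];
  let shallowestLevel = Infinity;
  for (let i = Math.min(index, nodes.length - 1); i >= 0; i--) {
    const node = nodes[i];
    if (node && node.kind === "heading" && node.level < shallowestLevel) {
      trail.unshift(node.text);
      shallowestLevel = node.level;
    }
  }
  return trail;
}
```

The `index` argument stands for the reading location inferred from a focus event, mouse click or text selection. If that location is wrong, both results are wrong too, which is why a "real" click or selection matters.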
Hi all, I have a few suggestions that may help. I hope this is of assistance.
Thank you for all the work to make this more accessible, especially to screen reader users.
Ahh, and for the screen reader users, I also have a suggestion, which is to use your screen reader's capabilities to leave a virtual place marker. That way, no matter where you are in the virtual document, you can always return right there by hitting a simple key. This doesn't mean the solutions I suggested above should not be implemented, of course, just that there does exist a recovery path for loss of focus.
Hello Sina, thank you for your feedback.
1) Thorium is a cross-platform application that doesn't have a traditional Windows "status bar" GUI landmark. Note that the "current page number" (if any) is already displayed in the "goto page" section as well as in the "where am I" popup, but there is a level of indirection to get there.
2) We can certainly add an ARIA live region (assertive) activated upon a keyboard shortcut, though I must say finding a sequence of keystrokes that works consistently in Windows, Linux, and Mac with JAWS, Narrator, NVDA, VoiceOver is not quite as easy as it seems.
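A rough sketch of the ARIA live region idea, for illustration only. The `LiveRegionLike` interface and the two helper names are assumptions made so the snippet runs without a browser; a real `HTMLElement` has the same `setAttribute` and `textContent` members. `aria-live` and `aria-atomic` are standard ARIA attributes.

```typescript
// Minimal shape of the element we need; a real HTMLElement satisfies it.
interface LiveRegionLike {
  setAttribute(name: string, value: string): void;
  textContent: string | null;
}

// Mark an element as an assertive live region (hypothetical helper).
function setupLiveRegion(region: LiveRegionLike): void {
  region.setAttribute("aria-live", "assertive");
  region.setAttribute("aria-atomic", "true");
}

// Replace the region's text. Screen readers that support live regions
// should speak the new text without moving focus or the reading cursor.
function announce(region: LiveRegionLike, message: string): void {
  region.textContent = message;
}

// In a browser this could be wired up roughly as:
//   const region = document.createElement("div");
//   setupLiveRegion(region);
//   document.body.appendChild(region);
//   announce(region, "Page 15");
```

Some screen readers do not re-announce a region whose text is set to the same value twice in a row, so a real implementation may need to clear the region before setting it again.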
Some quick responses for you.
Not all screen readers, but that's ok, right? We shouldn't let perfect be the enemy of the good here and slowly work towards making sure it works for everyone.
Same thoughts on keystrokes. Then don't make them consistent. At a certain point, everyone who encounters this, at least across our dozens of clients, ends up putting in a customization dialog for keystrokes because it is a losing battle to try to find cross-platform keystrokes, especially when AT is thrown into the mix. Let users assign their keystrokes and so many worlds open up, e.g. mapping to one-handed keyboard devices or alternative input devices, and voice macros can map more easily, etc.
RE your note about where page numbers are available, that's only helpful if no focus change is required of course.
Thanks so much!
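The suggestion above to let users assign their own keystrokes could be sketched as follows. The binding format ("Ctrl+Shift+K"), the command names and all function names are made up for illustration and do not describe how Thorium stores shortcuts. The `KeyStroke` fields mirror the standard `KeyboardEvent` properties of the same names.

```typescript
// Only the KeyboardEvent fields used for matching; a real KeyboardEvent satisfies this.
interface KeyStroke {
  key: string;
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

// Parse a user-entered binding such as "Ctrl+Shift+K" (hypothetical format).
function parseBinding(binding: string): KeyStroke {
  const parts = binding.split("+").map((part) => part.trim().toLowerCase());
  const key = parts[parts.length - 1] || "";
  const modifiers = parts.slice(0, -1);
  const has = (name: string): boolean => modifiers.indexOf(name) !== -1;
  return {
    key,
    ctrlKey: has("ctrl"),
    shiftKey: has("shift"),
    altKey: has("alt"),
    metaKey: has("meta"),
  };
}

// Compare a parsed binding with an incoming key event (case-insensitive key).
function matchesBinding(binding: KeyStroke, event: KeyStroke): boolean {
  return (
    binding.key === event.key.toLowerCase() &&
    binding.ctrlKey === event.ctrlKey &&
    binding.shiftKey === event.shiftKey &&
    binding.altKey === event.altKey &&
    binding.metaKey === event.metaKey
  );
}

// Look up which command, if any, the user has bound to this key event.
// `bindings` maps a command name to the user's chosen binding string.
function commandFor(bindings: Record<string, string>, event: KeyStroke): string | undefined {
  for (const command of Object.keys(bindings)) {
    const binding = bindings[command];
    if (binding !== undefined && matchesBinding(parseBinding(binding), event)) {
      return command;
    }
  }
  return undefined;
}
```

Because the map is plain data, it can be edited from a customization dialog and saved per user, so nobody has to agree on a single cross-platform key combination.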
Thanks for taking the issue so seriously. In the academic world, the user task of getting the current page number for quoting goes from 1st year undergrad papers to doctoral dissertations and into academic careers. If there isn't a truly usable solution, then a disadvantage will exist in universities and colleges. By usable, I follow the ISO definition of usability, where completing a user task is measured in terms of efficiency, effectiveness, and user satisfaction.
Sina's ARIA live region idea sounds interesting. It could be fast and effective. I imagine there is a way of trying it out? If I can chip in in any way, I will. An academic publisher gave me an EPUB with pagination embedded.
Hi all,
I am currently writing a JAWS script that gets the page number automatically by opening the page dialog, extracting the number, closing the dialog and going back to the previous position. I think I'm on the right track, but I need some more time to finish the scripts. I just wanted to let you know.
Udo
The JAWS script sounds useful, but just to say that the proper way to solve this is within the app, so it can be solved for everyone. Hoping for some actionable steps forward now that some real solutions have been identified in this thread.
I'm again offering to help with any sort of preliminary testing with Sina's suggestion if the dev team will give it a try. It would be a very helpful feature for Thorium readers in university settings. And thank you to Udo also. Clearly this is important.
Hello all, thank you for taking the time to file this feature request, and for your suggestions on how to implement it.
We are going to ship an experimental proposal in Thorium 2.0, which features the new CTRL SHIFT k keyboard shortcut to force the screen reader to speak the equivalent of the information available via the popup modal dialog which opens when invoking the CTRL SHIFT i keyboard shortcut, but without the disadvantage of losing document focus.
The information is re-ordered though, in order to first speak the current "page number", i.e. the nearest preceding authored page break in the HTML document, relative to the current reading location which is typically the parent element of selected text or the text cursor, as designated via mouse click or screen reader equivalent (this mechanism depends on NVDA, VoiceOver, JAWS, etc.). The current reading location of course also corresponds to the last linked heading from the table of contents, or opened bookmark, or targeted "page number" from the EPUB page list in the navigation panel (goto page feature).
So, the spoken information starts with the page number, then progression data such as percentage within the current HTML document or audio book, and index of spine item in the reading order. Lastly, a trail of document headings is spoken (just as in the modal popup dialog opened via CTRL i, or CTRL SHIFT i to force keyboard focus into the "where am I" progression data).
I hope this helps. Please let me know if this proposed user experience is moving in the right direction. Thank you.
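The ordering of the spoken information described above can be pictured as a small formatting function. This is an illustrative sketch only: the `WhereAmI` shape, the function name and the exact wording of each phrase are assumptions, not what Thorium 2.0 actually speaks.

```typescript
// Hypothetical shape of the "where am I" data.
interface WhereAmI {
  pageLabel?: string;        // nearest preceding authored page break, if any
  percentInDocument: number; // progression within the current HTML document
  spineIndex: number;        // zero-based index in the reading order
  spineLength: number;       // number of items in the reading order
  headings: string[];        // trail of headings leading to the location
}

// Page number first, then progression data, then the heading trail.
function formatAnnouncement(info: WhereAmI): string {
  const parts: string[] = [];
  if (info.pageLabel !== undefined) {
    parts.push(`Page ${info.pageLabel}`);
  }
  parts.push(`${Math.round(info.percentInDocument)}% of current document`);
  parts.push(`Item ${info.spineIndex + 1} of ${info.spineLength}`);
  if (info.headings.length > 0) {
    parts.push(info.headings.join(", "));
  }
  return parts.join(". ");
}
```

Putting the page number first means a user who only needs it for a citation can interrupt speech as soon as it has been read.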
Thanks for moving it forward, Daniel. Sina, Udo, any thoughts? I can chip in with some testing and provide feedback when 2.0 is released. Should this ticket be closed before the testing?
Hello, the GitHub issue is closed via a code commit in the develop branch, as per our development process (the master branch will be updated and tagged with the official 2.0 release).
Feel free to continue the discussion here and we can re-open the issue if there are remaining problems. Sometimes it is preferable to open a new issue, for example to describe suggested improvements to an existing feature.
You do not have to wait until the official 2.0 release to try out the new features. You can download Windows, Linux and macOS installers (automated builds) from the GitHub release page: https://github.com/edrlab/thorium-reader/releases
For example, the Windows app is available at the following link: https://github.com/edrlab/thorium-reader/releases/tag/latest-windows
I've been doing some testing of Thorium 2.0 with VoiceOver on Mac. Shift + Control + K causes a page number to be announced, along with some other information. It keeps the VoiceOver cursor in the same spot, which is great. However, the page number that is being announced doesn't quite match the printed book. It could be the ePub itself, I suppose. I'll have to do some more testing, including with JAWS.
Hello @spudthebud, remember that the screen reader's accessibility cursor is completely proprietary and disconnected from the actual keyboard focus or mouse click interaction inside the HTML document (which we rely on to know / estimate what the current reading location is, during human interaction). In other words, the screen reader may be speaking text that is out of view (relative to the visible viewport, either horizontal paginated spread or vertical scroll extent), or text that is in view but has no HTML element focus/selection. In the absence of focus/selection, Thorium estimates the reading location by looking at the topmost text in the current visible viewport (typically, that's the top-left corner of the paginated spread / scrolling page, in the case of Left-To-Right / Top-To-Bottom languages). So in order to work around this inherent limitation in current screen reader technology, NVDA / JAWS / etc. users must use specific methods to trigger some "real" text selection / focus in the HTML document (i.e. not the screen reader's proprietary internal buffer handling).
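A simplified sketch of the fallback just described, to show why the announced page can differ from what the screen reader is speaking. It assumes text blocks are available as rectangles in viewport coordinates. The `TextBlock` and `Viewport` types and the function name are hypothetical, and the top-then-left ordering is a crude stand-in for Left-To-Right / Top-To-Bottom layouts only.

```typescript
// Hypothetical rectangle for a block of text, in viewport coordinates.
interface TextBlock {
  id: string;
  top: number;
  left: number;
  bottom: number;
  right: number;
}

interface Viewport {
  top: number;
  left: number;
  bottom: number;
  right: number;
}

// Prefer the element the user last focused, clicked or selected.
// Otherwise fall back to the topmost (then leftmost) visible text block.
function estimateReadingLocation(
  lastInteractedId: string | undefined,
  blocks: TextBlock[],
  viewport: Viewport,
): string | undefined {
  if (lastInteractedId !== undefined) {
    return lastInteractedId;
  }
  const visible = blocks.filter(
    (block) =>
      block.bottom > viewport.top &&
      block.top < viewport.bottom &&
      block.right > viewport.left &&
      block.left < viewport.right,
  );
  visible.sort((a, b) => a.top - b.top || a.left - b.left);
  const first = visible[0];
  return first ? first.id : undefined;
}
```

If the screen reader is speaking text near the bottom of the viewport and there has been no real focus or selection, this kind of fallback reports the block at the top instead, which may sit on an earlier page.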
I don't think the page number issue is usably resolved with Thorium 2.0, and so the issue should be reopened. I think I found an obvious solution.
First, I've been discussing this Thorium and page number issue with a blind researcher in a university. This is the message being relayed to this discussion:
“For students, researchers and scholars with print disabilities, every accessibility barrier we encounter slows us down and takes time away from the most valuable tasks we should be engaged in. Having a robust feature that provides us with page numbers would not only enable us to keep up with the referencing conventions of our disciplines, but would help to make citation tasks faster and more efficient. In addition, having access to page numbers quickly and easily lets us check the citations used by others, so we can focus our energies on keeping up with scholarly sources in our fields.”
It occurred to me, maybe there is an obvious solution. Even if page numbers are in the EPUB file they are not being rendered in the content. Here's an example from an academic EPUB:
span.pagebreak-rw { width: 0; font-size: 0; line-height: 0; height: 0; visibility: hidden; float: right; }
Could Thorium have a couple of settings?
The image below shows a page from an EPUB that has a page break. It starts with: [Start Page 15]. I have verified this is correct with the physical item.
This second image shows some text from an EPUB and in the middle of a sentence is: [Start page 16]. I have verified this is correct with the physical item.
This third image shows some text from an EPUB. In the middle of a sentence "21" appears with paragraph break before it and after it for sighted scholars.
Button 1 would address the researcher's comment.
Button 2 would make Thorium more user-friendly for sighted researchers in universities who also need page numbers for quotation purposes.
If a reader doesn't want page numbers, then keep both buttons off.
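One way to picture such settings is as a reader-side CSS override for the publisher's rule. This is a sketch under stated assumptions, not a Thorium feature: the two setting names are guesses at what the two buttons might do, the function name is made up, and only the `span.pagebreak-rw` selector comes from the EPUB quoted above. It also assumes the page label is present as text inside the element; if the label only lives in an attribute such as `title`, extra rules would be needed.

```typescript
// Hypothetical reader settings loosely corresponding to the two proposed buttons.
interface PageBreakSettings {
  exposeToScreenReaders: boolean; // readable by AT, but not shown on screen
  showVisually: boolean;          // shown in the text for sighted readers
}

// Build a CSS rule that overrides a publisher rule hiding page breaks.
function pageBreakOverrideCss(selector: string, settings: PageBreakSettings): string {
  if (settings.showVisually) {
    // Undo the zero-size / hidden styling so the label appears in the text.
    return `${selector} { visibility: visible !important; float: none !important; ` +
      `width: auto !important; height: auto !important; ` +
      `font-size: 1em !important; line-height: normal !important; }`;
  }
  if (settings.exposeToScreenReaders) {
    // Common "visually hidden" pattern: clipped on screen but not hidden from AT.
    return `${selector} { visibility: visible !important; position: absolute !important; ` +
      `width: 1px !important; height: 1px !important; overflow: hidden !important; ` +
      `clip-path: inset(50%) !important; white-space: nowrap !important; ` +
      `font-size: 1em !important; }`;
  }
  return ""; // both settings off: leave the publisher's styling untouched
}
```

For example, `pageBreakOverrideCss("span.pagebreak-rw", { exposeToScreenReaders: false, showVisually: true })` returns a rule that a reading system could inject after the publication's own stylesheet.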
@danielweck did you see my last comment about Thorium making the page numbers visible in the text if the EPUB has them?
Hello @spudthebud sorry for the late reply. I moved your analysis and suggestions to a GitHub "discussion" so that we can flesh out the details: https://github.com/edrlab/thorium-reader/discussions/1799
I notice some odd behaviour getting page numbers from an ePub that had pagination.
Context
Thorium 1.8 with JAWS. I am sighted and JAWS certified.
Fictitious user story to illustrate the situation:
I am a faculty member at a research university and am preparing a grant application. I am quoting an ePub for which there is a print-book equivalent which has page numbers. I would like to get the page number I am on so I can integrate a direct quote into the application. I am currently on the last word that I will be quoting. From here, I press Control + Shift + P, which opens a Thorium navigation toggle button and the keyboard focus is put into an edit field. I hear the current page number. I press ESC to leave the navigation area. But the JAWS Virtual PC cursor is no longer in the text where I was when I had pressed the keys. The focus is now on the Navigation toggle button instead. It takes a lot of time to get back to the spot where I was reading.
Test procedure
Here's what I suggest as a test:
If you need help with making the task of getting the current page number more usable, I can chip in as best I can.