Closed mltony closed 10 months ago
Many thanks for the interesting read @mltony ! While I certainly agree that the issues you've described are severe and that they should be solved by us, I am not certain the proposed solution goes in the right direction. Starting with the web-based coding editors, I see two non-exclusive solutions which seem more promising:
contentEditable for their editors. They're not doing so due to various bugs / slowness in the browsers, but I'm afraid many of these are not reported to browser developers, since the majority is fine with status quo.
Why I believe implementing fixes in the web browsers is nicer:
As to the Google Docs issues, most of these are solved in JAWS, which shows that no additional APIs are needed. Note that their way of implementing quick nav is a bit strange, and we could certainly do it better, but it requires just switching to the quick nav mode and then the familiar single keys work. This is possible in JAWS since they have the ability to implement scripts for specific web apps, something which NVDA lacks and which would probably be useful for a myriad of other use cases. I've skimmed the scripts for Docs; they mostly rely on IA2, using a custom solution for accessing HTML directly (we probably could do the same with ISimpleDOM).
Given that JAWS works better here, I see very little chance of convincing Google developers to implement a different standard.
Note also that if contentEditable worked better, additional SaaS office suites would use it, essentially allowing accessibility to work the same way in most of them.
There is of course the matter of limited resources. It'd be a real pita to spend time on the JS Accessibility Bridge, find out that it has not been implemented anywhere or almost anywhere, and so have to work on fixing text area issues in browsers / adding the ability to script web apps after all. What I have proposed certainly requires a lot of time and skill, but, if done, guarantees results. The same cannot be said about the JS Accessibility Bridge idea IMHO.
If I had your programming skills, and were affected by the issues described in the initial comment to the extent you are, I'd at least try to investigate these avenues.
This is a proposal for a future direction the NVDA project can take, rather than a concrete feature request. I would like to see if there is alignment with NVDA devs on whether this is a reasonable direction. I apologize for creating another huge issue - it'll probably take ~10 minutes to read it; however, since I am proposing a brand new direction, I had to lay out extraordinary justifications in order to convince people that this is the right direction. There are many headings below for ease of navigation.
TL;DR
The world is rapidly moving towards web technologies and away from natively executed computer programs - think of Google Docs or VSCode, just to name a few examples. The current state of accessibility of many web applications is unsatisfactory (examples below), and the reason is that communication between the screenreader and web applications happens through accessibility APIs (mostly IAccessible2) that either don't account for the needs of modern web apps (Google Docs) or are poorly implemented (e.g. large text in edit boxes in Google Chrome is too slow). I propose to start working on a fundamental solution to this problem: build a JavaScript accessibility bridge to allow screenreaders to communicate directly with the JavaScript environment of web apps running inside the browser instead of relying on IAccessible2 with all its flaws. Below I list some concrete problems of the status quo and propose a fundamental solution.
Problems with accessibility of web applications
Google Docs
The main problem with Google Docs is that they effectively implemented their own screenreader running inside the browser. While this is understandable (what else could Google devs have done at the time to make it accessible to screenreader users?), and I feel gratitude to Google devs for making Google Docs accessible, this is not a good solution in the long run. Here are some reasons why implementing your own screenreader in every web-based office solution is a problem for screenreader users:
RightArrow: Google Docs speaks the previous symbol instead of the current one; the same applies to Control+RightArrow. This is inconvenient for NVDA users (and probably JAWS users as well). It is wrong that NVDA and JAWS users have to relearn basic character and word navigation commands in order to be able to work with Google Docs. This is mitigated by turning on Braille mode though.
Control+Alt+N Control+Alt+H. Jump to next table is even worse: Control+Alt+Shift+N Control+Alt+Shift+T. On one hand it is understandable, since for web apps the selection of keystrokes is limited, but on the other hand the end result is still bad. Not to mention that in practice these shortcuts are often unreliable.
VSCode and Monaco editor
VSCode is an Electron-based application running within a Chromium-like browser, which means that all communication between the screenreader and its UI elements happens through IAccessible2 with all its limitations. Monaco is a browser-based code editor with a rich set of features, written in TypeScript. While claiming to be accessible, it has one big accessibility flaw: it only provides 500 or so lines of your source code file through the accessibility API at a time. More details can be found in microsoft/vscode#41423, which is blocked on this and that chromium issues. So if NVDA tries to retrieve some lines in the current editable via the TextInfo API, it would only see 500 lines, no more - VSCode effectively truncates the document. Why is having access to only 500 lines of code not enough?
CodeMirror editor
CodeMirror is another popular open-source online code editor and is a direct competitor to Monaco. It is used most notably in Chrome and Firefox Developer Tools, the Coderpad online editor (frequently used by major IT companies for online interviews) and GitHub's in-browser edit feature (the full list can be found on the CodeMirror real world uses page). The previous version, CodeMirror 5, was not accessible to screenreaders as it only presented an empty text area - this was tracked in codemirror/codemirror5#4604 - notably, the CodeMirror maintainers stated that adding accessibility support would require a major redesign and even proposed raising funds for this. It appears however that some accessibility has recently been implemented in CodeMirror 6, which is the current production version. However, when I try CodeMirror in the sandbox, it only exposes 35 lines to the screenreader, which is much less than Monaco. Therefore all the problems of Monaco that I laid out in the previous section apply to CodeMirror 6 as well, but they are even more severe here due to the much smaller frame size.
Proposed solution
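As a rough preview of the kind of API this section proposes, here is a minimal TypeScript sketch, not a definitive design. Only the interface name IJSAccessiblePlainTextEditable and its getTextInRange(startIndex, endIndex) method come from this proposal; getTextLength, getCaretOffset, ToyEditor, isJSAccessibleEditable and readWholeDocument are hypothetical names introduced purely for illustration, and the sketch says nothing about how a native screenreader would actually reach the page's JavaScript VM.

```typescript
// Hypothetical interface: only getTextInRange is named in the proposal;
// the other members are assumptions added for illustration.
interface IJSAccessiblePlainTextEditable {
  /** Total number of characters in the editor's buffer (assumed member). */
  getTextLength(): number;
  /** Text from startIndex (inclusive) to endIndex (exclusive). */
  getTextInRange(startIndex: number, endIndex: number): string;
  /** Caret position as a character offset (assumed member). */
  getCaretOffset(): number;
}

// Toy stand-in for an editor such as Monaco or CodeMirror: it holds the
// whole document in memory but exposes only a window of lines by default.
class ToyEditor implements IJSAccessiblePlainTextEditable {
  constructor(private buffer: string, private caret = 0) {}

  getTextLength(): number {
    return this.buffer.length;
  }

  getTextInRange(startIndex: number, endIndex: number): string {
    // Clamp both indices so out-of-range requests return what exists.
    const start = Math.max(0, Math.min(startIndex, this.buffer.length));
    const end = Math.max(start, Math.min(endIndex, this.buffer.length));
    return this.buffer.slice(start, end);
  }

  getCaretOffset(): number {
    return this.caret;
  }

  /** Mimics a windowed accessibility tree: only the first maxLines lines. */
  getExposedWindow(maxLines: number): string {
    return this.buffer.split("\n").slice(0, maxLines).join("\n");
  }
}

// Feature detection: does this object implement the hypothetical API?
function isJSAccessibleEditable(
  obj: unknown
): obj is IJSAccessiblePlainTextEditable {
  const o = obj as Partial<IJSAccessiblePlainTextEditable> | null;
  return (
    !!o &&
    typeof o.getTextInRange === "function" &&
    typeof o.getTextLength === "function"
  );
}

// Use the richer API when present, otherwise fall back to the windowed
// text that the existing accessibility API provided.
function readWholeDocument(editable: unknown, fallbackText: string): string {
  if (isJSAccessibleEditable(editable)) {
    return editable.getTextInRange(0, editable.getTextLength());
  }
  return fallbackText;
}

// Build a 2000-line document and compare the two views of it.
const lines: string[] = [];
for (let i = 1; i <= 2000; i++) {
  lines.push("line " + i);
}
const doc = lines.join("\n");
const editor = new ToyEditor(doc);
const windowed = editor.getExposedWindow(500);
const full = readWholeDocument(editor, windowed);
```

In this toy setup the windowed view holds 500 lines while readWholeDocument returns the entire buffer; an object that does not implement the interface simply yields the fallback text, which mirrors the idea of transparently switching APIs only when the editor supports it.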
I envision that the fundamental solution to the problems I mentioned above would be a new JavaScript accessibility API that the authors of web apps can easily implement. Let's call it JSAccessible for now. Then we'd need to figure out a way for NVDA to talk directly to that API bypassing IAccessible2 - more on that below. To illustrate this proposed accessibility API, for plain text editors we can think of something like this:
If we can have Monaco and CodeMirror implement this API, then we can update NVDA to check whether current TextArea in browser happens to implement this API and if so, transparently switch from IAccessible2 to JSAccessible and retrieve text contents via IJSAccessiblePlainTextEditable::getTextInRange(startIndex, endIndex) call. This way we'd be able to easily have access to the entire buffer inside Monaco/CodeMirror instead of a frame containing only a few hundred lines.
Similarly, for online office solutions another more complicated interface can be developed that would provide NVDA information about text and its formatting, like font size and bold/italic attributes. The challenge here would be convincing Google to implement this interface for Google Docs, but my hope is if we produce a working prototype that works for plain text editors and this project gains some steam, eventually we'll be able to find a live human inside Google who can implement this JSAccessible interface for Google Docs.
The next big question is how can we make NVDA to talk to that interface? A browser has an isolated JavaScript VM for every page and it's not easy to have a native application to talk to any code living within that VM. I can think of two approaches here:
Discussion
Here I answer, ahead of time, some questions that I anticipate being asked, to address potential skepticism.
Why do I need to align anyone instead of just implementing a prototype myself?
This is a huge project proposal that would require cooperation from many parties, like the VSCode team, CodeMirror, Electron, and potentially also the Google Chrome and Google Docs teams. It would greatly help if NVDA devs are aligned, so that I'd be able to reach out to those parties on behalf of NVDA rather than as an unknown independent dev - that would improve the chances of PRs being accepted, and in the case of Google with closed-source Google Docs it would increase the chances they'll be willing to collaborate. Also, without alignment, my work can end up being an NVDA add-on instead of an NVDA core feature, and as it happens with add-ons, many people either hesitate or don't know how to use add-ons, so the impact of this project will be limited.
We should follow W3C/WCAG standards. This proposal is unacceptable because it doesn't conform to the standards.
We are facing a situation where the current set of accessibility standards is not doing a satisfactory job - see the lists of problems in the previous sections. In the case of Monaco and CodeMirror it's actually the problems of the Chromium implementation that prompt web application devs to search for workarounds. In the case of Google Docs its developers chose to implement a new screenreader on their side because there was no effective way to communicate with a proper native screenreader from inside the browser using existing accessibility APIs. I feel it is up to us, screenreader users and developers: either we are fine to put up with all the drawbacks of the status quo - in fact that's one of the reasons why I am submitting this proposal, to see if perhaps most of the people here are not bothered by the problems I outlined above - or, if the problems are bothersome enough, it is up to us to work on a better solution. But in either case we shouldn't be constrained by a set of standards that were developed years ago and that, most importantly, do not solve the accessibility issues that I outlined above.
Why invent a new accessibility API instead of improving IAccessible2/UIA?
Google Chrome, being the most popular browser on the market, has limitations in its implementation of IAccessible2, especially when it comes to large amounts of text in editables. Despite the fact that these issues were reported years ago (links to the issues can be found in the Monaco section), not much progress has been made, so it appears that Chrome developers are either not much interested in fixing these issues, or a proper fix is technically challenging. So working in this direction doesn't appear promising for addressing the accessibility of Monaco and CodeMirror. As for Google Docs, I only have limited knowledge of IAccessible2 and cannot say for sure that it's not enough for Google Docs to expose document structure to the screenreader the way Microsoft Word and LibreOffice do. But I assume they had some good technical reasons to go the hard way and implement their own screenreader. However, if anyone is more familiar with IAccessible2 and the accessibility layer of Google Docs, please feel free to chime in and tell us if an upgraded version of IAccessible2 can be used to improve Google Docs accessibility.
Some people argue that having access to only a single line of a document at a time is enough and screenreaders shouldn't provide more sophisticated functionality.
For some people using only simple functions is enough. Others use more advanced functions. A couple of examples of my own use cases that are blocked by current state of accessibility:
Control+Shift+Alt+N Control+Shift+Alt+T, then I actually need to release Control, Shift and Alt - otherwise it doesn't work, then press Control+Shift+Alt+N Control+Shift+Alt+T again. It doesn't have to be that painful.
So for some users simple use cases are enough and they might not understand my frustration with my use cases. But given the fact that working in large IT companies often requires these sophisticated use cases, I would argue that NVDA should be the tool to help visually impaired people get meaningful employment. That is, NVDA shouldn't ignore basic use cases in favor of sophisticated ones, but focus on both and not neglect sophisticated ones.
Some people argue that advanced logic should be encoded in application-specific scripts rather than NVDA.
There are multiple issues with this approach:
NVDA+Alt+Up/Down keystrokes to it because it doesn't treat Insert key as a possible modifier. As a result muscle memory never has a chance to develop, which reduces efficiency.
Conclusion
I guess with this feature request I would like to hear what NVDA devs and users think about this project proposal.