nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows

Word navigation in Chrome does not read some words #12091

Open isidorn opened 3 years ago

isidorn commented 3 years ago

NVDA version 2020.4 Chrome version: 88.0.4324.182

  1. Go to www.google.com
  2. Type the following input "pop + cat tram"
  3. Put the cursor at the start of that input. Press ctrl + right to navigate to the next word, and notice that the cursor stops at the locations marked with |

|pop |+ cat |tram|

This is how Chrome does word navigation. However, NVDA does not read the word "cat" because it is skipped over. Could NVDA introduce a workaround so that all the words get read?

Here's the underlying Chrome issue https://bugs.chromium.org/p/chromium/issues/detail?id=1180807

isidorn commented 3 years ago

As @joanmarie pointed out on the VS Code issue:

"Dominic Mazzoni (@minorninth of Google) told me that for the browser (and arguably Gtk and ....) to always give the correct word at offset, taking into account the previous position, direction of navigation, etc., can be quite tricky and also introduce localization issues (he mentioned some example in Chinese if memory serves me). He suggested that I change Orca to keep track of the previous caret location. Then, when the caret-moved accessibility event is emitted, Orca could get the text in between the old offset and the new offset and speak it. It turns out that's not all that hard to do (at least it wasn't in Orca). So.... Maybe the thing to do IS get the NVDA developers to do the same thing. 🤷‍♀️"

So is it possible that NVDA does the same as Orca and listens on the caret-moved accessibility event?
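
A minimal sketch of the caret-tracking idea quoted above, written in Python since that is NVDA's language. It is not Orca's or NVDA's actual code: the `CaretTracker` class and its `on_caret_moved` method are hypothetical names, and the text is assumed to be available as a plain string with character offsets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaretTracker:
    """Hypothetical helper: remembers the last caret offset and
    returns the text the caret passed over on each move."""

    last_offset: Optional[int] = None

    def on_caret_moved(self, text: str, new_offset: int) -> str:
        # First event: there is no previous position to compare against.
        if self.last_offset is None:
            self.last_offset = new_offset
            return ""
        # Works for both directions: take the span between old and new offsets.
        start, end = sorted((self.last_offset, new_offset))
        self.last_offset = new_offset
        return text[start:end].strip()


text = "pop + cat tram"
tracker = CaretTracker()
tracker.on_caret_moved(text, 0)  # caret placed at the start of the input

# Forward caret stops reported in this issue: |pop |+ cat |tram|
spoken = [tracker.on_caret_moved(text, offset) for offset in (4, 10, 14)]
print(spoken)
```

With the caret stops reported for Chrome, every word is spoken at some point, including "cat", because the announcement follows the caret rather than the word boundaries reported by the accessibility API.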

fyi @leonardder @michaelDCurran

LeonarddeR commented 3 years ago

I think this would mean a pretty drastic workaround in the code that announces words when moving. I'd be curious to know what mozilla people like @jcsteh and @MarcoZehe think about this. Probably better to fix this at the accessibility API level.

isidorn commented 3 years ago

We seem to have found a workaround in VS Code: letting Chrome handle navigation instead of using our custom approach. However, I believe my suggestion for NVDA still makes sense, since we could then customise word navigation to what makes most sense for developers and it would still work nicely with NVDA (as it works nicely today with Orca).

jcsteh commented 3 years ago

> I'd be curious to know what mozilla people like @jcsteh and @MarcoZehe think about this.

Pragmatically, it's a pretty big hack, but I guess it could work and it probably wouldn't "hurt" anyone (except that the caret and the review cursor would be inconsistent with respect to word navigation). I probably wouldn't want to do it for lines, though; that's a lot riskier. I can't comment on the appetite for implementing this, though; that's no longer my call.

FWIW, I did a bunch of work in Firefox last year to improve word retrieval so it's consistent with keyboard word navigation (assuming the author hasn't overridden the keys). As a result, the original steps to reproduce here work correctly in Firefox. I do believe fixing it in the browser is the more "correct" solution. That said, it is definitely really hard to get right (there are probably still obscure cases which break in Firefox), and at the end of the day, if this hurts users enough, pragmatism may dictate a hacky solution.

LeonarddeR commented 3 years ago

I looked at the STR again. It is definitely Chrome going mad here, since the word navigation is inconsistent.

When going forward:

  1. Cursor stops at |pop |+ cat |tram
  2. NVDA announces "pop plus", "pop plus", "tram"

When going backwards:

  1. Cursor stops at |pop + |cat |tram
  2. NVDA reads "tram", "cat", "pop plus"

So, it is NVDA that behaves consistently across cases, based on what it receives from IAccessible2.

Also, as a workaround, I'm still not sure how to do this, as NVDA's word navigation reads the current word you're at, not the word you passed over. Therefore, the difference in caret position before and after the cursor movement doesn't map onto how NVDA behaves. An example, just to make sure:

  1. Position your cursor at the start of "pop"
  2. Press ctrl+right arrow
  3. NVDA should announce "plus cat", but how ought we know?
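
To make the mismatch between the two models concrete, here is a small sketch. It assumes simple whitespace-delimited words, which is neither Chrome's nor NVDA's real segmentation; `word_at` and `passed_over` are hypothetical helpers, and the caret stops are the forward stops listed above.

```python
import re


def word_at(text: str, offset: int) -> str:
    """Hypothetical 'read the word at the caret' model:
    return the whitespace-delimited word containing offset."""
    for match in re.finditer(r"\S+", text):
        if match.start() <= offset < match.end():
            return match.group()
    return ""


def passed_over(text: str, old: int, new: int) -> str:
    """Hypothetical 'read what the caret passed over' model."""
    start, end = sorted((old, new))
    return text[start:end].strip()


text = "pop + cat tram"
stops = [0, 4, 10]  # forward stops from this comment: |pop |+ cat |tram

# Word under the caret after each Ctrl+Right.
at_caret = [word_at(text, new) for new in stops[1:]]
# Text between the previous and the new caret position.
skipped = [passed_over(text, old, new) for old, new in zip(stops, stops[1:])]

print(at_caret)
print(skipped)
```

Under these assumptions the two models announce different text after the same keypress, and the span between the old and new caret positions describes what is behind the caret rather than the word at it.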

isidorn commented 3 years ago

@leonardder yes Chrome does not behave predictably.

Aha, I was not aware of this difference between Orca and NVDA when it comes to what is being read. Is this configurable in NVDA? Or does NVDA always read the word in front? If that is the case, I am not sure how the heuristic from @joanmarie could help here.

LeonarddeR commented 3 years ago

> Is this configurable in NVDA? Or does NVDA always read the word in front?

The latter. NVDA always reads the word at the caret. That's how all Windows screen readers behave, I believe. It would be interesting to know what JAWS does here.

mltony commented 3 years ago

I solved this problem in a different fashion in my add-on WordNav: https://github.com/mltony/nvda-word-nav/ Instead of sending Control+Left/Right to the application and then trying to guess how the application splits lines into words, I do the splitting myself on NVDA's side and then just move the cursor to the new location. This way, cursor movement is guaranteed to be consistent with the words that are spoken. This approach tends to work pretty well in most applications, including VS Code. Also, the fundamental problem I see with the current approach is that some web-based text editors redefine the standard behavior of the Control+Left/Right keystrokes. Examples are Jupyter and various online coding websites. So with the current approach, word reporting on those websites cannot be made consistent, since NVDA has no way of guessing the word splitting rules on a given website. Anyway, just a thought.
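
A rough sketch of the "split on the screen reader's side" idea described in this comment. This is not WordNav's actual code: the helper names are hypothetical, and the whitespace-based regex is only an assumed splitting rule chosen for illustration.

```python
import re
from typing import List

WORD = re.compile(r"\S+")  # assumed rule: a word is a run of non-whitespace


def word_starts(text: str) -> List[int]:
    """Offsets at which each word begins."""
    return [match.start() for match in WORD.finditer(text)]


def next_word_start(text: str, offset: int) -> int:
    """Target caret offset for a forward word move (end of text if none)."""
    for start in word_starts(text):
        if start > offset:
            return start
    return len(text)


def prev_word_start(text: str, offset: int) -> int:
    """Target caret offset for a backward word move (0 if none)."""
    earlier = [start for start in word_starts(text) if start < offset]
    return earlier[-1] if earlier else 0


def word_from(text: str, offset: int) -> str:
    """The word beginning at offset, i.e. what would be spoken."""
    match = WORD.match(text, offset)
    return match.group() if match else ""


text = "pop + cat tram"
offset, spoken = 0, []
while True:
    # The screen reader picks the new offset itself, then speaks the word there.
    offset = next_word_start(text, offset)
    if offset >= len(text):
        break
    spoken.append(word_from(text, offset))

print(spoken)
```

Because the same splitting rule decides both where the caret lands and which word is spoken, the two cannot disagree, and no word is skipped when moving forward from the start of the input.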

Adriani90 commented 1 year ago

Two Chromium bugs that are still worth tracking are https://bugs.chromium.org/p/chromium/issues/detail?id=1063815 and https://bugs.chromium.org/p/chromium/issues/detail?id=1181643

Please comment there as well if you have any relevant input.