nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

Feature Request: Add word navigation to NVDA #16237

Open cary-rowen opened 8 months ago

cary-rowen commented 8 months ago

Is your feature request related to a problem? Please describe.

The word navigation ability of Ctrl+left/right arrow brings convenience to users, but it is not enough.

Ctrl+Arrow obviously fails to navigate by word in the following situations:

  1. Camel case: "myFirstName", "userName"
  2. Pascal case: "DataBaseUser". The same applies to snake case: "test_function".
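
To make the three naming styles above concrete, here is a minimal sketch of how an identifier could be split into sub-words. The function name `split_identifier` and the regular expression are illustrative assumptions, not code taken from NVDA or from the Word-Nav add-on.

```python
import re

# One sub-word is either an acronym run (capitals not followed by a
# lowercase letter), an optionally capitalised lowercase run, or digits.
# Underscores match none of the alternatives, so they act as separators.
_SUBWORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_identifier(name):
    """Split a camelCase, PascalCase or snake_case identifier into sub-words."""
    return _SUBWORD.findall(name)


for example in ("myFirstName", "userName", "DataBaseUser", "test_function"):
    print(example, "->", split_identifier(example))
```

A word-navigation rule built on this idea would place a caret stop at the start of each sub-word instead of treating the whole identifier as one word.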

In addition to being a visually impaired programmer, I am also an NVDA user who works in Chinese. For Chinese content, in some edit controls (such as Notepad), you cannot navigate by word. Because each Chinese character is considered a word, this is no different from character-by-character navigation.

Describe the solution you'd like

I solved all the above problems using @mltony's Word-Nav add-on.

I can freely define the behavior of Ctrl+Arrow and Ctrl+Windows+Arrow, and the add-on can also differentiate between the left and right Ctrl/Windows keys, which greatly expands the scope of what word navigation can be used for. Since the add-on is left/right aware, this also does not conflict with the Windows virtual desktop shortcut keys.

For Chinese content, this add-on can split text at punctuation and navigate between the resulting parts, which definitely helps in understanding the exact meaning of the text.

Describe alternatives you've considered

None

Additional context

None

cary-rowen commented 8 months ago

These cases are probably beneficial to most users, so I'd like to see this feature in NVDA.

Adriani90 commented 8 months ago

I second this. It would also solve problems where applications don't handle words in a proper way, such as numbers with separators in LibreOffice. This could be a setting in the document navigation settings category, the same as we did with Ctrl+up and down arrow paragraph handling.

mltony commented 8 months ago

I would be happy to work on this, but this would be blocked by this Chromium bug. I reported it to Google three years ago, but looks like no one from Google has looked into this so far. In WordNav I had to resort to some really ugly hacking to work around this bug, I think NVAccess won't be very happy to accept that ugly code. So I'd like to ask @cary-rowen, @Adriani90, and everyone else who is interested in this: please leave a comment on that chromium issue and also upvote it - search for "Vote count" button on that page - and make sure you're logged in - otherwise your vote won't be counted.

Adriani90 commented 8 months ago

Cc: @aleventhal

LittleStar-VIP commented 8 months ago

I would also like to have this function included in NVDA, as it would help me with editing work. Currently I am just using the WordNav add-on.

cary-rowen commented 8 months ago

Hi @mltony

I'll see what can be done to move this issue forward.

Also cc @derekriemer

Thanks

CyrilleB79 commented 8 months ago

Fully agree with a setting in document navigation panel.

Even if very useful, the differentiation of left/right control and/or its combination with the Windows key, as offered by the add-on, could be implemented at a later stage if people need it; it probably has a larger impact.

What is missing in the add-on and that would be required to integrate this feature in NVDA are the shift+control+arrow commands. Using the add-on, the mismatch between shifted (selection) commands and unshifted (move) commands is sometimes undesirable.

mltony commented 8 months ago

I am totally aligned on implementing word selection commands - I'd think that should be relatively easy to do, but I am reluctant to spend more time on this unless that egregious chromium bug is fixed. As long as the bug is there - I don't think NVAccess would accept hacky implementation and this whole direction would be a dead end.

LittleStar-VIP commented 8 months ago

just two points to add

  1. We are assuming that this discussion is about the edit cursor, right, and not about the review cursor?
  2. If new hotkeys are to be added, please add an option for us to customize the hotkeys for navigating by word.

seanbudd commented 8 months ago

Given this is developer centric and the UX/implementation is unclear we think this is best suited for an add-on for now.

mltony commented 8 months ago

@seanbudd, I respect your opinion. But FWIW there are two more non-developer-centric reasons to incorporate WordNav:

  1. Currently browse mode uses one word definition, Notepad uses another, and MS Word, say, uses a third. This applies both to word navigation and word selection. As a result, users have to deal with a variety of definitions: when you press control+shift+rightArrow, for example, you can never be sure whether it will select just the word, or also the comma following the word, or also the whitespace following the comma, because the result varies by application. I bet anyone who writes/edits text with NVDA suffers from that. WordNav would bring a unified word definition. This would eliminate a frequent source of editing mistakes - and not only for software developers.
  2. In certain situations it is impossible to figure out which word definition is currently being used. An example of that would be a web-based text editor - such as Monaco or CodeMirror - providing its own implementation of the control+left/right commands. NVDA typically handles control+right/left keystrokes by sending them to the application and then speaking the word at the cursor. Crucially, NVDA has no information about which word definition was used by the end application (Monaco or CodeMirror running within a webpage), so any discrepancy between the word definitions in the application and in NVDA leads to problems such as words being omitted or spoken twice. Examples: #7605, #10105, #12091 - all of these have been fixed on the VSCode side, but they still illustrate the problem. So the point is that NVDA is not told what JS-based editor logic is running inside your browser, so sending control+rightArrow to the browser is like shooting in the dark and hoping that whatever code handles it has the same word definition as NVDA. It is true that this has only been observed to affect developers using VSCode. But as software migrates more and more towards web technologies, this might become a bigger problem affecting more users. WordNav would solve this problem fundamentally.
  3. And just want to reiterate what @cary-rowen mentioned: having a custom word definition improves efficiency for Chinese users - maybe also applicable to other writing systems. That one is also non-developer-centric.
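
The mismatch described in point 2 can be illustrated with a small simulation. Both word definitions below are hypothetical stand-ins (they are not the actual rules used by Monaco, CodeMirror, VSCode or NVDA): the "application" stops at every identifier run or symbol, while the "screen reader" treats any whitespace-delimited run as a word.

```python
import re


def app_next_stop(text, pos):
    """Hypothetical editor rule: stop at the start of each identifier run or symbol."""
    for match in re.finditer(r"\w+|[^\w\s]", text):
        if match.start() > pos:
            return match.start()
    return len(text)


def reader_word_at(text, pos):
    """Hypothetical screen-reader rule: a word is a whitespace-delimited run."""
    for match in re.finditer(r"\S+", text):
        if match.start() <= pos < match.end():
            return match.group()
    return ""


def simulate(text):
    """Press control+rightArrow until the end; collect what would be spoken."""
    spoken, pos = [], 0
    while True:
        pos = app_next_stop(text, pos)  # the application moves the caret
        if pos >= len(text):
            break
        spoken.append(reader_word_at(text, pos))  # the reader speaks "word at caret"
    return spoken


print(simulate("foo.bar baz"))
```

In this sketch the caret stops at the dot and at "bar", but the reader expands both positions to the same whitespace-delimited word, so "foo.bar" is announced twice. Swapping the two rules would instead cause words to be skipped.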

As for your other argument:

UX/implementation is unclear

Not sure what you mean by that. The WordNav add-on has a UX that is well defined, and its implementation is clear, with the exception of the Chromium bug that I mentioned before. It is true that people come and throw in many ideas, but this happens with every feature request and all of it can definitely be sorted out.

Adriani90 commented 8 months ago

Sean, could you please elaborate on how you came to this opinion? This is not developer centered at all. Numbers are also words, and this helps in lots of use cases. If you have this opinion, then why did you accept paragraph navigation in the document navigation settings at all?

Sorry, but we repeat such discussions in several places again and again. In my view this doesn't serve our motivation to improve accessibility. Let us please exchange opinions and arguments in a more professional way. Accessibility for blind people is quite a complex matter, and we need more details when you guys come to such opinions. Otherwise such comments can really demotivate people who put a lot of energy into developing solutions for problems which have existed for years.

On 05.03.2024 at 02:00, Sean Budd wrote: Given this is developer centric and the UX/implementation is unclear we think this is best suited for an add-on for now.

Adriani90 commented 8 months ago

And even if it were developer centered, why would that be an impediment at all to accepting things into the core? Why did you guys then decide to accept improvements to the Windows Terminal via UIA, Diff Match Patch etc. into the core? See e.g. the pull requests by Bill Dengler.

seanbudd commented 8 months ago

@mltony @cary-rowen

I'm interested in the use case for Chinese users - my understanding is this is due to poor implementation in applications - is this correct?

Adhering to app-specific word movement behaviour is currently the intended behaviour. The Chromium bug is one issue with the UX, additionally though, creating a second form of word navigation is not exactly intuitive. However if a proposal for implementation can be made that serves a non-developer audience, I think we can reconsider this.

@Adriani90 - the main use cases described here are targeted towards developers, with developer centric ideas such as snake case and pascal case. I'm not sure I understand your frustration here, the issue hasn't been closed, and the discussion is continuing. To avoid people wasting time, we have a process of triaging issues before encouraging pull requests. We have extremely limited developer time - we cannot respond to everything, and we certainly cannot respond to everything in thorough detail. Please see my message here in regards to our primary focuses and concerns of scope creep https://github.com/nvaccess/nvda/issues/16230#issuecomment-1978020782. There is clear benefit to paragraph navigation for non-developers. Ensuring people can use packaged windows applications such as terminal is clearly in scope of a screen reader.

Adriani90 commented 8 months ago

As stated in the other issue, even though this proposal comes from a developer, the implementation is not done in an appModule, so in the end it is a global proposal which impacts all kinds of applications. I would understand this argument if the implementation were on an appModule basis.

The use cases where this helps are as follows:

Adhering to app-specific word movement behaviour is currently the intended behaviour.

This was not the argument when including paragraph handling in NVDA core. Note that word definitions are very different between applications and the developers of many applications did not define a word with accessibility design in mind. Having such a philosophy in NVDA would mean we accept inaccessible design and this in my view is a very bad signal.

Even if there are standards for how to define a word, they are not focused on accessibility, and many applications still do not follow the standards.

Adriani90 commented 8 months ago

Another use case is already described in the issue description: Asian languages, but also Arabic, are directly impacted by bad word implementations in every kind of application. No kind of text is really navigable word by word in such languages. If @cary-rowen says he was able to solve this with the add-on, I am really glad that this might improve a lot of use cases for many users worldwide.

cary-rowen commented 8 months ago

I can confirm that @mltony's word navigation add-on does help a lot with reading Chinese. So, this is not a developer-specific requirement.

cary-rowen commented 8 months ago

I completely agree with @Adriani90's argument in the comment above.

cary-rowen commented 8 months ago

Hi @seanbudd

I'll try to clarify the use case for Chinese users; I hope this makes it clear.

Chinese word segmentation is the process of dividing continuous Chinese text into individual words. Since there are no spaces between words in Chinese, as there are in English, word segmentation is a fundamental and crucial step in Chinese natural language processing. For example, in the English sentence "I love cats," the words are separated by spaces, so it's easy to identify each word. However, in the corresponding Chinese sentence "我爱猫," there are no spaces between characters, so word segmentation is needed to identify the words "我" (I), "爱" (love), and "猫" (cats).


This is also what I mentioned in the original description, one reason why Chinese users cannot navigate through Chinese words using Ctrl+left/right arrow. It must be noted that word navigation is possible in Word, but this is very limited. In virtual documents based on browsers or normal notepad, this is impossible. However, the add-on by @mltony provides various rules for splitting Chinese sentences by punctuation, which allows me to quickly navigate to the part I'm interested in reading for longer Chinese sentences. This undoubtedly helps greatly in reading Chinese content, and it's a real efficiency boost.

Thanks Cary

CyrilleB79 commented 8 months ago

@cary-rowen trying to understand the Chinese text reading use case:

For example, in the English sentence "I love cats," the words are separated by spaces, so it's easy to identify each word. However, in the corresponding Chinese sentence "我爱猫," there are no spaces between characters, so word segmentation is needed to identify the words "我" (I), "爱" (love), and "猫" (cats).

But in this example words are just single characters. Thus you do not need anything more than character navigation. Maybe this example was not a good illustration though. If I remember correctly, there are Chinese words that are made up of more than one character, e.g. forest. Can you confirm this and provide such an example?

However, the add-on by @mltony provides various rules for splitting Chinese sentences by punctuation, which allows me to quickly navigate to the part I'm interested in reading for longer Chinese sentences. This undoubtedly helps greatly in reading Chinese content, and it's a real efficiency boost.

Why do you consider splitting a sentence by punctuation to be related to word navigation? To me it seems more related to sentence or phrase navigation as implemented in the SentenceNav add-on. Could you clarify why the WordNav add-on is more suitable than the phrase navigation feature of the SentenceNav add-on?

LittleStar-VIP commented 8 months ago

Totally agree with @cary-rowen. I use the WordNav add-on for editing when I am writing Chinese or English passages, not just for programming.

CyrilleB79 commented 8 months ago

I would list the following use cases that are not only developer centric:

Regarding snakeCase, it is sometimes used outside of dev tasks, e.g. nicknames in forums. I acknowledge that it is not common. But on the other hand, we should take into account that such word separation is visible at a glance to sighted users, who can then immediately put the cursor where they want with the mouse. Word navigation is a facility that helps position the cursor at the wanted position as quickly as possible. And in my view, specific word navigation rules can help a blind person achieve this task more easily. Lastly, please also note that snakeCase is already supported in NVDA by a built-in dictionary rule that helps the synthesizer separate each part of a snakeCased word.

The example of number reading in LibreOffice is not very clear to me. Is it due to a lack of information provided to the screen reader by LibreOffice, or to a bug in LO? If the latter is confirmed, we cannot expect NVDA to fix all the bugs of other apps, all the more so when the other apps are also free and open source and can be contributed to directly. To clarify this issue, I'd recommend opening a specific issue (with STR, log, etc.) so that we do not clutter this discussion with very specific information; or if such an issue already exists, provide the link here.

Regarding the UX, as stated before, my suggestion is to add only one option to define the word navigation rule in the Document navigation panel, as done for paragraphs, with the default value being "Defined by the application". This part of the UX seems the least controversial to me and would help clarify and progress this issue. Additional enhancements, such as additional navigation gestures for each mode or changing the default rule to something more suitable (e.g. depending on the object type or the application), should not be discussed further here before the first part is merged, to keep the discussion clearer and more progressive (i.e. not mixing short-term implementation design with long-term ideas).
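
As a rough sketch of what a single option with a pass-through default could look like, consider the following. Everything here is hypothetical: the names `WordRule` and `next_word_start` and the "fine" rule are illustrative only and do not correspond to existing NVDA configuration keys or APIs.

```python
import re
from enum import Enum


class WordRule(Enum):
    """Hypothetical values for a single word navigation setting."""
    APPLICATION = "Defined by the application"  # proposed default
    FINE = "Fine word navigation"               # illustrative custom rule


def next_word_start(text, pos, rule):
    """Return the new caret offset, or None when the keystroke should
    simply be passed through to the application (today's behaviour)."""
    if rule is WordRule.APPLICATION:
        return None
    # Illustrative custom rule: every identifier run or symbol is a word.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        if match.start() > pos:
            return match.start()
    return len(text)


print(next_word_start("a, b", 0, WordRule.APPLICATION))
print(next_word_start("a, b", 0, WordRule.FINE))
```

With the default value nothing changes for existing users, which is what keeps this first step uncontroversial; further rules could be added to the enumeration later.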

cary-rowen commented 8 months ago

Hi @CyrilleB79

But in this example words are just single characters. Thus you do not need anything more than character navigation. Maybe this example was not a good illustration though. If I remember correctly, there are Chinese words that are made up of more than one character, e.g. forest. Can you confirm this and provide such an example?

Sorry, this example is really not very good in the more general concept of "Chinese word segmentation". My attempt to explain using a single character will cause you to misunderstand.

You are right. In Chinese, if "I love forest" is word-split, it will be split into three parts. "我" (I), "爱" (love), and "森林" (forest).

If I use the default ctrl+Left/Right word navigation behavior, I can only navigate by character, which is no different than if I press Left/Right. The above sentence will be divided into four parts.
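
The difference between the four per-character stops and the three segmented words can be sketched as follows. The segmenter uses forward maximum matching against a toy dictionary; both the technique and the one-entry dictionary are assumptions chosen for illustration, and this is not how the Word-Nav add-on works (it splits at punctuation). Real segmenters rely on large dictionaries or statistical models.

```python
def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest
    dictionary entry, falling back to a single character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


TOY_DICTIONARY = {"森林"}  # hypothetical one-entry dictionary: "forest"
sentence = "我爱森林"       # "I love forest"

print(list(sentence))                     # per-character stops
print(segment(sentence, TOY_DICTIONARY))  # word-level stops
```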

Why do you consider splitting sentence by punctuation related to word navigation? To me it seems more related to sentence or phrase navigation as implemented in SentenceNav add-on.

There is no doubt that Sentence-nav is also a benefit for Chinese users, but Word-nav's multiple rules cover more cases. There is a rule called "Fine word navigation - good for programming":

  1. The use cases for programming are already stated in the original description.
  2. Even if I don't write code and just read Chinese content, this rule can treat a symbol as a word, which is what I need. Note: This is different from reporting symbols when navigating by word.

Because the add-on can customize gestures for different rules, this allows me to choose the appropriate rule for the situation without having to go to the options to switch rules.

CyrilleB79 commented 8 months ago

You are right. In Chinese, if "I love forest" is word-split, it will be split into three parts. "我" (I), "爱" (love), and "森林" (forest).

Thanks for this new more representative example. And how does WordNav add-on help in splitting correctly this sentence into words? Which navigation rule of the add-on are you using to achieve such word navigation?

If I use the default ctrl+Left/Right word navigation behavior, I can only navigate by character, which is no different than if I press Left/Right. The above sentence will be divided into four parts.

Why do you consider splitting sentence by punctuation related to word navigation? To me it seems more related to sentence or phrase navigation as implemented in SentenceNav add-on.

There is no doubt that Sentence-nav is also a benefit for Chinese users, but Word-nav’s multiple rules cover more cases, There is a rule called "Fine word navigation - good for programming":

Sorry, my questions are just meant to understand the benefit of the WordNav add-on for Chinese reading, not for other use cases such as programming, which are more obvious to me.

  1. Even if I don't write code, I just read Chinese content, this rule can treat symbol as a word, which is what I need. Note: This is different from reporting symbols when navigating by word.

Is it the same concept as "a" not having the same sound when dealing with the letter "a", such as in "a, b, c", as when dealing with the word "a", such as in "a cat"?

If yes, isn't the normal word navigation enough? Could you illustrate, with and without the add-on, what happens when you navigate by word in an example sentence such as "I love the forest"? Thanks.

cary-rowen commented 8 months ago

Sorry, I accidentally introduced the concept of "Chinese word segmentation", I want to clarify some points.

  1. The above examples are intended to explain what Chinese word segmentation is.
  2. Due to the current shortcomings of Chinese word segmentation, our efficiency in reading long content is very low.
  3. Why Word-Nav? Because Word-Nav improves the inefficiency problem caused by the defects of Chinese word segmentation, that is, it can split Chinese content according to symbol positions.
  4. How does this differ from Sentence-Nav phrase splitting? My answer is: including "phrases", but covering a wider range of cases than phrases. I think it is impossible to completely distinguish programming needs from ordinary needs here.

I'll give a specific example below to illustrate the advantages of this add-on over the default behavior of Ctrl+Left/Right.

try it

Enter the following Chinese sentence in the legacy Notepad on Windows 10:

Cary 说: “玛丽有一只宠物狗,那只狗是白色的”

Translated into English, it is: Cary said: "Mary has a pet dog, and the pet dog is white."

  1. Add two blank lines before the text.
  2. For each test, place the caret at the very beginning of the text.

| Step number | What I did | What NVDA reports (separated by "→") | Number of key presses |
|---|---|---|---|
| 1 | Default behavior: press Ctrl+RightArrow repeatedly | Cary → 说 → 玛 → 丽 → 有 → 一 → 只 → 宠 → 物 → 狗 → 那 → 只 → 狗 → 是 → 白 → 色 → 的 | 17 |
| 2 | Behavior of Word-Nav: set the rule to "Fine word navigation - good for programming" and assign it to Right Ctrl+Arrow; press Right Ctrl+RightArrow repeatedly | Cary → 说 → : → “ → 玛丽有一只宠物狗 → , → 那只狗是白色的 → ” | 8 |
| 3 | Phrase navigation behavior of Sentence-Nav: press Win+Alt+Down Arrow repeatedly | Cary 说 → 玛丽有一只宠物狗 → 那只狗是白色的 | 3 |
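
The stop counts for steps 1 and 2 can be reproduced with a short sketch. The two functions are simplified models written for this illustration, under the assumption that the default behavior treats each CJK character as a word and ignores punctuation, and that the "fine" rule treats each punctuation mark as its own stop. They are not the actual Notepad or Word-Nav logic.

```python
import unicodedata

SENTENCE = "Cary 说: “玛丽有一只宠物狗,那只狗是白色的”"


def is_punctuation(ch):
    """True for any character in a Unicode punctuation category (P*)."""
    return unicodedata.category(ch).startswith("P")


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def per_character_stops(text):
    """Model of step 1: Latin runs are one stop, each CJK character is a stop."""
    stops, current = [], ""
    for ch in text:
        if is_cjk(ch) or ch.isspace() or is_punctuation(ch):
            if current:
                stops.append(current)
                current = ""
            if is_cjk(ch):
                stops.append(ch)
        else:
            current += ch
    if current:
        stops.append(current)
    return stops


def punctuation_chunk_stops(text):
    """Model of step 2: text between punctuation is one stop, and each
    punctuation mark is a stop of its own."""
    stops, current = [], ""
    for ch in text:
        if ch.isspace() or is_punctuation(ch):
            if current:
                stops.append(current)
                current = ""
            if not ch.isspace():
                stops.append(ch)
        else:
            current += ch
    if current:
        stops.append(current)
    return stops


print(len(per_character_stops(SENTENCE)), per_character_stops(SENTENCE))
print(len(punctuation_chunk_stops(SENTENCE)), punctuation_chunk_stops(SENTENCE))
```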

Some conclusions:

CyrilleB79 commented 8 months ago

@cary-rowen, many thanks for this clear example. It's a very good illustration of the issues encountered when reading Chinese, and it helps non-Chinese speakers (like myself) a lot in understanding the workaround found by the Chinese community to read text step by step.

I would not call it a word navigation, rather a chunk navigation... This use case needs to be clearly documented in the User Guide (and even in the Change Log) to maximize its usefulness.

Adriani90 commented 8 months ago

Just want to mention that it is not only speakers of Asian languages or Arabic who are impacted. For example, for me, working in a development finance institution on projects in Asia and Eastern Europe, it is really difficult to learn such languages properly. Given that I am not allowed to use add-ons in a Citrix environment, but only on the local machine at work, it is mostly impossible to read reports from the projects in these languages, and it makes my job really difficult. It is just a small example, but I think in general blind people who want to learn these languages need any small aid that helps in navigating the corresponding texts.

CyrilleB79 commented 8 months ago

@Adriani90 for now, only the Chinese use case has been explicitly explained here. The Arabic use case hasn't. And neither have the other languages which you are working with. If you know some languages with some specificity regarding word navigation, please provide a detailed example as done in https://github.com/nvaccess/nvda/issues/16237#issuecomment-1978763950.

Note that I am myself convinced of the benefit of including a word navigation customization feature in NVDA core. I am just encouraging you and other people to provide more explicit examples of use cases to make your comments more convincing to NV Access and the community in general.

seanbudd commented 8 months ago

Thanks for the additional information - I think the cases described here are sufficient to demonstrate that this is useful for a wider group of users, particularly international users.

LittleStar-VIP commented 8 months ago

@CyrilleB79 Although I am not that familiar with them, I believe the structure of Japanese and Korean is similar to Chinese, i.e. there are no spaces between words.

cary-rowen commented 8 months ago

Glad my explanation was useful to the community and thanks @CyrilleB79 for the great question.

cary-rowen commented 8 months ago

Hi @mltony

I tried pushing the status of this Chromium bug and its severity has been raised to S1.

mltony commented 8 months ago

cary-rowen, Thanks a bunch! Hope now someone from Google will fix it. It's always good to have connections in Google!

aleventhal commented 7 months ago

We are still looking for feedback in the Chromium bug. The engineer working on it could not repro using the steps, and believes the issue was fixed way back in Chromium 102, which is a long time ago. If someone wants to add more feedback to that bug it will help move things along.

cary-rowen commented 7 months ago

Hi aleventhal,

The bug is reproducible, I'm keeping a close eye on this and will update here if there is any further news.

Best, Cary

cary-rowen commented 7 months ago

According to my testing, this bug does behave as expected sometimes, but not always.

aleventhal commented 7 months ago

That's helpful. Are you able to add that info to the Chromium bug? That way the engineer can ask more questions if necessary.

cary-rowen commented 7 months ago

Not exactly the same. The goal of #16237 is more like a request to merge @mltony's word-nav into NVDA core.

It is similar to sentence navigation, and the common point is to use punctuation marks to achieve quick navigation of Chinese content.

That request focuses on giving NVDA the ability to split Chinese content into words, which would affect numpad4 and numpad6. The ultimate goal is to support splitting of Braille content.

I hope to resubmit an issue in the near future to elaborate on this matter

mltony commented 6 months ago

I just found one more blocker for improved word navigation: setting caret position doesn't work in VSCode. I filed microsoft/vscode#210842. Looking at their last reply it seems they are not planning to fix this. CC: @isidorn, @meganrogge.