nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows

"Speak typed words" behaves like "speak typed characters" when entering Asian text in Notepad and other edit fields and documents #2762

Open nvaccessAuto opened 11 years ago

nvaccessAuto commented 11 years ago

Reported by nvdakor on 2012-10-30 16:34 Hi, I'm using 2012.3 Beta 3 on Windows 8 with the Korean IME installed. When typing Asian characters, "speak typed words" behaves like "speak typed characters" in that every character is announced.

To duplicate:

  1. Open Notepad, Wordpad or similar word processors and/or edit fields on websites.
  2. Switch the keyboard layout to any Asian languages (Korean, for example).
  3. Turn speak typed characters off and turn speak typed words on. Then start typing some words, e.g. "hi" in Korean (which is typed "dkssudgktpdy").
  4. Next, turn speak typed characters on and turn speak typed words off. Type the same phrase as above.

You'll notice that both scenarios behave the same. This was confirmed on at least two systems: one running Windows 8 and another running Windows XP. Thanks.

nvaccessAuto commented 11 years ago

Comment 2 by mdcurran on 2012-10-31 04:44 I think some research will need to be undertaken to find out exactly how each affected Asian input method should handle this. There are major differences between the languages. Also, the implementation of speak typed words was not designed to handle this, so it will have to be rethought based on the new Asian input support.

Changes: Milestone changed from None to near-term

nvaccessAuto commented 10 years ago

Comment 3 by blindbhavya on 2014-09-07 08:20 Hi. A similar issue was reported and fixed for v2014.3. If this issue is the same one and is fixed in 2014.3, then it should be marked as a duplicate and closed. I don't remember the number of the other ticket that I said was fixed.

jcsteh commented 8 years ago

The InputComposition NVDAObject intentionally treats characters as words. This is correct for Chinese (where every character is a word), but it isn't correct for other languages. We should determine where this doesn't apply and find a way to exclude these cases (e.g. Unicode ranges).
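
A minimal sketch of the Unicode range idea, assuming a hypothetical helper `char_is_word` that is not part of NVDA. It treats only CJK ideographs as word-like, so Hangul syllables, kana and Latin letters would fall back to ordinary word handling. The range list is illustrative and incomplete, and a range check alone cannot tell Chinese text apart from Hanja or Kanji used inside Korean or Japanese text.

```python
# Hypothetical helper for illustration only; NVDA does not define this function.
# Ranges are taken from the Unicode block list and are deliberately incomplete.
CJK_IDEOGRAPH_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)


def char_is_word(ch: str) -> bool:
    """Return True if a single composed character could be spoken as a word."""
    code = ord(ch)
    return any(low <= code <= high for low, high in CJK_IDEOGRAPH_RANGES)
```

With this check, a Chinese character such as "中" would be announced as a word, while a Hangul syllable such as "한" would be left for whitespace-based word detection.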

CC @JosephSL, @nishimotz.

nishimotz commented 8 years ago

In my opinion, the 'speak typed words' option is not relevant for input composition. Japanese language users, in general, do not use 'speak typed words' during input composition.

The conversion operation of a Japanese IME is usually assigned to the space key, so the announcement of candidates is something similar to typed words during Japanese input composition. If the candidate contains several words, reviewing each word is possible within the composition session. (Actually, the composition unit is not equal to the Japanese word in linguistic terms.)

Although it is not common, skilled Japanese users disable 'Speak typed characters' and 'Report changes to the reading string' and enable only the report of the selected composition string. This is similar to Latin character input with only 'Speak typed words' enabled.

For Chinese language, word segmentation is mentioned in #4075. Word segmentation of Japanese language is not trivial as well.

As far as I could tell when I tested a few years ago, the Windows Uniscribe API cannot handle Japanese correctly. Microsoft Word seems to have its own text analysis system for Japanese word segmentation. Mozilla Firefox uses full-shape punctuation characters for pseudo-word segmentation. Internet Explorer and WordPad treat successive Katakana phonetic characters as a pseudo-word.
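
To illustrate the Katakana heuristic mentioned above, here is a rough sketch that groups successive Katakana characters into one pseudo-word. The function names are hypothetical, and this is not how Internet Explorer or WordPad actually implement it; it only shows the idea of splitting text wherever the script changes between Katakana and anything else.

```python
from itertools import groupby
from typing import List


def is_katakana(ch: str) -> bool:
    # Katakana block, U+30A0..U+30FF (includes the prolonged sound mark U+30FC).
    return 0x30A0 <= ord(ch) <= 0x30FF


def split_katakana_runs(text: str) -> List[str]:
    """Split text into runs, so each successive Katakana run forms one pseudo-word."""
    return ["".join(run) for _, run in groupby(text, key=is_katakana)]
```

For example, "これはテストです" would be split into "これは", "テスト" and "です". The Hiragana runs are still not real words, which is why this is only pseudo-word segmentation.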

josephsl commented 8 years ago

I'll ask Korean users if they'd like to offer some additional info on this. Hangul input is like entering Latin characters in that spaces and other whitespace characters are used to separate words. For Korean, after some characters are entered, the Korean IME is smart enough to move to the next character (hence, a certain Unicode character range is devoted to storing all possible consonant/vowel combinations for Hangul characters). For Hanja input, the useful setting is the one that announces short descriptions. Because the same character (not the letter, but the pictograph shown) could have differing meanings, it is always helpful for people to know the short description/possibilities. These days, Hanja (Chinese characters) are used in Korean text to clarify the meanings of words (often found inside parentheses as part of Hangul text). For Korean users, the important thing is for NVDA to support entry of Hangul better. Thanks.
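
As background for the Unicode range mentioned above, here is a small sketch of how precomposed Hangul syllables are laid out, following the arithmetic in the Unicode Standard (19 leading consonants, 21 vowels, 28 trailing slots including "none", starting at U+AC00). The function names are hypothetical and this is not NVDA or IME code; it only shows why one typed syllable is built from several keystrokes.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable
LEAD_COUNT = 19        # leading consonants
VOWEL_COUNT = 21       # vowels
TAIL_COUNT = 28        # 27 trailing consonants plus "no trailing consonant"


def compose_syllable(lead: int, vowel: int, tail: int = 0) -> str:
    """Build a precomposed Hangul syllable from jamo indices."""
    return chr(HANGUL_BASE + (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail)


def decompose_syllable(syllable: str) -> tuple:
    """Return the (lead, vowel, tail) indices of a precomposed Hangul syllable."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < LEAD_COUNT * VOWEL_COUNT * TAIL_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, VOWEL_COUNT * TAIL_COUNT)
    vowel, tail = divmod(rest, TAIL_COUNT)
    return lead, vowel, tail
```

While a syllable such as "한" is being typed, the composition string changes with each keystroke, which is one reason character-level announcements show up during Hangul input.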

JangChangHwan commented 8 years ago

I'm Korean. In the Korean language, Chinese characters are basically used as characters, not as words. In some cases a Chinese character can be used as a word, but that's not the general case. So, as a Korean NVDA user, I think Notepad and similar edit controls are really inconvenient for Korean users. I hope NVDA gets an option to select whether the InputComposition object treats characters as characters or as words. This problem will be a critical obstacle to more blind Korean users adopting NVDA.

LeonarddeR commented 4 years ago

@josephsl: Do you think #8110 can solve or improve this somehow? Pretty curious what its impact on the behavior for input composition is anyway.

josephsl commented 4 years ago

Hi, I’m not sure on that front, as I don’t really actively test Korean translation and input mechanisms these days. Thanks for not forgetting about this issue.

Adriani90 commented 4 years ago

@josephsl could you please test with the latest NVDA alpha to see if this is still an issue? Or could you ask on the translations list for someone to test whether this has been improved now? Thanks.

josephsl commented 4 years ago

Hi, still the same in the latest alpha (2020), but it appears this is due to the IME window and the fact that Asian characters are composed of constituent characters (Korean, for example). For now the workaround is turning off "Report changes to the composition string" in the input composition category, but that doesn't resolve the underlying problem, which comes from how these languages are shown on screen (Korean does use punctuation and spaces to separate words, whereas Chinese and Japanese may not do so). Thanks.

Adriani90 commented 1 year ago

cc: @khsbory, @ungjinPark, @dnz3d4c, @larry801 maybe this could be a good starting point for teamwork in the Asian communities to develop solutions for issues like this one and for similar problems in the IME window regarding compound characters.