Delayed character descriptions in Nepalese

Qchristensen commented 4 months ago

Steps to reproduce:

Enable "Delayed descriptions for characters on cursor movement" in NVDA's speech settings.
Ensure language is set to Nepali. (User is using eSpeak-NG synthesizer)
Navigate through text in Nepalese and pause on each character.
Press Numpad 5 three times quickly

Actual behavior:

Advised by user that neither command works now although these still work in English and both used to work in Nepalese in 2022.4 and earlier.

The user sent a log of pressing numpad 5 three times quickly; the relevant fragment reads:

IO - inputCore.InputManager.executeGesture (20:13:30.877) - winInputHook (7452): Input: kb(desktop):rightArrow
IO - speech.speech.speak (20:13:30.895) - MainThread (11992): Speaking ['न्ति']
IO - inputCore.InputManager.executeGesture (20:13:31.710) - winInputHook (7452): Input: kb(desktop):numpad5
IO - speech.speech.speak (20:13:31.726) - MainThread (11992): Speaking ['कान्तिपुर ']
IO - inputCore.InputManager.executeGesture (20:13:31.878) - winInputHook (7452): Input: kb(desktop):numpad5
IO - speech.speech.speak (20:13:31.908) - MainThread (11992): Speaking [CharacterModeCommand(True), 'क', EndUtteranceCommand(), 'ा', EndUtteranceCommand(), 'न', EndUtteranceCommand(), '्', EndUtteranceCommand(), 'त', EndUtteranceCommand(), 'ि', EndUtteranceCommand(), 'प', EndUtteranceCommand(), 'ु', EndUtteranceCommand(), 'र', EndUtteranceCommand()]
IO - inputCore.InputManager.executeGesture (20:13:32.015) - winInputHook (7452): Input: kb(desktop):numpad5
DEBUG - characterProcessing.CharacterDescriptions.init (20:13:32.054) - MainThread (11992): Loaded 26 entries.
IO - speech.speech.speak (20:13:32.054) - MainThread (11992): Speaking [CharacterModeCommand(True), 'क', EndUtteranceCommand(), 'ा', EndUtteranceCommand(), 'न', EndUtteranceCommand(), '्', EndUtteranceCommand(), 'त', EndUtteranceCommand(), 'ि', EndUtteranceCommand(), 'प', EndUtteranceCommand(), 'ु', EndUtteranceCommand(), 'र', EndUtteranceCommand()]
DEBUG - speech.manager.SpeechManager._handleIndex (20:13:32.055) - MainThread (11992): Unknown index 36, speech probably cancelled from main thread.

Expected behavior:

These commands should echo the character description when working in Nepalese, as they do in English.

NVDA logs, crash dumps and other attachments:

System configuration

NVDA installed/portable/running from source:

NVDA version:

NVDA 2023.3.3 - also previously reported when 2023.1 came out.

Windows version:

Windows version: Windows 10 21H2 (10.0.19044) workstation AMD64

Name and version of other software in use when reproducing the issue:

Other information about your system:

Other questions

Does the issue still occur after restarting your computer?

Have you tried any other versions of NVDA? If so, please report their behaviors.

If NVDA add-ons are disabled, is your problem still occurring?

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

CyrilleB79 commented 4 months ago

Hi

The issue's description is not precise enough, so I am not able to reproduce it exactly. STR: Where was the test made? Notepad, a browser? Naming an application would help in reproducing the issue. Actual behaviour: "The feature is not working" is not a helpful description for someone trying to reproduce. Please indicate at which step something fails, and what is reported when it differs from what is expected.
Some fields in the description are empty, mainly the OS used and whether it has been tried without add-ons.
Full logs (not only extracts) from NVDA 2023.3.3 (not working) and 2022.4 (working) in debug mode would be helpful. If possible, logs from a clean config and with NVDA's language set to English would be even better. For this, launch:
nvda --lang=en -c %tmp%\nvdaTestConfigFolderWhichDoesNotYetExist
Of course, if setting NVDA's language to English prevents the issue from reproducing, ignore this request.
On my side, I am not able to reproduce exactly. I get the same behaviour in NVDA 2022.4 and 2023.3.3. Pressing numpad5 three times seems to spell the word correctly with character descriptions. But delayed character description does not work on any character of the word "कान्तिपुर" except the last one, "र". Maybe this is because all the characters of the word are compound Unicode characters, except the last one.
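To make the "compound character" observation concrete, here is a minimal sketch that lists the code points of the word from the log together with their Unicode general categories. It only uses Python's standard `unicodedata` module; the helper name `classify` is made up for illustration and is not part of NVDA.

```python
import unicodedata

# The word from the log, written with escapes so the code points are unambiguous.
WORD = "\u0915\u093e\u0928\u094d\u0924\u093f\u092a\u0941\u0930"  # कान्तिपुर


def classify(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


for row in classify(WORD):
    # Categories starting with "M" are combining marks (vowel signs, virama).
    print(*row)
```

In this word every base letter except the final र is followed by at least one combining mark, which matches the observation that only the last character gets a delayed description.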

@Qchristensen do you think you can get the missing information to help the investigation?

Or is anyone else able to reproduce and help by giving more information?

Thanks.

zstanecic commented 4 months ago

Hi @CyrilleB79 and @Qchristensen, This is a wider problem, which cannot be answered thoroughly by giving logs and filling in issue templates. Let me explain. The reason the delayed character descriptions don't work is that NVDA sometimes gets two or three characters at once when navigating by character with the left and right arrow keys. It also depends on the application in which you are trying to spell the text. This is a problem in how NVDA processes Devanagari, not in the synthesizers themselves. I can reproduce this with Nepali RHVoice as well as with eSpeak.

In Microsoft Word we have the word: कान्तिपुर. When using the arrow keys, we get the actual syllables of this word: का, न्ति, पु and finally र. The syllable न्ति has a halanta character attached, i.e. a virama. This character marks the absence of the true schwa or o sound realized in Nepali speech.

When trying to spell this in Notepad on Windows 11, I cannot spell it at all. I get individual characters at incorrect positions; for example, for न्ति I get only the Devanagari na character. I think that it has something to do with Notepad and UIA.

So when we want individual spelling, we need to get na, halant, ta, i when moving with the left and right arrow keys. I hope that I have explained where the problem is.
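The syllable grouping described above can be approximated with a few lines of Python. This is only a rough sketch under stated assumptions: it is not NVDA's code, it is not the full Unicode text segmentation algorithm, and the function name `approximate_clusters` is hypothetical. It attaches combining marks to the preceding letter and keeps a consonant together with a preceding virama.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)


def approximate_clusters(text):
    """Group code points into rough syllable-like units.

    A code point joins the current unit when it is a combining mark
    (general category M*) or when it directly follows a virama.
    """
    clusters = []
    for ch in text:
        joins_previous = bool(clusters) and (
            unicodedata.category(ch).startswith("M")
            or clusters[-1].endswith(VIRAMA)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


word = "\u0915\u093e\u0928\u094d\u0924\u093f\u092a\u0941\u0930"  # कान्तिपुर
print(approximate_clusters(word))
```

For this word the sketch yields four units, the same का, न्ति, पु, र that Word and browsers are reported to give when arrowing. An application that moves one code point at a time would instead expose nine separate positions.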

zstanecic commented 4 months ago

Note that Microsoft Word and browsers get the true syllables when moving with the arrow keys.

CyrilleB79 commented 4 months ago

I do not know anything about Devanagari characters. In the examples provided, it seems that they use many combinations to create compound characters from single Unicode characters: normal characters, characters that combine two other ones, combining diacritics, etc.

Would there be a way to provide examples with characters that may be more understandable for me, something more Western-centered? I am thinking of:

the character ""
the character "👨‍🦯" compound from two emojis (man and white cane) linked by a zero-width joiner.
the character "é" containing "e" and the combining diacritic "acute accent" (created from the single character with: "import unicodedata;unicodedata.normalize('NFD', 'é')")
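The two Latin and emoji examples above can be inspected with the standard library alone. The following is a minimal sketch; the helper `code_points` is a made-up name used only for illustration.

```python
import unicodedata


def code_points(text):
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


composed = "\u00e9"  # "é" as a single precomposed character
decomposed = unicodedata.normalize("NFD", composed)  # "e" + combining acute accent
emoji = "\U0001F468\u200D\U0001F9AF"  # man + zero-width joiner + white cane

print(code_points(composed))    # one code point
print(code_points(decomposed))  # two code points
print(code_points(emoji))       # three code points
```

Both decomposed forms are displayed as a single character, yet a program that walks the string one code point at a time sees two or three items, which is the same situation as with the Devanagari syllables.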

@zstanecic, since you have understood the issue better than I have, are you able to reproduce the difference between 2022.4 and 2023.3.3?

There are probably various problems covered in this issue. But at least the regression should be looked at first. In contrast, the issue with compound characters not behaving the same way depending on the application is older, and there is probably already a dedicated issue.

zstanecic commented 4 months ago

Hi @CyrilleB79 and all, There is no difference between 2022.4 and this version. The compound characters have been an issue since version 2019.3. There is nothing Western-centered here; you need to understand Eastern-centered things to understand this issue. We have the same issue in Arabic. We cannot fully spell the following text with the arrow keys: بِسْمِ ٱللّٰهِ ٱلرَّحْمٰنِ ٱلرَّحِيمِ⁩ بِسْمِ ٱللّٰهِ ٱلرَّحْمٰنِ ٱلرَّحِيمِ⁩ بِسْمِ ٱللّٰهِ ٱلرَّحْمٰنِ بِسْمِ ٱللّٰهِ ٱلرَّحْمَٰنِبِسْمِ ٱللّٰهِ ٱلرَّحْمَٰنِ When arrowing with the left and right arrows, you will get two characters spelled out.

zstanecic commented 4 months ago

Note that the example here uses Arabic Quranic vocalization.

Adriani90 commented 4 months ago

I think I got this problem. Working with laptop layout: The phonetic description is available when pressing NVDA+dot twice for many characters, but not for all of them. In the case of Nepali and Arabic there are long descriptions, but I don't understand them. Which descriptions are expected exactly? Is there a database of official phonetic character descriptions for these languages? Looking at the log snippet @Qchristensen posted, there is only one character without a long description, which is र. The other characters seem to have long descriptions, but I am not sure whether such a description is just the elements that make up the whole character or a usual word in that language. If the user expects a word as the phonetic description, then we should check whether a database for this exists at all in the corresponding language.

Nevertheless, we could already try to include the same long description given by NVDA+dot twice in left and right arrow key navigation when the corresponding checkbox for long descriptions is enabled in NVDA's speech settings. This is probably missing for many languages because the person who implemented this was not sure whether the current long descriptions make sense at all.

zstanecic commented 4 months ago

Hi Adriani,

We already have long descriptions. The actual problem is a different one.

The phonetic descriptions don't work because a compound character is passed as two or more characters at once, rather than as one individual character.
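One possible way to handle such a unit, sketched here purely as an illustration, is to look up the whole unit first and fall back to describing each code point separately. Everything in this sketch is an assumption: `describe_unit` and the `descriptions` mapping are hypothetical and do not reflect NVDA's characterProcessing module, and Unicode names stand in for real localized character descriptions.

```python
import unicodedata


def describe_unit(unit, descriptions):
    """Return a list of descriptions for one navigation unit.

    `descriptions` is a hypothetical mapping from a string to its description.
    If the whole unit has no entry, each code point is described on its own,
    using its Unicode name as a stand-in when the mapping has no entry either.
    """
    if unit in descriptions:
        return [descriptions[unit]]
    return [
        descriptions.get(ch) or unicodedata.name(ch, f"U+{ord(ch):04X}")
        for ch in unit
    ]


syllable = "\u0928\u094d\u0924\u093f"  # न्ति
for line in describe_unit(syllable, {}):
    print(line)
```

With an empty mapping, the syllable न्ति is described as four separate items (na, virama, ta, vowel sign i), which corresponds to the "na, halant, ta, i" spelling requested earlier in the thread.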


CyrilleB79 commented 4 months ago

@zstanecic wrote:

Hi @CyrilleB79 and all, There is no difference between 2022.4 and this version.

but in the initial description, @Qchristensen wrote:

Advised by user that neither command works now although these still work in English and both used to work in Nepalese in 2022.4 and earlier.

So either the reporting of @Qchristensen is incorrect, or the spelling issue with combined characters mentioned by @zstanecic is not the issue reported initially here.

I agree of course that the spelling issue with combined characters is real, and it is also reproducible with the Latin alphabet, by the way. But I need to understand what was useful to the user in 2022.4 and has been broken in subsequent versions of NVDA.

By the way @zstanecic, I would be quite surprised if there is not already another GitHub issue describing the spelling issue with combined characters.

zstanecic commented 4 months ago

Hi,

There is an issue, but actually nobody cared, and I cannot recall the number at the moment.

Showharda commented 4 months ago

Hi!

NVDA has had issues announcing the spelling of compound Nepali characters for years now. Until 2022.4, delayed description of characters did not work for compound characters, but it did work for simple characters, and pressing the Numpad 5 key three times in quick succession announced the characters of the word with examples. Since 2023.1, delayed description of characters has stopped working even for simple characters, and pressing the Numpad 5 key three times in quick succession no longer announces Nepali characters with examples.

Showharda commented 3 months ago

I request you to respond.
