nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

Figure number is misinterpreted as date in website #12584

Closed AliciaHof closed 3 years ago

AliciaHof commented 3 years ago

Steps to reproduce:

Open https://test.geogebra.org/~authoring/playground/figure_numbers.html. This page was created solely to test the bug. It contains only the text: "Interactive Figure 2.6.10 Interactive Figure 3.4.2"

Actual behavior:

"Interactive Figure 2.6.10" is read as "Interactive Figure February sixth ten". "Interactive Figure 3.4.2" is read as "Interactive Figure three point four point two".

Expected behavior:

"Interactive Figure 2.6.10" is read as "Interactive Figure two point six point ten". "Interactive Figure 3.4.2" is read as "Interactive Figure three point four point two".

System configuration

NVDA installed/portable/running from source:

installed

NVDA version:

2020.4

Windows version:

Windows 10 Home

Name and version of other software in use when reproducing the issue:

Chrome Version 91.0.4472.114

Other information about your system:

Other questions

Does the issue still occur after restarting your computer?

yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

yes, older version - same behaviour

If add-ons are disabled, is your problem still occurring?

I don't have any add-ons installed (I'm a web dev testing my content)

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

yes

AAClause commented 3 years ago

I'm unable to reproduce the issue. What speech synthesizer are you using? What is your language variant?

AliciaHof commented 3 years ago

Language: English (en). Synthesizer: Windows OneCore voices. I think those were the default settings when I installed NVDA.

AAClause commented 3 years ago

OK, I was able to reproduce it with other languages and synthesizers too. Hmm, I have the impression that the option "Trust voice's language when processing characters and symbols" (in the Speech settings) is buggy when the NVDA interface is set to the voice language (except with eSpeak?)... Or perhaps I got my wires crossed ;)

AliciaHof commented 3 years ago

Thanks for checking this so quickly! What's the default setting for most users? I'm not a user, I'm developing accessible web content and am testing my content with NVDA.

geoffshang commented 3 years ago

In my experience, this sort of thing is caused by the TTS engine (speech synthesiser) trying to be too clever.

I've never found a way to prevent this. If someone knows of a way, please do humanity a favour and tell us all how.

XLTechie commented 3 years ago

You can do regular expressions to cover these in the voice dictionary.

For example, OneCore pronounces NVDA version numbers as dates as well.

OneCore seems to think that periods are date separators. I have no idea if there is any language for which that is the case.
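
A minimal sketch of the kind of regular expression meant here, assuming the dictionary accepts Python-style patterns with `\1`-style group references. The pattern, the replacement, and the helper name `apply_entry` are illustrative assumptions, not taken from NVDA's documentation. The idea is to rewrite a dotted three-part number before it reaches the synthesizer, so that no periods are left to be mistaken for date separators:

```python
import re

# Hypothetical dictionary entry (illustrative only):
# match exactly three groups of digits separated by periods.
PATTERN = r"\b(\d+)\.(\d+)\.(\d+)\b"
# Replace each period with the word "point".
REPLACEMENT = r"\1 point \2 point \3"


def apply_entry(text: str) -> str:
    """Rewrite dotted three-part numbers so they no longer look like dates."""
    return re.sub(PATTERN, REPLACEMENT, text)


print(apply_entry("Interactive Figure 2.6.10"))
# Interactive Figure 2 point 6 point 10
```

Note the limits of this sketch: it only handles numbers with exactly three parts, and it would also rewrite genuine dates written with periods, so such an entry is a per-user workaround rather than a general fix.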

AAClause commented 3 years ago

AliciaHof wrote: "What's the default setting for most users?"

For me there are as many configurations as there are users!

XLTechie wrote: "You can do regular expressions to cover these in the voice dictionary."

Yes. But it's not for everyone :)

XLTechie wrote: "OneCore seems to think that periods are date separators."

OneCore is not the only one. For me it is the same with Nuance Vocalizer, Nuance Vocalizer Expressive, Microsoft Speech API version 5, Code Factory Vocalizer and Acapela TTS. This is not the case with eSpeak.

Brian1Gaff commented 3 years ago

This sounds like a classic Microsoft assumption in the synth to me. We get a lot of this sort of thing. Microsoft and others really need a way to turn off popular abbreviation expansion in their systems... British postcodes, for example, are often read as "sb1 2 New York" when the text is 2NY.

eSpeak is only the default on Windows 7, and it has another annoying trait: it reads VI, used as an abbreviation for visually impaired, as Roman six. Brian

gregjozk commented 3 years ago

Hi,

@AliciaHof the only advice I can give you is to use the "speech viewer" found under NVDA -> Tools. It shows on screen what NVDA has sent to the TTS. Sadly, how the TTS interprets the text sent to it is mostly not under NVDA's control. More and more often, TTS engines try to be clever, and then we see these things. In other words, there is almost nothing developers can do on NVDA's side, but the TTS makers can greatly improve their engines, either by upgrading that intelligence or simply by partly removing it, e.g. by speaking abbreviations and other words as they are written. I hope this helps.

regards, Jožef

AliciaHof commented 3 years ago

Thanks so much for all the responses! I checked the NVDA speech viewer and it sends the correct info.

Adriani90 commented 3 years ago

Given that NVDA is sending the correct info, this issue seems to originate from the TTS engine and there is no way to fix it here. One could change the priorities in NVDA and try not to honor the synth pronunciation, but I guess this would break other things. I suggest contacting the TTS developers to fix this. You can of course refer to this issue.

For now I am closing this issue here, but if you have any further ideas or suggestions, we can reopen it.