davidacm / NVDA-IBMTTS-Driver

This project is aimed at developing and maintaining the NVDA IBMTTS driver. IBMTTS is a synthesizer similar to Eloquence. Please send your ideas and contributions here!
GNU General Public License v2.0

weird expanding of abbreviations #76

Open ns-studios opened 2 years ago

ns-studios commented 2 years ago

I'm not sure if anything can be done about this, but at least with ECI 6.1, it seems to expand certain abbreviations even if I have phrase abbreviation dictionary off. Examples include:

dollar-based:

A$ $A C$ $C S$ $S $5.50 C$5.50 $C5.50 A$5.50 $A5.50 S$5.50 $S5.50

digit followed by a lowercase abbreviation:

0-9mgt 0-9mtn 0-9mts 0-9km 0-9dm 0-9cm 0-9mm 0-9dl 0-9ml 0-9mg 0-9mi 0-9h 0-9hr 0-9min 0-9yr 0-9hp

Conclusion

None of those would necessarily be a problem if they weren't expanded no matter where they appear. As it is, a registration key or similar string such as 9dm6yr gets expanded, and so does a variable like $CStr="test". I wonder if I'm just using the wrong version of the libraries, or if there's a setting I'm missing? JAWS does not seem to do this, for whatever reason.

davidacm commented 2 years ago

I did not notice it, because my main language is Spanish; for English I use another synth. This issue happens with American English, UK English, and Korean. I think a regexp replacement would fix it. I currently have no time because I'm very busy with some personal projects, but I will solve this if no one else fixes it first.

Mohamed00 commented 2 years ago

In my opinion, this really isn't something the add-on should try to change. This behavior comes from ECI itself, and I disagree with altering it in the add-on, especially for the $5.50 example, as I feel that forcibly disabling things like this imposes a personal preference on users. For your dollar examples, you should probably not pass the dollar sign to the synthesizer.
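A minimal sketch of what "not passing the dollar sign" could look like as a text filter. The function name and the rule it applies (drop a `$` only when a letter follows it, so that amounts like $5.50 are left alone) are assumptions made for illustration. They are not part of the add-on or of ECI, and cases such as A$ or C$5.50 are deliberately left untouched here.

```python
import re

# Hypothetical filter: remove "$" only when it is directly followed by a letter,
# e.g. in a variable name such as $CStr. Currency amounts like $5.50 are kept.
DOLLAR_BEFORE_LETTER = re.compile(r"\$(?=[A-Za-z])")

def drop_dollar_before_letter(text: str) -> str:
    """Return text with every "$" that precedes a letter removed."""
    return DOLLAR_BEFORE_LETTER.sub("", text)

if __name__ == "__main__":
    for sample in ['$CStr="test"', "$5.50", "A$"]:
        print(repr(sample), "->", repr(drop_dollar_before_letter(sample)))
```

Whether this is desirable depends on the user, which is why it is sketched as a standalone filter rather than as a change to how the synthesizer is driven.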

davidacm commented 2 years ago

This could be an option in the settings. I see it as a problem when reading code, e.g. when programming or writing LaTeX, LilyPond, etc.

Mohamed00 commented 2 years ago

Try adding these expressions to your NVDA dictionary:

Pattern: (\d+)([a-z]+)
Replacement: \1 \2
Type: regular expression

Pattern: (\d+)(yr)
Replacement: \1 YR
Type: regular expression
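To see what these two rules do to a string before it reaches the synthesizer, here is a rough Python sketch that applies them with `re.sub`. The function name and the rule order are assumptions: the more specific "yr" rule is run first, because the generic rule would otherwise already have turned "6yr" into "6 yr" and the "yr" rule would never match. Matching is assumed to be case sensitive. This only shows the text transformation, not how ECI then reads the result.

```python
import re

# The two dictionary entries above as (pattern, replacement) pairs.
# Assumption: the specific "yr" rule is applied before the generic one.
RULES = [
    (re.compile(r"(\d+)(yr)"), r"\1 YR"),
    (re.compile(r"(\d+)([a-z]+)"), r"\1 \2"),
]

def preprocess(text: str) -> str:
    """Apply each rule in order to the whole string and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

if __name__ == "__main__":
    for sample in ["9dm6yr", "5km", "10min"]:
        print(sample, "->", preprocess(sample))
```

Note that in a run like 9dm6yr the generic rule only separates a digit from the letters that directly follow it, so the "dm6" part stays joined.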

ns-studios commented 2 years ago

Thank you. I had something similar in place, but was wondering if it was perhaps an internal synth toggle or something. Can't really say I understand why they did it like that.

Mohamed00 commented 2 years ago

Yeah they made some... very weird decisions. You can turn off most of the abbreviation handling with the option, but some things can't be disabled that way. IBM almost got this right by allowing most of their currency abbreviations to be disabled, but they didn't allow that for everything, and those binaries have weirdness of their own.

Neurrone commented 1 year ago

This has been driving me crazy. I noticed that this only happens in Firefox for some reason; I haven't tested with other web browsers. If I paste text like "Deceiving Defender: The Big Stack Bypass" into Notepad, the "dec" in "Deceiving" does not get read as "December".