languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1
12.49k stars 1.4k forks

[pt] Fixing disambiguator: SPS00 to DA0MS0/DA0FS0 #6925

Open marcoagpinto opened 2 years ago

marcoagpinto commented 2 years ago

@jaumeortola

Hello!

For months I have been struggling with this while developing rules and antipatterns, and I know I should have shared it earlier, when you made the major changes to the disambiguator.

What I am facing is that “o” and “a” sometimes get tagged as SPS00, which confuses the rule logic, since SPS00 has no gender.

Here is an example. Check this text on the morphological dictionary site: o. a.

“a” also becomes “SPS00”, which breaks a lot of rules.

This is a major finding that could affect a lot of rules if fixed. I am even scared to think about it, but it is something that needs to be fixed.

What is your opinion, Jaume?

Thanks!

jaumeortola commented 2 years ago

I know that these words are ambiguous, and it can be difficult to fully disambiguate them.

Can you be more specific about the changes you need? Provide examples.

marcoagpinto commented 2 years ago

@jaumeortola

Right now, I can't remember any examples.

As I code antipatterns, I will try to find some.

marcoagpinto commented 2 years ago

@jaumeortola

Here is an example I found:

Quero que digas a verdade já!
Quero que digas as verdades já!

In the first sentence, “a” appears as “SPS00” and in the second as “DA0FP0”.
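
To illustrate why this matters for rules, here is a minimal Python sketch (not LanguageTool code). It assumes that, in the EAGLES-style tags quoted in this thread, the fourth character holds the gender for determiner (`D…`) and adposition (`S…`) tags. The helpers `gender_of` and `matches_feminine_article` are hypothetical names made up for this example.

```python
from typing import Optional


def gender_of(postag: str) -> Optional[str]:
    """Return 'M' or 'F', or None when the tag carries no gender.

    Assumption: for determiner (D...) and adposition (S...) tags,
    the gender is the fourth character, e.g. DA0FS0 -> 'F'.
    """
    if len(postag) >= 4 and postag[0] in ("D", "S"):
        gender = postag[3]
        return gender if gender in ("M", "F") else None
    return None


def matches_feminine_article(postag: str) -> bool:
    """Toy rule condition: the token must be a feminine article."""
    return postag.startswith("DA") and gender_of(postag) == "F"


# Tags reported in this thread for "a" / "as" in the two sentences.
reported = {
    "Quero que digas a verdade já!": "SPS00",
    "Quero que digas as verdades já!": "DA0FP0",
}

for sentence, tag in reported.items():
    print(sentence, tag, gender_of(tag), matches_feminine_article(tag))
```

With the tags reported above, the condition holds for “as” (DA0FP0) but not for “a” (SPS00), so a rule that relies on the article's gender would silently skip the first sentence.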