languagetool-org / portuguese-pos-dict

Portuguese POS tagger
GNU Lesser General Public License v2.1

"email" with no hyphen is also correct #27

Closed dserodio closed 4 months ago

dserodio commented 4 months ago

"e-mail" can also be written with no hyphen, as you can see by searching for email site:uol.com.br or email site:folha.com.br, for instance, and it's also recognized by Priberam: https://dicionario.priberam.org/email

However, LanguageTool flags "email" as an error.

I'd love to submit a PR to fix this if someone can point me in the right direction.

Thanks!

p-goulart commented 4 months ago

@dserodio, indeed, the unhyphenated variant is widespread in usage and accepted by a minority of dictionaries. That being said, the Brazilian Academy of Letters has adjudicated in favour of e-mail. The Vocabulário Ortográfico da Língua Portuguesa continues to recommend only the form with the hyphen. From a strictly prescriptive point of view, the form email is non-standard.

We believe our speller should show users word forms that are to be considered correct in as many situations as possible. It is perfectly conceivable, in some contexts, that someone writing email will be told off for not adhering to the orthographic strictures imposed by the Academy.


That being said, I am personally also not very fond of this rule for a number of reasons.

@susanaboatto what do you think of accepting both hyphenated and unhyphenated forms in the base dictionary and having a picky/formal rule that enforces e-mail? That way most everyday writing shouldn't be affected, but we'd still have a safeguard for stuffier registers.

dserodio commented 4 months ago

> accepting both hyphenated and unhyphenated forms in the base dictionary and having a picky/formal rule that enforces e-mail? That way most everyday writing shouldn't be affected, but we'd still have a safeguard for stuffier registers.

This seems like a great solution

susanaboatto commented 4 months ago

> @susanaboatto what do you think of accepting both hyphenated and unhyphenated forms in the base dictionary and having a picky/formal rule that enforces e-mail? That way most everyday writing shouldn't be affected, but we'd still have a safeguard for stuffier registers.

I agree this should be the way to go.

p-goulart commented 4 months ago

@susanaboatto what should the specs for the rule be?

I'm thinking formal+academic TTs but picky=false. Does that make sense? Or, given the limited reach of the WGs for some LT clients, should it be picky=on instead?

susanaboatto commented 4 months ago

It would probably be best to make it picky in the grammar.xml file. I assume many people writing in a professional but laid-back setting would appreciate having this flagged. To be honest, the ideal solution would be a user option, but I suppose this is not on the cards for the near future, so picky it is.
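
For anyone following along, the behaviour agreed on above can be sketched roughly as follows. This is only a toy Python illustration of the logic, not LanguageTool code: the names BASE_DICTIONARY, PICKY_PREFERENCES, is_accepted_spelling and check are all hypothetical, and the real change would presumably be a dictionary entry plus a picky rule in grammar.xml.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only; these are NOT
# LanguageTool's real data structures or API.
BASE_DICTIONARY = {"e-mail", "email"}    # the speller accepts both spellings
PICKY_PREFERENCES = {"email": "e-mail"}  # enforced only when picky mode is on


@dataclass
class Match:
    offset: int      # character offset of the flagged word
    word: str        # the word as written
    suggestion: str  # the form recommended by the picky rule


def is_accepted_spelling(word: str) -> bool:
    """Toy speller check: both variants are valid in the base dictionary."""
    return word.lower() in BASE_DICTIONARY


def check(text: str, picky: bool = False) -> list[Match]:
    """Return style matches; outside picky mode nothing is flagged."""
    if not picky:
        return []
    matches = []
    # Treat hyphenated words such as "e-mail" as a single token.
    for m in re.finditer(r"[\w-]+", text):
        suggestion = PICKY_PREFERENCES.get(m.group().lower())
        if suggestion:
            matches.append(Match(m.start(), m.group(), suggestion))
    return matches
```

With this sketch, check("Envie um email.") returns nothing in the default mode, while check("Envie um email.", picky=True) yields a single match suggesting "e-mail"; the hyphenated form is never flagged in either mode.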