languagetool-org / english-pos-dict

English POS and dictionary data

initialisms #18

Closed jaumeortola closed 2 months ago

jaumeortola commented 3 months ago

We have many all-uppercase words (more than 9,000), presumably acronyms, pending review. They are mostly untagged and come from the Hunspell dictionaries and spelling.txt. We can keep them in the dictionary this way:

JED=UNTAGGED=all

That will mean that they have no tag, and they won't be added to the tagger dictionary, but they will be in the speller dictionaries.

Attached file: alluppercase-words.txt
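To illustrate the intended effect, here is a minimal sketch of how such entries could be routed. It is not the repository's actual build script. It assumes each line has three `=`-separated fields (word, tag, and a third field such as `all`, whose exact meaning is not stated in this thread). The names `Entry`, `parse_entry` and `split_dictionaries` are hypothetical, and the `NASA=NNP=all` line is an invented sample used only for contrast.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Entry(NamedTuple):
    word: str
    tag: str
    scope: str  # third field, e.g. "all"; its meaning is an assumption here


def parse_entry(line: str) -> Entry:
    """Split a 'WORD=TAG=SCOPE' line into its three fields."""
    word, tag, scope = line.strip().split("=", 2)
    return Entry(word, tag, scope)


def split_dictionaries(lines: Iterable[str]) -> Tuple[Set[str], Dict[str, List[str]]]:
    """Return (speller words, tagger entries).

    Every word goes to the speller set; only words whose tag is not
    'UNTAGGED' are added to the tagger mapping.
    """
    speller: Set[str] = set()
    tagger: Dict[str, List[str]] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_entry(line)
        speller.add(entry.word)
        if entry.tag != "UNTAGGED":
            tagger.setdefault(entry.word, []).append(entry.tag)
    return speller, tagger


# 'JED=UNTAGGED=all' is from this issue; 'NASA=NNP=all' is a made-up sample.
sample = ["JED=UNTAGGED=all", "NASA=NNP=all"]
speller, tagger = split_dictionaries(sample)
```

With this sample, `JED` ends up accepted by the speller but absent from the tagger mapping, which is the behaviour described above.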

@AzadehSafakish @evan-defran-lt

AzadehSafakish commented 3 months ago

> That will mean that they have no tag, and they won't be added to the tagger dictionary, but they will be in the speller dictionaries.

I've looked at the list, and this seems acceptable to me.

jaumeortola commented 2 months ago

https://github.com/languagetool-org/english-pos-dict/commit/dde146e3f3ff84403c02fc7a7371cd5b5811d478
https://github.com/languagetool-org/english-pos-dict/commit/902f9fdaa45068d305045e84269b7eb8da602f63