languagetool-org / english-pos-dict

English POS and dictionary data

Establish authoritative sources for each variant (or at least US and GB) #9

Open jaumeortola opened 7 months ago

jaumeortola commented 4 months ago

This table is useful to understand the main differences between variants: https://en.wikipedia.org/wiki/Oxford_spelling#Language_tag_comparison

Hunspell GB dictionary includes both Oxford and British spellings.

More lists of differences to check:
https://blog.collinsdictionary.com/language-lovers/9-spelling-differences-between-british-and-american-english/
https://www.examenglishforfree.com/the-differences-between-british-english-and-american-english-spelling/

AzadehSafakish commented 4 months ago

Might be useful: here is the full us2uk list I put together for this issue. I think, as a general rule, anything that's a US spelling can be applied to CA (unless the entry specifically says otherwise), and anything that's a GB spelling can be applied to AU, NZ, ZA (unless the entry specifically says otherwise). us2uk_expanded.json
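
A rough sketch of how that rule could be applied in Python. The entry layout here (`us`, `gb`, and an optional `overrides` map) is hypothetical, since the actual structure of us2uk_expanded.json isn't shown in this thread, and the two sample entries are only illustrations:

```python
# Hypothetical sketch: the real us2uk_expanded.json schema may differ.
# Default source spelling for each variant, following the general rule above.
DEFAULT_PARENT = {
    "en-US": "us",
    "en-CA": "us",
    "en-GB": "gb",
    "en-AU": "gb",
    "en-NZ": "gb",
    "en-ZA": "gb",
}


def spelling_for(entry: dict, variant: str) -> str:
    """Return an entry's spelling for a variant.

    An explicit per-variant override ("the entry specifically says
    otherwise") wins over the default US/GB parent.
    """
    overrides = entry.get("overrides", {})
    if variant in overrides:
        return overrides[variant]
    return entry[DEFAULT_PARENT[variant]]


# Illustrative entries only (assumed layout, not taken from the JSON file).
entries = [
    {"us": "analyze", "gb": "analyse"},
    {"us": "defense", "gb": "defence", "overrides": {"en-CA": "defence"}},
]

for entry in entries:
    print({variant: spelling_for(entry, variant) for variant in DEFAULT_PARENT})
```

How well this works depends on how many overrides a variant needs: if most entries for a variant carry an override, the default parent isn't buying much.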

jaumeortola commented 4 months ago

> Anything that's a US spelling can be applied to CA

Is this general enough? According to the Wikipedia table, en-CA matches en-US in organization, realize, aging, analyze, but not in defence, licence, traveller.
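
A quick way to check this kind of assumption, sketched in Python. The en-CA column uses only the words cited above; the US/GB columns are the usual spelling pairs for those words (with the -ise forms on the GB side), and the function name is just for illustration:

```python
# (en-US, en-GB, en-CA) spellings for the words mentioned above.
SAMPLES = [
    ("organization", "organisation", "organization"),
    ("realize", "realise", "realize"),
    ("aging", "ageing", "aging"),
    ("analyze", "analyse", "analyze"),
    ("defense", "defence", "defence"),
    ("license", "licence", "licence"),
    ("traveler", "traveller", "traveller"),
]


def split_by_parent(samples):
    """Split en-CA spellings into those matching en-US and those matching en-GB."""
    us_like = [ca for us, gb, ca in samples if ca == us]
    gb_like = [ca for us, gb, ca in samples if ca == gb and ca != us]
    return us_like, gb_like


us_like, gb_like = split_by_parent(SAMPLES)
print("en-CA matches en-US:", us_like)
print("en-CA matches en-GB:", gb_like)
```

With these seven words, four follow en-US and three follow en-GB, so a blanket "US applies to CA" rule would get the latter three wrong.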

AzadehSafakish commented 3 months ago

> Is this general enough?

No, you're right, it's too general. Canadian English is a bit more of a mash-up. Wikipedia is the only "nice" table I can find at the moment; other sources are far less concise.

Editing Canadian English also has a few tables (open "Read sample" and scroll about three-quarters of the way down to '3.5 Variant spellings by category'), but they more or less agree with Wikipedia.

The Wikipedia table might be the simplest and quickest solution for CA, unless anyone can find an easy online resource for the Canadian Oxford Dictionary (that we can get a word list from).