en-wl / wordlist

SCOWL (and friends).
http://wordlist.aspell.net

SI unit words #260

Open getsnoopy opened 4 years ago

getsnoopy commented 4 years ago

The SI unit word for the base unit "metre" and the non-SI unit word accepted for use with the SI "litre" should be included in the en_US list. I work at the US Metric Association, and can say both of these words are valid. In fact, they're the only words for those units that are valid, as unit words are not susceptible to the whims of English dialect evolution, but are controlled by the International Bureau of Weights and Measures (BIPM). And the BIPM spells those root units as "metre" and "litre". The "liter" and "meter" variants should be removed, and/or supplemented by the official "metre" and "litre" standards. This would mean not only including the words "metre" and "litre", but also their prefixed derivatives (e.g., "millimetre", "kilometre", "millilitre", "kilolitre", etc.).

Also, it seems like the dictionar(y/ies) in general don't have some prefixed derivatives for units, like the kilojoule, megametre, gigawatt, etc. Prefixes going up/down to "tera-" and "nano-" should be included for the most common units, if not all of them.
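One way to find which prefixed derivatives are missing is to generate the candidates and compare them against a word list. The following is only a rough sketch: the unit selection, the function names `prefixed_units` and `missing_from`, and the idea of passing the word list in as plain lines are all assumptions for illustration, not part of SCOWL's tooling.

```python
# SI prefixes from "nano-" up to "tera-" (the range suggested above).
SI_PREFIXES = (
    "nano", "micro", "milli", "centi", "deci",
    "deca", "hecto", "kilo", "mega", "giga", "tera",
)

# A hypothetical selection of "most common" units; adjust as needed.
BASE_UNITS = ("metre", "litre", "joule", "watt")


def prefixed_units(units=BASE_UNITS, prefixes=SI_PREFIXES):
    """Return every prefix+unit combination, e.g. 'kilojoule'."""
    return [prefix + unit for unit in units for prefix in prefixes]


def missing_from(wordlist, candidates):
    """Return the candidates that do not appear in wordlist (case-insensitive)."""
    known = {word.strip().lower() for word in wordlist}
    return [c for c in candidates if c not in known]


if __name__ == "__main__":
    # Stand-in for the lines of a real word list file.
    sample_wordlist = ["kilojoule", "gigawatt", "kilometre"]
    for word in missing_from(sample_wordlist, prefixed_units()):
        print(word)
```

The output would still need a manual pass, since not every combination is common enough to justify inclusion.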

getsnoopy commented 4 years ago

Btw, I'm more than happy to take on the task of adding those words in.

kevina commented 4 years ago

I do not know any American who spells meter as "metre"; it is considered a British spelling of the word: https://www.merriam-webster.com/dictionary/metre. Just because it is accepted as a valid spelling by the US Metric Association does not mean it should be included.

I might be willing to mark it as a variant so it is included if you use a dictionary which includes variants.

I am open to additional units that are missing for the most common units. Please check with http://app.aspell.net/lookup. But again only the US spelling will appear in the main word list.

@biljir what are your thoughts?

biljir commented 4 years ago

I tend to side with Kevin. I admit that having an international standard which specifies only the spelling "metre" seems like it should have some weight, but certainly excluding the spelling "meter" would be wrong (not least because that spelling has other meanings besides a unit of length). I think the fact that the standard spelling is not observed in American usage in general is a good reason not to include it. It is of course possible that there are certain contexts where the standard usage is observed, even by Americans, but I assume there is nothing a spelling checker can easily do to determine whether it is being used in such a context.

getsnoopy commented 4 years ago

@kevina I know many who do. And it's accepted as a valid variant (it's actually not a "British" spelling but an international spelling; MW is normally quite biased, but in this case it's simply incorrect). But the point isn't what is and isn't accepted by the USMA, it's that it's the only valid variant published by the BIPM, which is the authority on these matters. In effect, "meter" (used to mean the fundamental unit of length in the SI) is incorrect. So at the very least, it should co-exist in the main dictionary (not the one with variants).

getsnoopy commented 4 years ago

@biljir Yes, of course; the word "meter" means something: a device that measures something. So we'd want things like "water meter", "gas meter", "thermometer", "speedometer" to validate. But "kilometer" and other prefixed variants of the SI unit shouldn't. Or at the very least, be considered co-equal variants of the standard.
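The distinction drawn here (device words ending in "-meter" versus SI-prefixed unit names) can be sketched as a simple prefix check. This is a hypothetical illustration only: the function name `is_prefixed_unit_meter` and the prefix list are assumptions, and no such rule exists in the word list tooling.

```python
# SI prefixes from "nano-" up to "tera-"; assumed range for this sketch.
SI_PREFIXES = (
    "nano", "micro", "milli", "centi", "deci",
    "deca", "hecto", "kilo", "mega", "giga", "tera",
)


def is_prefixed_unit_meter(word):
    """True if word is exactly an SI prefix followed by 'meter'."""
    w = word.lower()
    suffix = "meter"
    return w.endswith(suffix) and w[: -len(suffix)] in SI_PREFIXES


if __name__ == "__main__":
    for word in ("kilometer", "thermometer", "speedometer", "meter", "micrometer"):
        print(word, is_prefixed_unit_meter(word))
```

Note that "micrometer" matches the check even though it is also the name of a measuring instrument, so a purely word-level rule cannot separate the two senses without context.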

kevina commented 4 years ago

Sorry, as a general policy I do not include variants in the main dictionary, in order to promote consistent spelling. I can find very little evidence that "metre" is accepted as the spelling for "meter" in US English. Most sources suggest otherwise: https://grammarist.com/spelling/meter-metre/, https://www.merriam-webster.com/dictionary/meter (definition 3).

getsnoopy commented 4 years ago

I understand, but this is an exceptional case. It is allowed by law, and there is plenty of evidence to show that they're acceptable for use:

getsnoopy commented 4 years ago

Any thoughts?

kevina commented 4 years ago

None of those sources convinced me that "metre" should be accepted in the standard American dictionary. At most they only say "metre" is acceptable, hence I consider it a variant. The US Metric Association website (usma.org) also uses the "-re" spelling in the few pages I checked.

I maintain variant information in VarCon. What I am willing to do is create a separate spelling category that uses the BIPM spelling for metric units but standard American spelling elsewhere. We can call it the 'M' category in varcon.txt. To use it you can create a custom dictionary from http://app.aspell.net/create.

Also, if there are some prefixed derivatives for units that are missing I will consider adding them if they are common enough.

getsnoopy commented 4 years ago

I want to underscore the fact that metric unit words (or any unit words, for that matter) are not regular dictionary words that are subject to dialectal variation (this is a common misconception), but are defined in internationally agreed specifications, at least when it comes to French and English. There's a reason acre is spelled with the "-re" suffix even in American English, and the unit katal can't be spelled as catal, for example. The bias of publications like Merriam-Webster frustrates the standard use of these terms. Therefore, the "American spellings" of these terms are the variants, not the other way around.

If adding the correctly spelled terms to the main dictionary is still not possible, I welcome the creation of a new category for these words. However, could we have this variant dictionary also available in the set of files you release as part of the SourceForge project? Creating a custom dictionary every time to keep up with updates in the project is cumbersome.

As for other measuring words that are missing, here are the ones I'm proposing be added to all the dictionaries:

getsnoopy commented 4 years ago

Any progress on this front?

Meekohi commented 4 years ago

fwiw I agree with Kevin and am skeptical of the value of even adding these as variants... from wikipedia:

The SI symbols for the metric units are intended to be identical, regardless of the language used but unit names are ordinary nouns and use the character set and follow the grammatical rules of the language concerned. For example, the SI unit symbol for kilometre is "km" everywhere in the world, even though the local language word for the unit name may vary. Language variants for the kilometre unit name include: chilometro (Italian), Kilometer (German), kilometer (Dutch), kilomètre (French), χιλιόμετρο (Greek), quilómetro/quilômetro (Portuguese), kilómetro (Spanish) and километр (Russian).

The use of kilometer is multiple orders of magnitude greater than kilometre in the US geography: https://trends.google.com/trends/explore?geo=US&q=kilometer,kilometre

Alas, language is not decided upon by international committees. 🤷‍♂

getsnoopy commented 4 years ago

Agreed, which is why I qualified it with "at least when it comes to English and French", since the SI is published in both of those languages. It's the same as how the proper spellings of the elements' names in English are "aluminium" and "sulfur", not "aluminum" and "sulphur", respectively, because the IUPAC standardizes those. Words created from standards don't quite fall into the same category as "common" words, so international committees do actually decide upon them. And the usage thing is more of a chicken-and-egg problem: if the spell checker says that "kilometre" is incorrect, then people will very quickly be taught that it indeed is and switch over.

But regardless, I was referring more to the additional unit words being added to the dictionary.

kevina commented 4 years ago

I generally add words in batches, but will likely add most of the words you suggested. "radian" is already in the upstream dictionary.