en-wl / wordlist

SCOWL (and friends).
http://wordlist.aspell.net

Missing words (technical terms) #253

Closed ideasman42 closed 5 years ago

ideasman42 commented 5 years ago

I believe these words are spelled correctly and should be part of the default aspell-en package used on many systems.

kevina commented 5 years ago

I am leaving out performant, as a Google search reveals that many people don't think it is a real word.

I added:

accessor completer completers enqueue enqueued enqueues prepend serializable unary variadic

parallelization parallelized

kevina commented 5 years ago

I also added gimble, but only to the large dictionary.

kevina commented 5 years ago

Thank you for the list.

ideasman42 commented 5 years ago

Any reasons these words weren't added?

kevina commented 5 years ago

I can't add every word, sorry. I don't recognize either polygonization or confusticate, and both have a very low frequency in app.aspell.net/lookup-freq:

Word                 |  Adj. Freq   Newness Rank | Normal dict | Large dict
  similar words      |  (per million)            | should incl | should incl
---------------------|---------------------------|-------------|-------------
rasterize            |       0.0378  1.5  211763 |    **       |    ***  

polygonization       |       0.0256  0.5  266622 |    **       |    **   

confusticate         |       0.0004  1.3 3014616 |    *        |    *    
  confisticate       |      0.7x     1.4 3562287 |    *        |    *    

rasterize also doesn't seem very common.
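
For anyone who wants to compare several candidate words at once, the lookup output above can be parsed with a few lines of Python. This is only a sketch: the column layout is taken from the table as pasted here, the names `FreqRow`, `parse_row`, and `parse_table` are made up for illustration, and the star columns are simply counted because their exact meaning is not spelled out in this thread.

```python
from typing import List, NamedTuple, Optional


class FreqRow(NamedTuple):
    word: str
    adj_freq: float    # "Adj. Freq" column, per million
    newness: float
    rank: int
    normal_stars: int  # number of '*' under "Normal dict should incl"
    large_stars: int   # number of '*' under "Large dict should incl"


# Rows copied from the table pasted above (trailing spaces trimmed).
SAMPLE = """\
Word                 |  Adj. Freq   Newness Rank | Normal dict | Large dict
  similar words      |  (per million)            | should incl | should incl
---------------------|---------------------------|-------------|-------------
rasterize            |       0.0378  1.5  211763 |    **       |    ***
polygonization       |       0.0256  0.5  266622 |    **       |    **
confusticate         |       0.0004  1.3 3014616 |    *        |    *
  confisticate       |      0.7x     1.4 3562287 |    *        |    *
"""


def parse_row(line: str) -> Optional[FreqRow]:
    """Parse one main-word row; return None for headers and other rows."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 4:
        return None
    word, stats, normal, large = parts
    fields = stats.split()
    if len(fields) != 3:
        return None
    try:
        freq, newness, rank = float(fields[0]), float(fields[1]), int(fields[2])
    except ValueError:
        # "similar words" rows carry a relative value such as "0.7x"; skip them.
        return None
    return FreqRow(word, freq, newness, rank, normal.count("*"), large.count("*"))


def parse_table(text: str) -> List[FreqRow]:
    """Return all main-word rows found in the pasted lookup output."""
    rows = (parse_row(line) for line in text.splitlines())
    return [row for row in rows if row is not None]


if __name__ == "__main__":
    # List the words from rarest to most frequent.
    for row in sorted(parse_table(SAMPLE), key=lambda r: r.adj_freq):
        print(f"{row.word:<16} {row.adj_freq:.4f} per million, rank {row.rank}")
```

Sorting by adjusted frequency puts confusticate first, well below the other two words.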

ideasman42 commented 5 years ago

Shouldn't a dictionary contain all correct spellings (maybe with the exception of terms in esoteric domains)? A large physical dictionary generally contains words like these too.

Terms such as confusticate are used in novels (The Hobbit for example).

The thing is, words like these are spelled correctly but show up as misspelled, so I have to keep second-guessing my own spelling or maintain my own custom dictionary.

Said differently, if the words are in popular online dictionaries, why not include them?

biljir commented 5 years ago

Honestly, I think the meanings of polygonization are esoteric meanings in esoteric domains. On the other hand, the fact that Lexico (alias Oxford) lists confusticate is a point in its favor. (Though I do suspect that, unless one is writing dialog, it is mostly used in spoken English, where spelling checkers do you no good whatsoever.)

As to your larger point, no dictionary contains every utterance (or letter sequence) that might be considered a word, and not every dictionary calls itself "unabridged". For instance, I've never seen a dictionary which includes all the reasonable words starting with un-, because there are so many of them, and lexicographers have better things to do with their time than worry about words whose meanings are crystal clear.

I don't think a word list for use in spell-checking can be compared to a physical dictionary, anyway. Their purposes are different. For instance, dictionary writers never need to worry about whether one word resembles some other more common word. Kevin has to be careful not to introduce uncommon words which might be suggested as replacements for a misspelling of some more common word. But a dictionary editor might well include both words, to illustrate the differences in their meaning. Both decisions would arguably be correct, based on the differing uses of their "product".
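
To make that trade-off concrete, here is a toy checker in Python. It is a sketch under loud assumptions: the two tiny word lists are hypothetical stand-ins (not the real SCOWL lists), plain Levenshtein distance stands in for whatever a real spell checker uses to rank suggestions, and the function names are invented for this example. It reuses gimble from earlier in the thread, next to the more common gimbal.

```python
from typing import List, Optional, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def suggest(word: str, wordlist: Set[str], max_dist: int = 2) -> Optional[List[str]]:
    """Return None if the word is accepted, else nearby words from the list."""
    if word in wordlist:
        return None
    return sorted(w for w in wordlist if edit_distance(word, w) <= max_dist)


# Hypothetical miniature lists, for illustration only.
NORMAL = {"gimbal", "gamble"}
LARGE = NORMAL | {"gimble"}

if __name__ == "__main__":
    for typed in ("gimbl", "gimble"):
        print(typed, "normal:", suggest(typed, NORMAL), "large:", suggest(typed, LARGE))
```

With the smaller list, someone who meant gimbal but typed gimble gets a suggestion; with the larger list the same typo is silently accepted, and the rarer word also starts competing in the suggestions for gimbl. A printed dictionary has no such cost for listing both.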