codespell-project / codespell

check code for common misspellings

false positive: unparseable #3444

polluks closed 4 months ago

polluks commented 4 months ago

https://github.com/codespell-project/codespell/blob/3760e619e2a7b7fa8467debb2889d21a34651853/codespell_lib/data/dictionary.txt#L58550

Wiktionary says since 2008: https://en.wiktionary.org/wiki/unparseable

DimitriPapadopoulos commented 4 months ago

Yes, the difficulty is that I cannot find that word in most mainstream curated dictionaries, either as parsable or as parseable:

However, OED does have it, as parsable only:

I usually suggest we follow SCOWL (And Friends), as the mainstream open source spelling database, but unfortunately it lacks both parsable and parseable:

http://app.aspell.net/lookup?dict=en_US-large;words=unparsable
http://app.aspell.net/lookup?dict=en_US-large;words=unparseable

On the other hand, it has both writable and writeable.

The commercial spellchecker I have tried does suggest replacing parseable with parsable.

Searching for parsable vs parseable turns up pages that mostly (but not always) recommend parsable over parseable, for example:

Grammar books suggest that the final silent e of the root word is dropped, except if the root word ends in ce or ge. That usually remains a suggestion, but I feel parsable should be preferred in formal usage.
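
The rule can be sketched in a few lines of Python. This is a simplification rather than a full account of English spelling: the function name `add_able` is hypothetical, and treating a root ending in `ee` (agree, foresee) as having no silent e is an assumption added here, not something stated above. Irregular forms such as reducible are not handled.

```python
def add_able(root: str) -> str:
    """Append -able to a root word using the simplified silent-e rule."""
    # Keep the final e after c or g (noticeable, arrangeable).
    # Assumption: a final "ee" is not a silent e, so it is kept as well.
    keeps_e = root.endswith(("ce", "ge", "ee"))
    if root.endswith("e") and not keeps_e:
        # Drop the silent e: parse -> parsable, write -> writable.
        return root[:-1] + "able"
    return root + "able"


for word in ("parse", "write", "notice", "arrange", "agree", "read"):
    print(word, "->", add_able(word))
```

Under this rule parse gives parsable, while notice and arrange keep their e.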

The Google Ngram Viewer shows parsable remains more common than parseable, but the difference is not significant enough to rule out parseable: https://books.google.com/ngrams/graph?content=unparsable%2Cunparseable&year_start=1800

In any case, if we are to remove this entry, we should review other occurrences of -eable for consistency:

$ grep 'eable->' codespell_lib/data/dictionary.txt | wc -l
87
$

DimitriPapadopoulos commented 4 months ago

By the way, this entry suggests that writeable is the British spelling while writable is the American spelling:

$ grep 'eable->' codespell_lib/data/dictionary_en-GB_to_en-US.txt 
writeable->writable
$ 

Total nonsense:

https://books.google.com/ngrams/graph?content=writable%2Cwriteable&year_start=1900&year_end=2019&corpus=en-GB-2019
https://books.google.com/ngrams/graph?content=writable%2Cwriteable&year_start=1900&year_end=2019&corpus=en-US-2019

See https://github.com/DimitriPapadopoulos/codespell/commit/5f792d8b37781f75645a70115a9d21f2ce70cb7d.

DimitriPapadopoulos commented 4 months ago

Time permitting, I would like to review the 89 -eable entries found in the codespell dictionaries and look each of them up in SCOWL. If SCOWL usually accepts both variants, I am happy to discard all such entries. Help welcome!

-eable entries in codespell dictionaries

```console
$ grep 'eable->' codespell_lib/data/dictionary*.txt
codespell_lib/data/dictionary_code.txt:cloneable->clonable
codespell_lib/data/dictionary.txt:acchieveable->achievable
codespell_lib/data/dictionary.txt:achiveable->achievable
codespell_lib/data/dictionary.txt:adviseable->advisable
codespell_lib/data/dictionary.txt:aggreeable->agreeable
codespell_lib/data/dictionary.txt:agreable->agreeable
codespell_lib/data/dictionary.txt:agreeeable->agreeable
codespell_lib/data/dictionary.txt:arrangteable->arrangeable
codespell_lib/data/dictionary.txt:availeable->available
codespell_lib/data/dictionary.txt:beable->be able
codespell_lib/data/dictionary.txt:beliveable->believable
codespell_lib/data/dictionary.txt:browseable->browsable
codespell_lib/data/dictionary.txt:centrifugeable->centrifugable
codespell_lib/data/dictionary.txt:compareable->comparable
codespell_lib/data/dictionary.txt:compileable->compilable
codespell_lib/data/dictionary.txt:configureable->configurable
codespell_lib/data/dictionary.txt:createable->creatable
codespell_lib/data/dictionary.txt:customizeable->customizable
codespell_lib/data/dictionary.txt:debateable->debatable
codespell_lib/data/dictionary.txt:decideable->decidable
codespell_lib/data/dictionary.txt:defineable->definable
codespell_lib/data/dictionary.txt:deleteable->deletable
codespell_lib/data/dictionary.txt:derageable->dirigible
codespell_lib/data/dictionary.txt:desireable->desirable
codespell_lib/data/dictionary.txt:deviceremoveable->deviceremovable
codespell_lib/data/dictionary.txt:eforceable->enforceable
codespell_lib/data/dictionary.txt:eneable->enable
codespell_lib/data/dictionary.txt:escapeable->escapable
codespell_lib/data/dictionary.txt:executeable->executable
codespell_lib/data/dictionary.txt:forseeable->foreseeable
codespell_lib/data/dictionary.txt:indefineable->undefinable
codespell_lib/data/dictionary.txt:interopeable->interoperable
codespell_lib/data/dictionary.txt:joineable->joinable
codespell_lib/data/dictionary.txt:knoledgeable->knowledgeable
codespell_lib/data/dictionary.txt:knowladgeable->knowledgeable
codespell_lib/data/dictionary.txt:knowlageable->knowledgeable
codespell_lib/data/dictionary.txt:knowlegdeable->knowledgeable
codespell_lib/data/dictionary.txt:knowlegeable->knowledgeable
codespell_lib/data/dictionary.txt:knownledgeable->knowledgeable
codespell_lib/data/dictionary.txt:knwoledgeable->knowledgeable
codespell_lib/data/dictionary.txt:konwledgeable->knowledgeable
codespell_lib/data/dictionary.txt:kowledgeable->knowledgeable
codespell_lib/data/dictionary.txt:kwnoledgeable->knowledgeable
codespell_lib/data/dictionary.txt:kwoledgeable->knowledgeable
codespell_lib/data/dictionary.txt:measureable->measurable
codespell_lib/data/dictionary.txt:migrateable->migratable
codespell_lib/data/dictionary.txt:noteable->notable
codespell_lib/data/dictionary.txt:overrideable->overridable
codespell_lib/data/dictionary.txt:overwriteable->overwritable
codespell_lib/data/dictionary.txt:prefereable->preferable
codespell_lib/data/dictionary.txt:re-useable->reusable
codespell_lib/data/dictionary.txt:reacheable->reachable
codespell_lib/data/dictionary.txt:readeable->readable
codespell_lib/data/dictionary.txt:rearrangteable->rearrangeable
codespell_lib/data/dictionary.txt:recognizeable->recognizable
codespell_lib/data/dictionary.txt:redeable->readable
codespell_lib/data/dictionary.txt:redistributeable->redistributable
codespell_lib/data/dictionary.txt:reduceable->reducible
codespell_lib/data/dictionary.txt:relocateable->relocatable
codespell_lib/data/dictionary.txt:removeable->removable
codespell_lib/data/dictionary.txt:restoreable->restorable
codespell_lib/data/dictionary.txt:reuseable->reusable
codespell_lib/data/dictionary.txt:rotateable->rotatable
codespell_lib/data/dictionary.txt:scaleable->scalable
codespell_lib/data/dictionary.txt:searcheable->searchable
codespell_lib/data/dictionary.txt:solveable->solvable
codespell_lib/data/dictionary.txt:storeable->storable
codespell_lib/data/dictionary.txt:suiteable->suitable
codespell_lib/data/dictionary.txt:suposeable->supposable
codespell_lib/data/dictionary.txt:supposeable->supposable
codespell_lib/data/dictionary.txt:unbeliveable->unbelievable
codespell_lib/data/dictionary.txt:undecideable->undecidable
codespell_lib/data/dictionary.txt:undesireable->undesirable
codespell_lib/data/dictionary.txt:uneforceable->unenforceable
codespell_lib/data/dictionary.txt:unforgiveable->unforgivable
codespell_lib/data/dictionary.txt:ungeneralizeable->ungeneralizable
codespell_lib/data/dictionary.txt:unoticeable->unnoticeable
codespell_lib/data/dictionary.txt:unparseable->unparsable
codespell_lib/data/dictionary.txt:unreacheable->unreachable
codespell_lib/data/dictionary.txt:untranslateable->untranslatable
codespell_lib/data/dictionary.txt:unuseable->unusable
codespell_lib/data/dictionary.txt:uptadeable->updatable
codespell_lib/data/dictionary.txt:useable->usable
codespell_lib/data/dictionary.txt:valueable->valuable
codespell_lib/data/dictionary.txt:varieable->variable
codespell_lib/data/dictionary.txt:visibleable->visible
codespell_lib/data/dictionary.txt:vulneable->vulnerable
codespell_lib/data/dictionary.txt:vulnreable->vulnerable
codespell_lib/data/dictionary.txt:writeable->writable
$
```

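
A first pass over this list could be automated before any SCOWL lookup. The sketch below is a rough filter, not part of codespell: the helper names (`parse_entry`, `differs_only_by_e`, `triage`) are hypothetical, and the input format is assumed to be the `path:wrong->right` grep output shown above. It separates entries where the flagged word differs from the correction only by an extra e before -able (the unparseable/unparsable kind) from entries that are misspelled in some other way.

```python
def parse_entry(line: str) -> tuple[str, str]:
    """Split one 'path:wrong->right' grep output line into (wrong, right)."""
    _path, _, entry = line.strip().partition(":")
    wrong, _, right = entry.partition("->")
    return wrong, right


def differs_only_by_e(wrong: str, right: str) -> bool:
    """True if removing the e before -able in `wrong` yields `right`."""
    return wrong.endswith("eable") and wrong[:-5] + "able" == right


def triage(lines):
    """Return (candidates, others) as lists of (wrong, right) pairs."""
    candidates, others = [], []
    for line in lines:
        wrong, right = parse_entry(line)
        bucket = candidates if differs_only_by_e(wrong, right) else others
        bucket.append((wrong, right))
    return candidates, others


# A few lines copied from the grep output above.
SAMPLE = [
    "codespell_lib/data/dictionary.txt:unparseable->unparsable",
    "codespell_lib/data/dictionary.txt:writeable->writable",
    "codespell_lib/data/dictionary.txt:agreable->agreeable",
    "codespell_lib/data/dictionary.txt:beable->be able",
]
candidates, others = triage(SAMPLE)
print("candidates:", candidates)
print("others:", others)
```

The candidates still need manual review: a purely string-based test cannot tell whether the root word ends in e, so readeable->readable passes the filter even though read has no final e.
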
polluks commented 4 months ago

Please take a look https://english.stackexchange.com/a/623194/273648

DimitriPapadopoulos commented 4 months ago

Do you suggest we keep the dictionary as is?

The above is a good summary of what I have documented previously. However, many native speakers dislike grammar rules, or at least do not follow them blindly: they insist on allowing alternative spellings, even when targeting a wider international public.

polluks commented 4 months ago

ok