Closed tvquizphd closed 3 years ago
In a similar vein, these misspellings of Tinpot
result in an unexpected capitalized "O":
1:1-1:6 warning `Tinpb` is misspelt; did you mean `TinpOt` ... tinpb retext-spell
1:1-1:6 warning `Tinpc` is misspelt; did you mean `TinpOt` ... tinpc retext-spell
1:1-1:6 warning `Tinpd` is misspelt; did you mean `TinpOt` ... tinpd retext-spell
1:1-1:6 warning `Tinpf` is misspelt; did you mean `TinpOt` ... tinpf retext-spell
1:1-1:6 warning `Tinph` is misspelt; did you mean `TinpOt` ... tinph retext-spell
1:1-1:6 warning `Tinpj` is misspelt; did you mean `TinpOt` ... tinpj retext-spell
1:1-1:6 warning `Tinpl` is misspelt; did you mean `TinpOt` ... tinpl retext-spell
1:1-1:6 warning `Tinpm` is misspelt; did you mean `TinpOt` ... tinpm retext-spell
1:1-1:6 warning `Tinpq` is misspelt; did you mean `TinpOt` ... tinpq retext-spell
1:1-1:6 warning `Tinpv` is misspelt; did you mean `TinpOt` ... tinpv retext-spell
1:1-1:6 warning `Tinpx` is misspelt; did you mean `TinpOt` ... tinpx retext-spell
On the other hand, these misspellings of Tinpot
suggest the correct capitalization:
1:1-1:6 warning `Tinpo` is misspelt; did you mean `Tinpot`? tinpo retext-spell
1:1-1:6 warning `Tinpp` is misspelt; did you mean `Tinpot` ... tinpp retext-spell
1:1-1:6 warning `Tinpu` is misspelt; did you mean `Tinpot` ... tinpu retext-spell
1:1-1:6 warning `Tinpz` is misspelt; did you mean `Tinpot` ... tinpz retext-spell
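To sort warnings like the ones above into the two groups automatically, the reason text can be parsed and the first suggestion checked for a capital letter after its first character. This is only a sketch: `inspectReason` is a hypothetical helper (not part of retext-spell or nspell), and it assumes the misspelling itself is capitalized only in its first letter.

```javascript
// Matches the reason text shown in the warnings above, e.g.
// "`Tinpb` is misspelt; did you mean `TinpOt`".
const REASON = /`([^`]+)` is misspelt; did you mean `([^`]+)`/;

// Hypothetical helper: extract the misspelling and the first suggestion,
// then flag a suggestion that has an uppercase letter after position 0.
function inspectReason(reason) {
  const match = REASON.exec(reason);
  if (!match) return null; // not a spelling warning in the expected shape
  const [, actual, suggestion] = match;
  const oddCapital = /[A-Z]/.test(suggestion.slice(1));
  return { actual, suggestion, oddCapital };
}

console.log(inspectReason("`Tinpb` is misspelt; did you mean `TinpOt`"));
console.log(inspectReason("`Tinpo` is misspelt; did you mean `Tinpot`?"));
```

The first call reports `oddCapital: true` and the second `oddCapital: false`, mirroring the two lists above.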
This problem seems to be broader than initially recognized. I've discovered 190 additional 5-letter words suggested by retext-spell that include a single unexpected capital letter.
I've included a JSON file in the gist with one key per result returned by retext-spell with a single unexpected capital letter. Each key lists the misspellings that produce that result. Each misspelling derives from replacing the middle character in a 5-letter dictionary word.
I've counted 39 unexpected uppercase "R"s, 36 unexpected "C"s, 34 unexpected "B"s, 27 unexpected "T"s, 15 unexpected "E"s, and ten or fewer each of unexpected "M"s, "V"s, "O"s, "W"s, "Y"s, "N"s, "P"s, and "U"s.
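The derivation and the per-letter count described above can be sketched roughly as follows. The names `middleVariants` and `tallyCapitals` are hypothetical, and the sample suggestions are the few quoted in this thread, not the 190 words recorded in the gist's JSON file.

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

// Replace the middle character of a word with every other lowercase letter.
// For a 5-letter word the middle character is at index 2.
function middleVariants(word) {
  const mid = Math.floor(word.length / 2);
  return [...ALPHABET]
    .filter((ch) => ch !== word[mid].toLowerCase())
    .map((ch) => word.slice(0, mid) + ch + word.slice(mid + 1));
}

// Count the capital letters found after the first character of each suggestion.
function tallyCapitals(suggestions) {
  const counts = {};
  for (const suggestion of suggestions) {
    for (const ch of suggestion.slice(1)) {
      if (ch !== ch.toLowerCase()) counts[ch] = (counts[ch] || 0) + 1;
    }
  }
  return counts;
}

console.log(middleVariants("Tribe").length); // 25
console.log(tallyCapitals(["TinpOt", "ThreE", "TepeE"])); // { O: 1, E: 2 }
```

In practice each variant would be passed through retext-spell and the top suggestions fed to the tally; that step is omitted here because it depends on the installed package versions.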
TLDR: see nspell issue 37 and nspell issue 41.
Closing as the nspell PRs are released, and I’m assuming they fixed this!
TLDR: see PR 38 and PR 39 that I've opened against nspell.
Subject of the issue
Note: I've changed the name of this issue from "Mysterious capital E returned for misspelled 5-letter nouns with a single capital T" to "Unexpected capital letters returned for certain capitalized misspellings," and I've edited this post slightly to reflect the broader scope.
Background
I originally found this error for capitalized variants of 16 dictionary-en words: "tepee", "thane", "thole", "three", "throe", "tilde", "tinge", "tonne", "toque", "tribe", "trike", "trope", "trove", "truce", "tuque", and "twine". To give one notable example, any misspelling matching this regular expression
/^Thre[f-ln-racuvxyz]$/
is corrected to "ThreE" instead of "Three".
Edit: The algorithm below produces misspellings of the original 16 dictionary words, but I have since found 190 additional 5-letter words that occasionally occur in retext-spell vfile messages with extraneous capital letters. I have saved these new words and the list of misspellings needed to generate them in a JSON file bundled with the gist for this issue.
Generating examples
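As a first illustration, the regular expression quoted in the Background section can be expanded into the concrete misspellings it covers. This sketch only enumerates strings; it does not call retext-spell.

```javascript
// The pattern quoted in the Background section.
const pattern = /^Thre[f-ln-racuvxyz]$/;

// Try every lowercase letter as the final character of "Thre_".
const candidates = [..."abcdefghijklmnopqrstuvwxyz"].map((ch) => "Thre" + ch);
const matching = candidates.filter((word) => pattern.test(word));
const skipped = candidates.filter((word) => !pattern.test(word));

console.log(matching.length); // 19
console.log(skipped.join(" ")); // Threb Thred Three Threm Thres Thret Threw
```

The skipped candidates include the correctly spelled "Three" itself and "Thret", which ends in a lowercase "t".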
The gist to reproduce this issue tests misspellings generated from a dictionary-en word starting with "T" and ending in "e", marked `/MS` in `index.dic`. `Torte` is a special case (due to the second "t").
If the misspellings do not match a different dictionary word more closely than the originally selected 5-letter word, then the first "expected" value in the vfile message emitted by retext-spell will be the originally selected 5-letter word with its final "e" mistakenly capitalized as "E".
Edit: Without getting into the details of nspell's keyboard groups, there is no easy way to generate the 190 newly discovered 5-letter words that do not match the misspellings generated with the above method.
Your environment
Steps to reproduce
I've created a gist.
Execute the following commands to download the gist and install dependencies:
Run one of the following commands to test with various suffixes:
`npm run test "*"` (for no suffix)
`npm run test "*s"` (for the plural)
`npm run test "*'s"` (for the possessive)
Side note: In contrast to the examples that produce the bug described in this issue, you can run `npm run test "t"`, `npm run test "ts"`, and `npm run test "t's"` to see the results of misspellings that fail to produce the bug due to the presence of a lowercase "t" in the misspelling.
Expected behavior
All the logged vfile message reasons should show suggested values without unusual capitalization. The hundreds of misspellings tested with `npm run test "*"`, `npm run test "*s"`, and `npm run test "*'s"` should generate suggested values with lowercase "e" characters. For example, the first tested misspelling `Tepea` should generate a top suggested value of "Tepee". The plural `Tepeas` should generate a top suggested value of "Tepees". The possessive `Tepea's` should generate a top suggested value of "Tepee's".
Actual behavior
The hundreds of misspellings tested with `npm run test "*"`, `npm run test "*s"`, and `npm run test "*'s"` all generate suggested values with uppercase "E" characters. For example, the first tested misspelling `Tepea` generates a top suggested value of "TepeE". The plural `Tepeas` generates a top suggested value of "TepeEs". The possessive `Tepea's` generates a top suggested value of "TepeE's".
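The relationship between a tested misspelling, its expected suggestion, and the reported suggestion can be written down compactly. This is a sketch under an assumption: that the `*` in a test pattern stands for the letter replacing the final "e" of the dictionary word and that the rest of the pattern is a suffix. The function names are hypothetical and do not come from the gist.

```javascript
// Assumption: the final "e" of the dictionary word is replaced by one
// letter, and the suffix ("", "s", or "'s") is appended afterwards.
function misspell(word, replacement, suffix) {
  return word.slice(0, -1) + replacement + suffix;
}

// Expected top suggestion: the original word, casing untouched, plus suffix.
function expectedSuggestion(word, suffix) {
  return word + suffix;
}

// Reported top suggestion: the final "e" comes back as an uppercase "E".
function reportedSuggestion(word, suffix) {
  return word.slice(0, -1) + "E" + suffix;
}

for (const suffix of ["", "s", "'s"]) {
  console.log(
    misspell("Tepee", "a", suffix),
    expectedSuggestion("Tepee", suffix),
    reportedSuggestion("Tepee", suffix)
  );
}
```

For "Tepee" this prints the three triples described above: Tepea / Tepee / TepeE, Tepeas / Tepees / TepeEs, and Tepea's / Tepee's / TepeE's.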