Closed gaborcsardi closed 5 years ago
Unfortunately, the apostrophe (fancy or not) is explicitly not ignored in English hunspell dictionaries because it is needed for words like it's or let's:
hunspell::hunspell("It’s a beautiful day")
It is actually the only custom wordchar allowed in English words (everything else gets ignored):
> hunspell::dictionary('en_US')
<hunspell dictionary>
affix: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/hunspell/dict/en_US.aff
dictionary: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/hunspell/dict/en_US.dic
encoding: UTF-8
wordchars: ’
added: 0 custom words
So there is no easy way to do this in hunspell, but perhaps we could manually strip them in spelling when parsing the AST, as we do for the heading identifiers:
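Independently of how spelling handles this internally, the stripping idea can be sketched in base R. This is only an illustration: `strip_fancy_quotes()` is a hypothetical helper, not a function from hunspell or spelling, and it assumes that a right single quote (U+2019) followed by a letter is a real apostrophe while any other curly single quote is a quotation mark.

```r
# Hypothetical helper (not part of hunspell or spelling): remove curly single
# quotes that wrap a word, but keep apostrophes inside a word such as "it’s".
strip_fancy_quotes <- function(text) {
  # The opening quote U+2018 is never an apostrophe, so drop it everywhere.
  text <- gsub("\u2018", "", text, fixed = TRUE)
  # Drop U+2019 unless it is directly followed by a letter (as in "it’s").
  gsub("\u2019(?![[:alpha:]])", "", text, perl = TRUE)
}

cleaned <- strip_fancy_quotes("It\u2019s a \u2018beautiful\u2019 day")

# Optionally pass the cleaned text to hunspell if the package is installed.
if (requireNamespace("hunspell", quietly = TRUE)) {
  print(hunspell::hunspell(cleaned))
}
```

One limitation of this rule is that it also removes the apostrophe of a plural possessive (for example dogs’), since nothing follows it inside the word. Because hunspell ignores a trailing straight or curly quote differently depending on the dictionary, treat that case as unhandled in this sketch.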
OK, that makes sense. Apparently Unicode prefers \u2019 for the apostrophe, and pandoc uses it for quoted words as well, so this might be a pandoc error, although I am not sure what pandoc could use instead. Maybe I should just exclude README.md from spellchecking; it is a generated file anyway.
I have this in an Rmd:
which will have fancy quotes in the README.md, courtesy of pandoc I guess:
and then spellcheck reports:
I wonder if it would be easy to ignore the fancy quotes? TBH I am not sure why they are considered to be part of the word.