pawamoy / mkdocs-spellcheck

A spell checker plugin for MkDocs.
https://pawamoy.github.io/mkdocs-spellcheck/
ISC License

bug: Contractions are incorrectly reported as typos #22

Open nfelt14 opened 2 months ago

nfelt14 commented 2 months ago

Description of the bug

Basic contractions such as "doesn't" or "couldn't" are reported as typos because the checker splits words on the apostrophe (single quote) character.

To Reproduce

  1. Create a document with the word "doesn't" in it
  2. Build the docs with the symspellpy backend

Full traceback

WARNING -  mkdocs_spellcheck: (symspellpy) index.md: Misspelled 'doesn', did you mean 'does'?
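The reported fragment 'doesn' is what you would get from a tokenizer that treats the apostrophe as a word boundary. The snippet below is only a minimal sketch of that behavior, not the plugin's actual code: the function name `naive_words` and the `\w+` pattern are assumptions made for illustration.

```python
import re


def naive_words(text: str) -> list[str]:
    """Hypothetical tokenizer: any non-word character, including ', ends a word."""
    return re.findall(r"\w+", text)


# The contraction is cut in two, leaving the fragment 'doesn' to be checked.
print(naive_words("It doesn't work"))  # ['It', 'doesn', 't', 'work']
```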

Expected behavior

The contraction should not be flagged as a typo.

Environment information

$ python -m mkdocs_spellcheck.debug
- __System__: Windows-10-10.0.19045-SP0
- __Python__: cpython 3.11.8
- __Environment variables__:
- __Installed packages__:
  - `mkdocs-spellcheck` v1.0.3

pawamoy commented 2 months ago

Yep, I'm aware of this limitation. There's no easy solution though. Each language has its own peculiarities. We could rely on natural language processing libraries to correctly split text. Not sure how easy it would be :) Happy to see more suggestions and/or review PRs that address this!
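One possible direction that stops short of a full NLP library is a word pattern that keeps an apostrophe when it sits between two letters. This is a rough sketch under stated assumptions, not a proposal for the plugin's real internals: `WORD_RE` and `words_keeping_contractions` are hypothetical names, and the rule only targets English-style contractions, so other languages would likely need different handling.

```python
import re

# Runs of letters, optionally joined by a straight (') or curly (\u2019)
# apostrophe that is immediately followed by more letters.
# [^\W\d_] matches any Unicode letter (word characters minus digits and "_").
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")


def words_keeping_contractions(text: str) -> list[str]:
    """Hypothetical tokenizer that keeps contractions such as "doesn't" whole."""
    return WORD_RE.findall(text)


# Quote marks around a word are still dropped, because they are not
# followed by letters belonging to the same word.
print(words_keeping_contractions("It doesn't work, 'quoted' words"))
```

Whether the contraction then passes the check still depends on the backend's dictionary containing it, which this sketch does not address.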