codespell-project / codespell

check code for common misspellings
GNU General Public License v2.0

Bug? `ignore-words-list` seems to have no effect #2375

Open joshgoebel opened 2 years ago

joshgoebel commented 2 years ago

Am I doing something wrong? How can I get it to stop flagging MSDOS as an error? ignore-words-list seems to have no effect.

> codespell
./src/keyszer/models/key.py:161: MSDOS ==> MS-DOS

My config:

[codespell]
ignore-words-list = MSDOS
skip = .venv,.git,build,keyszer.egg-info

peternewman commented 2 years ago

Like so:

[codespell]
ignore-words-list = msdos
skip = .venv,.git,build,keyszer.egg-info

Which admittedly doesn't match our docs: https://github.com/codespell-project/codespell/#usage

Based on: https://github.com/codespell-project/codespell/blob/978a22315238749224c5659b95f5fab145252a0f/codespell_lib/data/dictionary.txt#L20494

joshgoebel commented 2 years ago

That's it - fixes the issue, but yeah, the way I read the docs was that I should try to match case...

Perhaps we need a docfix?

peternewman commented 2 years ago

https://github.com/codespell-project/codespell/blob/1d9dee38735a173b4827b67a4f13987290a03475/codespell_lib/_codespell.py#L456-L462

I think the issue may be that MSDOS is one of the few entries which is upper case in the dictionary, hence flagging this issue.

mwestphal commented 2 years ago

Looks like there is a case issue:

nNumber ==> number
ignore-words-list=nNumber

Not excluded

ignore-words-list=nnumber

excluded successfully.

peternewman commented 2 years ago

> Looks like there is a case issue:

Yeah our comparison to match the dictionary from the ignore list is happening with different case requirements to what we claim in the docs.
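The mismatch described above can be modelled with a small toy sketch. This is not codespell's actual code: the miniature dictionary `MISSPELLINGS` and the function `flagged` are hypothetical, and the sketch assumes that words are looked up by their lowercased form while ignore-list entries are compared without any case normalisation.

```python
# Toy model of the behaviour reported in this thread; NOT codespell's real code.
# Assumption: dictionary lookups use the lowercased word, while the ignore
# list is compared as written, without normalising case.

# Hypothetical miniature dictionary, keyed in lowercase.
MISSPELLINGS = {"msdos": "MS-DOS", "nnumber": "number", "aks": "ASK"}


def flagged(word, ignore_words):
    """Return the suggested fix, or None if the word is not reported."""
    key = word.lower()
    if key not in MISSPELLINGS:
        return None
    # The ignore list is matched against the lowercased key, so an
    # entry written as "MSDOS" never matches it.
    if key in ignore_words:
        return None
    return MISSPELLINGS[key]


print(flagged("MSDOS", {"MSDOS"}))      # still reported
print(flagged("MSDOS", {"msdos"}))      # ignored
print(flagged("nNumber", {"nNumber"}))  # still reported
print(flagged("nNumber", {"nnumber"}))  # ignored
```

Under that assumption, only the lowercase spelling of an ignore entry ever takes effect, which matches the MSDOS and nNumber reports in this thread.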

yarikoptic commented 1 year ago

FWIW just ran into this issue and was about to complain as well -- would be great to have it fixed, so as to conform to the promise of being case sensitive. My use case:

❯ codespell --help | grep -A3 -e ' -L'
  -L WORDS, --ignore-words-list WORDS
                        comma separated list of words to be ignored by
                        codespell. Words are case sensitive based on how they
                        are written in the dictionary file
❯ cat spellme.txt
Ingress for Microsoft Azure Kubernetes Service (AKS)
❯ codespell -L AKS spellme.txt
spellme.txt:1: AKS ==> ASK
❯ codespell --version
2.2.2

so I would rather not give up entirely on fixing aks into ask ;)

iamibi commented 1 year ago

I am facing an issue similar to the one described here. When I use ignore-words-list in pyproject.toml to ignore the word "OT", codespell does not respect it. My full codespell configuration:

[tool.codespell]
quiet-level = 0
skip = "poetry.lock"
ignore-words-list = "OT"

Any suggestions?

yarikoptic commented 1 year ago

I sometimes add those to ignore-regex instead, with word boundaries, like ignore-regex = "\bOT\b", if I still want the regular lowercase ot to be detected as a typo.
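To see what that pattern does and does not match, here is a small sketch using Python's `re` module. The helper `mask_ignored` is hypothetical, and blanking out the matches is only an illustration; how codespell applies ignore-regex internally is an assumption here.

```python
import re

# Pattern from the comment above: "OT" as a whole word, uppercase only.
IGNORE = re.compile(r"\bOT\b")


def mask_ignored(line):
    """Blank out spans matched by IGNORE so a later check would skip them."""
    return IGNORE.sub(lambda m: " " * len(m.group()), line)


line = "OT security is ot the same as IT security"
masked = mask_ignored(line)
print(masked)  # uppercase OT is blanked out; lowercase ot is left for checking
```

One caveat when this goes into pyproject.toml: in a TOML double-quoted string, `\b` is the escape for a backspace character, so a single-quoted literal string such as '\bOT\b' is likely what is wanted there.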

DimitriPapadopoulos commented 1 year ago

@iamibi As already explained:

ignore-words-list = "ot"

yarikoptic commented 1 year ago

that would also skip lower case ot and I thought that the goal here is to just skip upper cased OT only.

perillo commented 11 months ago

I'm having a similar problem when checking Zig source files. codespell refuses to ignore ACCES and WRONLY.

DimitriPapadopoulos commented 11 months ago

@perillo What's your question?

perillo commented 11 months ago

> @perillo What's your question?

Looking at the comments, I assumed that using lowercase for ignore words was a temporary fix. However https://github.com/codespell-project/codespell/#ignoring-words documents that this is the expected behavior.

Thanks.

nkozheninEvinced commented 11 months ago

I had an issue with pre-commit: it stashed the unstaged changes in pyproject.toml and only then ran codespell, so the changes I had made there were not picked up. Try running git add pyproject.toml and then pre-commit run -a.

DimitriPapadopoulos commented 11 months ago

@nkozheninEvinced How is your problem related to this issue?

MercuryDemo commented 10 months ago

I see that this issue has not been closed yet, so even now, codespell still only supports lowercase entries in ignore-words-list and cannot ignore uppercase words?