crate-ci / typos

Source code spell checker
Apache License 2.0

[wishlist] config/cli option to specify words min. length #1079

Open h-mathias opened 2 months ago

h-mathias commented 2 months ago

This can currently be solved by adding ^[a-zA-Z]{1,3}$ to extend-ignore-words-re, but that is less efficient than just comparing the length of the word.
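To make the comparison concrete, here is a minimal Python sketch (not typos code) that puts the regex workaround next to a plain length check. The function names and the `min_length` parameter are hypothetical, chosen only for illustration; typos does not currently have such an option.

```python
import re

# The pattern quoted above: a whole word made of 1 to 3 ASCII letters.
SHORT_WORD_RE = re.compile(r"[a-zA-Z]{1,3}")


def ignored_by_regex(word: str) -> bool:
    # fullmatch anchors at both ends, like ^...$ in the quoted pattern.
    return SHORT_WORD_RE.fullmatch(word) is not None


def ignored_by_length(word: str, min_length: int = 4) -> bool:
    # Hypothetical "minimum word length" rule: skip anything shorter.
    return len(word) < min_length


words = ["ba", "fo", "seh", "mis", "ue", "nd", "teh", "recieve"]
for w in words:
    print(w, ignored_by_regex(w), ignored_by_length(w))
```

For purely alphabetic words the two checks agree. They differ on tokens containing other characters: a length check would also skip something like a1, while the regex only matches ASCII letters.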

epage commented 2 months ago

If we did this, it would be config only. We intentionally limit what capabilities we provide on the CLI as configuration like this is intended to be project configuration rather than run configuration.

Could you go into more detail on the problem you are running into with short words? In particular, why is a short-word filter not enough, so that you need something more efficient than what extend-ignore-words-re provides?

h-mathias commented 2 months ago

It is mainly about words with 2 or 3 characters, which are often abbreviations or acronyms. Some false-positive findings: ba, fo, seh, mis, ue, nd. cspell, for example, has minWordLength. It would be a convenience option, but as I said, it also works with extend-ignore-words-re, so feel free to close this issue.

epage commented 2 months ago

From cspell's docs

minWordLength - defaults to 4 - the minimum length of a word before it is checked.

I'm not quickly finding when that was introduced, so I can't see all of the motivation. I'm unsure what it is about the code bases I work on that this hasn't really been a problem, which biases me towards leaving this to users via regex.

cmdcolin commented 1 month ago

random personal feeling: i generally find longer words have a much better signal-to-noise ratio

rather than a threshold, i have considered making a user interface that consumes the output of typos-cli to sort it long to short, and even make a TUI to approve a fix or "add to ignorelist". just dreaming though:)
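A rough sketch of the sorting half of that idea, in Python. It assumes the findings arrive as one JSON object per line, as produced by typos --format json, and that each finding has a "type" of "typo" plus a "typo" field holding the misspelled word. Those field names and the sample records below are assumptions made for illustration and should be checked against real output.

```python
import json


def sort_typos_longest_first(json_lines):
    """Return typo findings ordered from longest to shortest misspelling."""
    findings = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed schema: skip anything that is not a typo finding.
        if record.get("type") != "typo":
            continue
        findings.append(record)
    # sorted() is stable, so findings of equal length keep their input order.
    return sorted(findings, key=lambda r: len(r["typo"]), reverse=True)


# Hypothetical sample records, not real typos output.
SAMPLE = [
    '{"type": "typo", "path": "a.rs", "line_num": 3, "typo": "nd", "corrections": ["and"]}',
    '{"type": "typo", "path": "b.rs", "line_num": 9, "typo": "recieve", "corrections": ["receive"]}',
    '{"type": "typo", "path": "c.rs", "line_num": 1, "typo": "seh", "corrections": ["she"]}',
]

for finding in sort_typos_longest_first(SAMPLE):
    print(finding["path"], finding["typo"], finding["corrections"])
```

In practice the function could be fed sys.stdin from a pipe such as typos --format json | python sort_typos.py, where sort_typos.py is a hypothetical script name.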