crate-ci / typos

Source code spell checker
Apache License 2.0

Regular expression typo false positive #642

Open pdostal opened 1 year ago

pdostal commented 1 year ago

Hello,

I have this case:

error: `ot` should be `to`, `of`, `or`
  --> ./lib/qam.pm:71:44
   |
71 |     if ($patch_status =~ /Status\s*:\s+[nN]ot\s[nN]eeded/) {
   |                                            ^^

Would it be possible to filter those out?

epage commented 1 year ago

Similar to #643, our main two routes are

epage commented 1 year ago

FYI #695 provides a new workaround for false positives

kdeldycke commented 7 months ago

To illustrate this issue with another test case, here is a false positive from a regular expression in a Markdown document (as reported by typos-cli 1.20.8):

error: `ba` should be `by`, `be`
  --> ./content/2011/postgresql-commands.md:47:95
   |
47 |   $ psql --tuples-only --no-align -d database_id -c "SELECT id FROM res_users;" | sed ':a;N;$!ba;s/\n/ /g'
   |                                                                                               ^^
   |

In my case, the fix was to add the following configuration to my pyproject.toml:

[tool.typos]
default.extend-ignore-identifiers-re = [
    "ba",
]

epage commented 7 months ago

Personally, I would recommend using default.extend-ignore-re to look for the pattern of your style of regexes and avoid checking them completely, rather than playing whack-a-mole with specific identifiers within a regex.

When handling individual identifiers, I would instead recommend using

[default.extend-identifiers]
ba = "ba"
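
A minimal sketch of the first suggestion, assuming a typos.toml (or _typos.toml) config file and Perl-style `=~ /.../` matches like the one in the original report. The pattern is illustrative and was not given in this thread: it skips everything from `=~` to the closing slash on the same line, and it does not handle escaped slashes (`\/`) inside the regex.

```toml
[default]
extend-ignore-re = [
    # Hypothetical pattern: ignore `=~ /.../` regex matches on a single line.
    # TOML literal string (single quotes), so backslashes reach the regex engine unchanged.
    '=~\s*/[^/\n]+/',
]
```

With this in place, the whole `/Status\s*:\s+[nN]ot\s[nN]eeded/` span should be skipped, so `ot` is never checked on its own.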

kdeldycke commented 7 months ago

Personally, I would recommend using default.extend-ignore-re

Ah yes, thanks @epage for the clarification. That is indeed better. It just took me a while to understand the different stages of the typos tokenization process and how its parameters influence them.

In the end I fine-tuned typos with the following config:

[tool.typos]
default.extend-ignore-re = [
    "!ba;",
]
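
The entry above ignores only the exact text `!ba;`. A broader variant, sketched here as an assumption rather than something proposed in this thread, would skip any single-quoted sed script that sits on one line:

```toml
[tool.typos]
default.extend-ignore-re = [
    # Hypothetical pattern: ignore `sed '...'` invocations on a single line.
    # Basic TOML string (double quotes), so each regex backslash is doubled.
    "sed\\s+'[^'\\n]*'",
]
```

This trades precision for coverage: genuine typos inside sed scripts would no longer be reported.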