tbroadley / spellchecker-cli

A command-line tool for spellchecking files.
MIT License

Help ignoring html tags like <li> <lo> <ul> #46

Closed jimmycasey closed 4 years ago

jimmycasey commented 4 years ago

Hi folks, is it possible to ignore all HTML tags? I have tried various syntaxes but had no luck:

  spellchecker --files '**/*.xml' -i ".*<.*>.*"
  spellchecker --files '**/*.xml' -i <.*>
  spellchecker --files '**/*.xml' -i '<.*>'
  spellchecker --files '**/*.xml' -i '\<.*\>'
  spellchecker --files '**/*.xml' -i '/<.*>/'

I am still getting the results below:

  5:6-5:8   warning  `ul` is misspelt; did you mean `kl`, `ml`, `ult`, `UL`, `bl`, `cl`, `fl`, `l`, `ll`, `pl`, `U`, `Uh`, `Um`, `Up`, `Ur`, `Us`, `Ut`, `Al`, `IL`, `JUL`, `LU`, `Tl`, `UK`, `UN`, `URL`, `UV`, `XL`?  retext-spell  retext-spell
  6:8-6:10  warning  `li` is misspelt; did you mean `lii`, `mi`, `oi`, `lee`, `Li`, `bi`, `hi`, `i`, `ii`, `lib`, `lid`, `lie`, `lip`, `liq`, `lit`, `lix`, `pi`, `ti`, `vi`, `xi`, `L`, `La`, `Lb`, `Le`, `Lg`, `Lin`, `Liz`, `Ll`, `Ln`, `Lo`, `Lr`, `Ls`, `Lt`, `Lu`, `lei`, `lvi`, `lxi`, `AI`, `ALI`, `Ci`, `Di`, `ELI`, `GI`, `IL`, `LC`, `LP`, `Ni`, `RI`, `Si`, `WI`?  retext-spell  retext-spell

Thanks

tbroadley commented 4 years ago

Hey, sorry for the delayed response. It looks like the retext-spell spellchecker (which this project uses to detect spelling errors) is ignoring the angle brackets around the HTML tag names. For example, it's flagging ul as misspelled, not <ul>, so a regular expression like <.*> can't be used to ignore these cases. Instead, I would suggest using a personal dictionary to ignore HTML tag names.
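
As a rough sketch of that suggestion: the snippet below first shows why a pattern such as <.*> cannot match the flagged word (the report contains ul, not <ul>), and then writes a small personal dictionary listing the tag names. The file name html-tags.txt is hypothetical, and the --dictionaries option is assumed to be available in the installed version of spellchecker-cli; check spellchecker --help before relying on it.

```shell
# The checker reports the bare word "ul", so a whole-word match
# against <.*> fails: there are no angle brackets left to match.
if printf 'ul\n' | grep -Eqx '<.*>'; then
  match="matched"
else
  match="no match"
fi
echo "$match"

# Personal dictionary with one tag name per line.
# html-tags.txt is a hypothetical file name chosen for this example.
cat > html-tags.txt <<'EOF'
ul
ol
li
EOF

# Assumption: the CLI accepts --dictionaries <file>. The call is skipped
# when spellchecker is not installed, so the snippet stays runnable.
if command -v spellchecker >/dev/null 2>&1; then
  spellchecker --files '**/*.xml' --dictionaries html-tags.txt
fi
```

Any other tag names that show up in the report can be appended to the same file, one per line.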