Closed jsoref closed 10 months ago
I am evaluating adding your spell checking action to our workflows, so that PRs like this can be avoided in the future. Thanks!
@jsoref, this goes far beyond my knowledge about GitHub actions. I have added your file to my "tip" branch but I am getting 2453 unrecognized words. You apparently didn't need to add all these words ... what am I doing wrong?
@mwilck: cherry-pick https://github.com/jsoref/multipath-tools/commit/d03eaebda1e5210bc6aa86f0d3e0180d1bae45cb
Or, to answer your question: I had to configure away ~700 of the words (by excluding files, and masking parts of lines with patterns) and then add ~1800 of them. The good thing is that adding additional items to the expect file is fairly trivial, so as you go it isn't a big deal -- the workflow helps you maintain that file (and as well as excludes, and to a lesser extent patterns and the workflow itself).
Ok ... at first sight, your commit seems to add more patterns than actually needed. I need to think more about this.
@bmarzins, @cvaroqui: do you reckon we should add this GitHub action and the respective code (see @jsoref's last comment above) to our repo?
Yes, there are a couple of boilerplate patterns that aren't truly necessary, and candidates is mostly a bootstrapping thing, although it means that if a project gains new stuff that fits a pattern it can be automatically suggested. (Note that you accidentally tagged someone else in https://github.com/opensvc/multipath-tools/pull/79#issuecomment-1878900065)
If you're going to do this, you should probably replace @prerelease
with @v0.0.22
-- prerelease has some features that I'm still working on/actively changing (especially the block-ignore feature -- I don't think it's needed for this repository).
Thanks. I'll wait for the other maintainers' thoughts.
I'm pretty ambivalent about this. I'm sure it will catch spelling mistakes. But I assume it will mostly just catch things that need to get added to the expect file, which at least appears easy to update. I don't see any of the boilerplate patterns hurting anything. We could always just limit spell-checking to things like the man pages, README.md, and the public library .h files. This would cut down on the busy work but still make sure the stuff visible to non-multipath-developers is spell-checked.
You can use only.txt
to pick specific files to check (as opposed to using excludes.txt
). If you have questions, I should be able to respond fairly quickly -- and if you have feedback, please feel free to send it to me.
I won't have time to work on the GH integration short-term. I will put it on a long-term todo list. Thanks to Josh's efforts, quite a few speling errors have been eliminated for now ;-)
@jsoref, I've been playing around with this a bit, strongly restricting the set of files scanned for misspellings like @bmarzins suggested. I found that check-spelling recommends adding words like the following:
+Bfile
+Bgreedy
+Bhardware
+Ioff
+Ioverrides
+Ipath
These words originate from groff escape sequences in man pages, like \fBgreedy\fR
. Is there any chance to make check-spelling recognize these escape sequences, so that \fgreedy\fR
is treaded as greedy
and then checked? We use only \f
, \n
and ¸\c
, so no full support for groff syntax is required :smile:
Those sequences that matter most are \fB
, \fI
, \fP
, \fR
.
Yep, you can copy this pair of patterns: https://github.com/check-spelling/spell-check-this/blob/9b0b6efb73b068fb39f5bb764def90742f74aab3/.github/actions/spelling/candidate.patterns#L430-L434
If you have examples beyond that set, I'm happy to adjust the thing -- certainly the patterns there have done a fairly reasonable job in my testing.
Another one
"%N:%R:%n:%r"\fR. For example: 0x200100e08ba0aea0:0x210100e08ba0aea0:.*:.* , .*:.*:iqn.2009-10.com.redhat.msp.lab.ask-06:.*
Here check-spelling sees the word aea
. Can we make it recognize the entire WWNN/WWPN 0x200100e08ba0aea0:0x210100e08ba0aea0 as something that shouldn't be spell-cheked at all?
Right candidate-patterns should be cover that as well
Do I have to copy the entire candidate.patterns
file or can I just pick those patterns that are relevant for us?
Do I have to copy the entire candidate.patterns file
No.
or can I just pick those patterns that are relevant for us?
Yes.
candidate.patterns is just a thing that will automatically suggest things, you can just copy the items you want into patterns.txt
, or you can copy any portions of the file into your candidates file (merging anything you've already commented out) to get additional things it'll suggest when it runs across them.
And yes, candidates has at least one hex pattern (probably the hex/color one).
With https://github.com/openSUSE/multipath-tools/commit/391767f3a0d13a148f643810c5a01cb5fce08b98, I eventually got :green_circle: for this workflow :grin:
Patterns that might be interesting for you:
# WWNN/WWPN (NAA identifiers)
\b(?:0x)?10[0-9a-f]{14}\b
\b(?:0x|3)?[25][0-9a-f]{15}\b
\b(?:0x|3)?6[0-9a-f]{31}\b
# iSCSI iqn (approximate regex)
\biqn\.[0-9]{4}-[0-9]{2}(?:[\.-][a-z][a-z0-9]*)*\b
Thanks, added as https://github.com/check-spelling/spell-check-this/commit/ee04617f306053e1bf0bed66effc71c4e10da77f
Note that I tend to rewrite commits in @prerelease
(I generally don't rewrite commits in @main
unless I accidentally push... which happens when I'm working w/o sleep -- oops), so the commit sha / description may change.
--- I'll definitely be adjusting the WWNN/WWPN patterns -- probably just merging them into a single pattern as check-spelling uses the comment before a pattern when it suggests the pattern to hint to readers about its purpose.
This is a follow-up to https://github.com/opensvc/multipath-tools/pull/37
Fixes misspellings identified by the check-spelling action.
The misspellings have been reported at https://github.com/jsoref/multipath-tools/actions/runs/7405969189/attempts/1#summary-20149678198
The action will report that the changes in this PR would make it happy: https://github.com/jsoref/multipath-tools/actions/runs/7405969286/attempts/1#summary-20149678465