rycus86 / githooks

Githooks: per-repo and global Git hooks with version control
MIT License
380 stars 19 forks source link

Shell escaping issues #158

Open mjk-gh opened 3 years ago

mjk-gh commented 3 years ago

Using https://www.shellcheck.net/ to check for githooks' POSIX compliance (inspired by #157), shellcheck spat out a few warnings regarding backslashes in double-quoted strings for version 2103.272039-8ab988:


In githooks line 160:
            SHARED_LOCAL_REPOS_LIST=$(grep -E "^[^#\n\r ].*$" <"$(pwd)/.githooks/.shared")
                                                   ^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
                                                     ^-- SC1117: Backslash is literal in "\r". Prefer explicit escaping: "\\r".

In githooks line 602:
            SHARED_REPOS_LIST=$(grep -E "^[^#\n\r ].*$" <"$(pwd)/.githooks/.shared")
                                             ^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
                                               ^-- SC1117: Backslash is literal in "\r". Prefer explicit escaping: "\\r".

In githooks line 910:
    if echo "$1" | grep -qE "\-\-(shared|local|global)"; then
                             ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".
                               ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".

In githooks line 977:
    if echo "$1" | grep -qE "\-\-(shared|local|global)"; then
                             ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".
                               ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".

In githooks line 1034:
            if echo "$LINE" | grep -qE "^[^#\n\r ].*$"; then
                                            ^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
                                              ^-- SC1117: Backslash is literal in "\r". Prefer explicit escaping: "\\r".

In githooks line 1213:
            SHARED_REPOS_LIST=$(grep -E "^[^#\n\r ].*$" <"$(pwd)/.githooks/.shared")
                                             ^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
                                               ^-- SC1117: Backslash is literal in "\r". Prefer explicit escaping: "\\r".

In githooks line 1267:
        SHARED_HOOKS=$(grep -E "^[^#\n\r ].*$" <"$(pwd)/.githooks/.shared")
                                    ^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
                                      ^-- SC1117: Backslash is literal in "\r". Prefer explicit escaping: "\\r".

In githooks line 1305:
    if echo "$1" | grep -iEq "^\s*file://"; then
                               ^-- SC1117: Backslash is literal in "\s". Prefer explicit escaping: "\\s".

In githooks line 2336:
        if echo "$2" | grep -qE "\-\-(local)"; then
                                 ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".
                                   ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".

In githooks line 2347:
        if echo "$2" | grep -qE "\-\-(local)"; then
                                 ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".
                                   ^-- SC1117: Backslash is literal in "\-". Prefer explicit escaping: "\\-".

I just tested for the first error:

$ cat << EOF > unix
# This is a comment

once
upon
a
time

there
was
an empty
line
EOF
$ cp unix dos
$ unix2dos dos
$ posh   # optional, result is the same with bash
$ grep -E "^[^#\n\r ].*$" < unix
once
upon
a
time
there
was
an empty
line
$ grep -E "^[^#\n\r ].*$" < dos

once
upon
a
time

there
was
an empty
line
$

Seems legit. :-} Normal grep cannot match occurrences of '\n' anyway (except '$' for the end of the line), only with "--perl-regexp", but I doubt that this is POSIX compliant.

For above grep expressions in lines

a working portable/POSIX construct would be:

cat $FILE | tr -d '\r' | grep -v '^\(#\|$\)' 

(mind the single quotes!) or for UUoC award fappers:

tr -d '\r' < FILE | grep -v '^\(#\|$\)' 

(mind the single quotes!)

Or with "-E" to forgo the backslashes, just don't know how portable that option is.

If the OCD part in you gets annoyed by shellcheck's warnings for expressions linke

grep -qE "\-\-(shared|local|global)"

in lines

(which are technically correct, but useless, as shellcheck doesn't get the semantics), I would instead use

grep -qE '[-]-(shared|local|global)'

or just use the original expressions, but in single quotes.

Use of "stop option processing"-option -- would have been ideal, but is most likely not portable.

In general, I would use single quotes (´) instead of double quotes (") for anything other than pure [A-Z0-9_ ], except if I am sure that I want the shell to process characters with special meaning, like dollar ($), backtick (`), or backslash (\).

P.S.: Sorry for the humongous amount of text for such a teensy issue, I got carried away. :-}

rycus86 commented 3 years ago

Hm, that's odd, CI runs shellcheck here: https://github.com/rycus86/githooks/blob/master/tests/test-rules.sh#L28 which is configured to run shellcheck here: https://github.com/rycus86/githooks/blob/master/.githooks/pre-commit/shellcheck#L15

What settings did you run shellcheck with?

mjk-gh commented 3 years ago

OMG. I used my distro's version of spellcheck, which is much older (0.5) than the current release (>= 0.7).

Using a current binary of spellcheck, checking the githooks script reveals ... zero warnings.

But hey, at least it pointed us to an erroneous grep expression! :-}

rycus86 commented 3 years ago

I'm not super sure if it's actually erroneous? Avoiding single quotes was a design choice when it was still embedding other scripts in the install.sh file, though it might not be relevant anymore. Still, most of those greps would have been like that forever I think, I'd find it odd if they would be severely broken, but I could totally be wrong.

I might need to double-check check under what conditions and distros and whatnot would that cause problems - unless you know it? :)

mjk-gh commented 3 years ago

These expressions might not have caused any problems in the past, but they are erroneous in the sense that they do not do what they are intended to do.

grep -E "^[^#\n\r ].*$" obviously intends to also remove empty lines, which does not work if the file contains CRLFs:

$ grep -E "^[^#\n\r ].*$" < dos

once
upon
a
time

there
was
an empty
line
$

Piping that through od -ta reveals that the "empty" lines are not just LFs, but CRLFs instead:

$ grep -E "^[^#\n\r ].*$" < dos | od -ta
0000000  cr  nl   o   n   c   e  cr  nl   u   p   o   n  cr  nl   a  cr
0000020  nl   t   i   m   e  cr  nl  cr  nl   t   h   e   r   e  cr  nl
0000040   w   a   s  cr  nl   a   n  sp   e   m   p   t   y  cr  nl   l
0000060   i   n   e  cr  nl
0000065
$

While githooks itself might be able to deal with the left-over "empty" lines, quite a few unix tools have failed on CRLF input in the past (cannot give any examples, because as soon as I get these files and my wonderful scripts break, I either scold the author of the file, or, as I get older and wiser, I make my scripts more resilient by using tr -d '\r' :-))

double-check check

Good idea. (SCNR ;)

mjk-gh commented 3 years ago

And about single vs. double quotes, I would only replace double quotes with single quotes, if the double quotes are actually problematic. So far, the only problem we had were warnings by an outdated version of shellcheck. I think we can live with that. :)

gabyx commented 3 years ago

@mjk-gh : There is a new Githooks executable (written in Go compiled to native) which does not have these problems. https://github.com/gabyx/githooks. I won't maintain the Shell version anymore, because its implementation was a superb guiding stone but became unmaintainable and had so many subtle stupid corner edges IMO. Any contributions are highly welcome. @rycus86 will probably fix bugs and stuff in this version if he has time.