RainbowHackerHorse / vzvol

vzvol is a general use ZFS zvol management tool, that handles creation, destruction, listing, and formatting with various FSes, in an easy to use single program
BSD 2-Clause "Simplified" License
31 stars 5 forks source link

Shell scripts contain errors #11

Closed yurivict closed 6 years ago

yurivict commented 6 years ago

Hello RainbowHackerHorse,

Thank you for submitting the FreeBSD port.

Shell scrips in this project contains a considerable amount of errors. Please use shellcheck to check your scripts. ex: for ss in `find . -name "*.sh"`; do shellcheck $ss; done;

All scripts that have the shebang #!/bin/sh are expected to be Bourne shell scripts.

Could you please correct the errors and create a release here? No need to update the FreeBSD patch. I will take it from here.

Thanks! Yuri

RainbowHackerHorse commented 6 years ago

@yurivict Yuri, Thanks for opening this issue! I appreciate your input, however:

I can assure you, shellcheck is wrong on most of these warnings or errors, and the ones it is not, I have ignored because they do not apply, or are mitigated elsewhere in the code.

For a more accurate readout, excluding shellcheck messages that either don't apply to our version of sh, or that I address here (like shellcheck not knowing how to look for functions and variables set in sourced files, even though those work fine during execution of the program), run: for ss in `find . -name "*.sh"`; do shellcheck -e SC2034,SC2039,SC2166,SC2181,SC2153,SC2143,SC2154,SC2010 $ss; done;

I actually run shellcheck live during my scripting, via SublimeLinter, in SublimeText3 on my FreeBSD 11.1-RELEASE laptop :) (Go inuxulator!) I'm aware of the warnings and errors as soon as they present themselves, and have made decisions accordingly.

I'd also like to note that during every stage of development, and before release, I extensively tested all the functions here, including running through the script with set -x and set -e on, ensuring that all edgecases i found, and all bugs I caught, were fixed after testing. This is a process I repeat now for every release, including point releases.

I invite you to manually test my script yourself, perhaps within a VM for safety, and see that it does indeed function as it should, under the default shell in FreeBSD. If you do find an error while testing, please don't hesitate to open a new issue, and it'll be fixed ASAP! :) I'll leave this issue open for now in case you'd like to respond to what I've said above. I'd also like to thank you for looking into my port submission. I've loved being a user of FreeBSD for years, and I'm very excited to finally, even if only in a small way, be able to give back to our community <3 -- Rainbow

freebsdfrau commented 6 years ago

Shell scrips [sic] in this project contains a considerable amount of errors.

I found no errors mentioned by shellcheck, only warnings (yellow) and suggestions (green), neither of which are errors which appear in red.

Please use shellcheck to check your scripts.

shellcheck is not a good static analyzer for FreeBSD shell code. I and Jilles (author/maintainer of our /bin/sh) are the resident experts. That being said, shellcheck like most utilities can still bring some value if wielded properly.

ex: for ss in `find . -name "*.sh"`; do shellcheck $ss; done;

When we tell mentees to lint their code before commit, we usually give them the most appropriate command that will strike at the heart of anything that makes our lives difficult without putting too much burden.

For example, I tell all src/port/doc committers to lint their man-pages with the following:

igor -Dgpxy FILE

and:

mandoc -mdoc -Tlint FILE # for 11.0+ with mandoc mdoc -Tlint FILE # for older releases lacking mandoc

I also preface the use of igor with a caveat that the request features tend to generate some noise on the BSD license if contained within a file (shown below):

FILE style check: ["the following" used 3 times] If something is following, the reader can see it without being told.

The point being made here is that if you're going to tell someone to run a tool (e.g., shellcheck, mandoc, mdoc, igor), that you tell them the best way to invoke it to minimize noise, and if you can't minimize all noise while continuing to check the important things, you either filter-out the noise yourself or share the knowledge of what is noise and what isn't.

As it stands, shellcheck without any additional arguments when used against FreeBSD shell code (when written to the extacting standards of Jilles and myself) is so noisy that it is almost worthless if you're going to use it as a PASS/FAIL metric.

All that being said, @RainbowHackerHorse has shared with me a set of command-line options for shellcheck that ought to be used to mitigate noise.

-e SC2034,SC2039,SC2166,SC2181,SC2153,SC2143,SC2154,SC2010

But let's back up a moment and take a look at some of the warnings produced by shellcheck.

^-- SC2034: errorfunc appears unused. Verify it or export it.

The "or export it" is bad advice to give to shell programmers. You should never blindly export variables just to satisfy some braindead static analyzer complaining about potentially unused variables. Exported variables count against your programs quota(1) and/or limit(1) limitations and you may find, if you export too much data, that you are suddenly no longer able to fork any programs from your script. Debugging such a case can feel like a denial-of-service issue because few shells accurately describe the issue when the limitation is hit (leaving you scratching your head).

This should be ignored as-suggested by @RainbowHackerHorse in the above shellcheck -e option.

^-- SC2039: In POSIX sh, read -p is undefined.

Here's the POSIX standard for read.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html#tag_20_109

This backs up shellcheck's claim that POSIX does not support -p prompt.

However, FreeBSD's /bin/sh is not POSIX shell, and is actually closest to dash (the Debian Almquist Shell). FreeBSD has supported -p prompt since the original CSRG import of BSD 4.4 Lite sources by rgrimes, performed almost 24 years ago.

https://svnweb.freebsd.org/base/vendor/CSRG/dist/bin/sh/miscbltin.c?limit_changes=0&view=log&pathrev=1556

The feature is so old, it can drink liquor!

Anywho, you can squelch these warnings by informing shellcheck to treat the shell scripts as dash with the -s dash command-line option.

I would actually do that instead of adding SC2039 to the -e parameter above as originally suggested.

So far we are at the following suggested invocation of shellcheck:

shellcheck -s dash -e SC2034 FILE

Addressing the other misnomers which may have appeared scary but are not:

^-- SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.

Oh, so shellcheck wants me to run the risk that if [ (pronounced test) is not a shell built-in that my code should run slower by utilizing mulitiple invocations of [ instead of one.

This error should be ignored. I would recommend SC2166 always be ignored.

^-- SC2181: Check exit code directly with e.g. 'if mycmd;', not indirectly with $?.

That's a legit warning. I'd keep that enabled.

^-- SC2010: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

That too is a legit warning. Keep it enabled.

^-- SC2153: Possible misspelling: VOLNAME may not be assigned, but ZVOL_NAME is.

Warning is OK. Wouldn't disable this one. Usually not that noisy and the spurious warnings, usually few, are easily ignored on a case-by-case basis.

^-- SC2154: errorfunc is referenced but not assigned.

I am ambivalent about this one. It can be helpful unless noisy. However, the amount of false identifications should be small and like the above, should be easily skipped.

^-- SC2010: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

This is a legitimate warning. One should only fork a grep to look at the filesystem when a built-in glob (see glob(7) on Linux or fnmatch(3) on FreeBSD) cannot do. For example, if you need to match on "contains", "starts with", or "ends with", then a globbing pattern will do and can be done much faster than a fork to grep.

^-- SC2046: Quote this to prevent word splitting. ^-- SC2143: Use grep -q instead of comparing output with [ -n .. ].

These were found to be legitimate.

Ok, so now we have a suggested invocation for shellcheck for FreeBSD shell code:

shellcheck -s dash -e SC2034,SC2166 FILE

I think that's pretty good to recommend. A lot better than just saying "run shellcheck".

Also, I found the following to be more helpful for larger projects:

for ss in `find . -name "*.sh"`; do printf "\033[34;1m==> $ss\033[39;22m\n"; shellcheck --color=always -s dash -e SC2034,SC2166 $ss; done 2>&1 > shellcheck.txt

Followed by

less -R shellcheck.txt

All scripts that have the shebang #!/bin/sh are expected to be Bourne shell scripts.

I didn't find a single bashism in @RainbowHackerHorse code. We should strive to point out real problems instead of imagined ones (read: make sure someone is using a bashism before you accuse them of such or go recommending such simply because it seems like a common problem to you).

Could you please correct the errors and create a release here? No need to update the FreeBSD patch. I will take it from here.

I have worked with @RainbowHackerHorse to resolve some of the real issues stated above (and some not), and will work with her by filing tickets.

I can commit to ports if/when it meets muster but will continue working with her to get it up to speed.

RainbowHackerHorse commented 6 years ago

Thank you @freebsdfrau for walking through things with me and helping me catch some stuff I did genuinely overlook or miss :)

Checking against master as of this moment, all of the aforementioned actual errors have been resolved, except:

SC2154: errorfunc is referenced but not assigned.

Again, this one is inaccurate as shellcheck just isn't seeing functions in sourced files in ./lib. I personally keep this off.

SC2181: Check exit code directly with e.g. 'if mycmd;',

In some cases, checking the error code the way I did was the only way I found to achieve the results I wanted. There is likely a better way to do it, and it may and likely will be fixed in a future release, but as it works at this time, it won't be updated by the next point release.

SC2010: Don't use ls | grep

This will be fixed after I finally get some sleep.

SC2153: Possible misspelling: VOLNAME may not be assigned, but ZVOL_NAME is.

Inaccurate warning in this case, as both variables are indeed assigned and I didn't fuck up my spelling. However, I could have, so it's good to double-check. :)

Now that we know what's good, what isn't, and what's awful but Works For Now, I'll be fixing SC2010, and then pushing out the next point release, 0.5.5

RainbowHackerHorse commented 6 years ago

and SC2010 is now fixed in master. The only outstanding issue is SC2181, and that's more a "you shouldn't really do this" thing, vs a "things dont work now" thing, so I'll be fixing that in a future release. As things stand, it should be good now :) Tagging 0.5.5, "You will cooperate, my son. I will make you, because I own you." (I'm not sure if you've noticed, but most of the releases are just really depressing Mr Robot quotes now)

yurivict commented 6 years ago

@RainbowHackerHorse and @freebsdfrau ,

Thanks for your attention to the matter and detailed explanations! I don't really take shellcheck very seriously, just decided to ask to make sure.

Cheers! Yuri

koalaman commented 6 years ago

ShellCheck author here! To clarify some of these issues (six months too late):

FreeBSD's /bin/sh is not POSIX shell,

Debian once said "/bin/sh is bash". Years later, they changed it to Dash.

This broke all the scripts -- system and users' -- that relied on Bash features while having a #!/bin/sh shebang. People could have saved themselves quite a bit of trouble by explicitly specifying the shell they need.

You can choose to hard code the assumption that /bin/sh will always be the shell it is right now, and that may be a fair and pragmatic design decision. You should just be aware that you're doing it.

The "or export it" is bad advice to give to shell programmers. You should never blindly export variables just to satisfy some braindead static analyzer

This is not to satisfy ShellCheck. This is if you thought the variable was in use by some program you call:

http_proxy="http://corpfwd/"   # Needs to be exported, otherwise it won't be used
wget "http://example.com"

so shellcheck wants me to run the risk that if [ (pronounced test) is not a shell built-in that my code should run slower by utilizing mulitiple invocations of [ instead of one.

ShellCheck doesn't want you to risk it being slower. It wants you not to risk it being wrong. Here's what POSIX says about the evaluation of [ -n foo -a -n bar ]:

>4 arguments: The results are unspecified.

See the rationale in POSIX test for more information. (If you chose to hard code the assumption about what /bin/sh is, then obviously you can disregard this -- although then you also know that [ is always builtin).

freebsdfrau commented 6 years ago

Debian once said "/bin/sh is bash". Years later, they changed it to Dash.

FreeBSD's /bin/sh has been FreeBSD's own /bin/sh for 25 years and that will never change (due to licensing).

This broke all the scripts ...

We don't have that problem in FreeBSD.

However, the problem we did have was @yurivict blindly stating to the void "All scripts that have the shebang #!/bin/sh are expected to be Bourne shell scripts." -- which might as well have been an insult considering that @RainbowHackerHorse didn't use a single bashism in her /bin/sh script.

You can choose to hard code the assumption that /bin/sh will always be the shell it is right now, ...

FreeBSD's /bin/sh is never going to change. That is not an assumption.

Devil's advocate: Even if Jilles and I were to replace it with something else, that something else would be 1/ BSD-licensed and 2/ backward compatible.

We don't pull stuff like: a. Deprecating SysV for systemd b. Deprecating ifconfig/route/arp/iptunnel/nameif for ip c. Deprecating iwconfig for iw d. Deprecating netstat for ss/ip e. Breaking "grep -r" by changing it to not follow symlinks and adding "grep -R" to get old behavior f. Breaking "tar xf file pattern*" to not work and adding --wildcards to get old behavior g. Breaking "find -perm +omode" and adding "-perm /omode" to get old behavior

If it works, we keep it. If it breaks, we fix it. If it becomes obsolete, we replace it with something backward compatible. If we can't do any of those, we delay changes until a major X.0 release and plaster statements everywhere along the roadside to give ample notification

You should just be aware that you're doing it.

When you write code for FreeBSD's /bin/sh you can rest-assured it will work everywhere else under /bin/sh and that is regardless of whether /bin/sh is bash, dash, ash, ksh, or whatever.

The "assumption" you speak of is only relevant to Operating Systems that install/link bash to /bin/sh

This is not to satisfy ShellCheck. This is if you thought the variable was in use by some program you call:

You should modify the error message to be more clear. Instead of saying "errorfunc appears unused. Verify it or export it." perhaps "errorfunc appears unused. export required if used externally"

Here's what POSIX says about the evaluation of [ -n foo -a -n bar ]: 4 arguments: The results are unspecified.

How is the result undefined exactly?

Not your fault that the POSIX page you linked to is brain-dead.

The results are not undefined. The results are very well defined.

I challenge you to find one example where the results are "undefined"

(If you chose to hard code the assumption about what /bin/sh is, then obviously you can disregard this -- although then you also know that [ is always builtin)

What do you mean by "hard code the assumption about what /bin/sh is"? This statement does not make sense to me. You either use /bin/sh and code for the lowest-common denominator, or you make a mistake.

It should be considered a mistake to use an invocation of /bin/sh and put any bashism in your script.

Bash scripts should use either #!/bin/bash or preferred #!/usr/bin/env bash

If you are writing a script for /bin/sh and you do not test on systems other than Linux (which use bash as /bin/sh), you're just asking for trouble.

It's not "hard coding an assumption", it's called "poor coding" and "poor testing".

Also, generically, if you don't know if you're about to use a bashism and /bin/sh is bash on your system, DO NOT use /bin/sh as your invocation. Just accept the fact that you don't know what you're doing and use an invocation of /bin/bash or /usr/bin/env bash -- because if you don't know if you've used any bashisms, your code will most assuredly not work in non-bash (so don't kid yourself).

koalaman commented 6 years ago

When you write code for FreeBSD's /bin/sh you can rest-assured it will work everywhere else under /bin/sh and that is regardless of whether /bin/sh is bash, dash, ash, ksh, or whatever.

Am I misreading this? A script that uses the non-POSIX features that FreeBSD sh provides is not guaranteed to work on other shells. The example ShellCheck warned about was read -p which fails in ksh93. Here's from the FreeBSD VM image:

root@:~ # uname -a
FreeBSD  11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017     
root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
root@:~ # cat testscript
read -p 'Foo: ' foo
root@:~ # sh testscript
Foo: hello
root@:~ # ksh93 testscript
testscript[1]: read: no query process

You should modify the error message to be more clear. [...] perhaps "errorfunc appears unused. export required if used externally"

Agreed.

The results are not undefined. The results are very well defined. I challenge you to find one example where the results are "undefined"

Again I think I'm misunderstanding you. POSIX very clearly states that behavior is unspecified (unless you're arguing that "undefined" != "defined to be unspecified"). Obviously each shell has its deterministic and predictable result when evaluating such expressions, but the point is that this result can differ between shells.

For example, here's FreeBSD 11 sh:

# [ \( a -o a \) ]
# echo $?
0

And here's Solaris 10 sh:

# [ \( a -o a \) ]
test: argument expected
# echo $?
1

If you follow POSIX (and ShellCheck) and compose expressions using shell constructs, e.g. ( [ a ] || [ a ] ), this would not be an issue.

It should be considered a mistake to use an invocation of /bin/sh and put any bashism in your script. You either use /bin/sh and code for the lowest-common denominator, or you make a mistake.

This is also ShellCheck's opinion, so I'm surprised that you seemed to argue against its POSIX warnings. Are FreeBSD-sh-isms better than bashisms when both cause the script to fail in other implementations of sh? If so, why?

freebsdfrau commented 6 years ago

The example ShellCheck warned about was read -p which fails in ksh93.

That may be so, but according to the POSIX standard for read, it should only support "-r" ...

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/read.html#tag_20_109

... and yet, ksh93 itself implements non-standard features. Here is the usage for read in ksh93:

Usage: read [-ACprsSv] [-d delim] [-u fd] [-t timeout] [-n count] [-N count]
            [var?prompt] [var ...]

So you see, it too supports a prompt:

ksh93# read "foo?Foo: "
Foo: abc
ksh93# echo $foo
abc

This is not a POSIX compliant feature either.

However, when it comes to non-compliancy, fixing an invocation of read to be portable is far different than rewriting your application because you used arrays or similar bashism.

And here's Solaris 10 sh:

The parens make a difference? Because shellcheck was complaining about [ test -o test ] being better composed as ( [ test ] || [ test ] ) but above you demonstrate [ \( test -o test \) ] which is different because it causes the [ builtin to receive/parse the parens in its argument space, fundamentally changing the interpretation of the line (wherein outer parens create a subshell).

Are FreeBSD-sh-isms better than bashisms when both cause the script to fail in other implementations of sh? If so, why?

Honestly? Yes. FreeBSD-sh-isms are few and far-between. You discovered "read -p prompt" which is not actually specific to FreeBSD, it just happens to not be supported by ksh (yet is supported by ash, dash, bash, and FreeBSD sh). That to me says it is more of a ksh-ism to support "read var?prompt" since bash, ash, dash, and FreeBSD sh do not support that.

What FreeBSD-sh-isms might you be referring to that are not implemented in ash, dash, and bash?

Because the number of things bash has that don't exist in ash, dash, or FreeBSD sh are so numerous that they can't be listed here.

freebsdfrau commented 6 years ago

And here's Solaris 10 sh:

I tested with ksh93 on FreeBSD which replicated the test: argument expected error. However, my note about the parens is correct. If you remove them, it works as-expected and thus POSIX statement that the results are "unspecified" or "undefined" does not hold water. What's unspecified about "[ -n a -o -n b ]" (the same as "[ a -o b ]":

ksh93# [ a -o b ]
ksh93# echo $?
0
ksh93# [ "" -o "" ]
ksh93# echo $?
1
ksh93# [ -n "" -o -n "" ]
ksh93# echo $?
1
ksh93# [ -n a -o -n b ]
ksh93# echo $?
0

I make it a general rule to not use parens in the argument space of the [ builtin as it is not portable. Solaris and ksh93 demonstrate this.

^-- SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.

ShellCheck doesn't want you to risk it being slower.

If you follow POSIX (and ShellCheck) and compose expressions using shell constructs, e.g. ( [ a ] || [ a ] ), this would not be an issue.

Actual performance tests:

$ time sh -c 'n=250000; while [ $n -gt 0 ]; do ( [ a ] || [ a ] ); n=$(( $n - 1 )); done'

real    0m46.874s
user    0m7.524s
sys 0m36.931s
$ time sh -c 'n=250000; while [ $n -gt 0 ]; do [ a -o a ]; n=$(( $n - 1 )); done'

real    0m0.400s
user    0m0.396s
sys 0m0.005s

Of course, it's entirely because of the outer parens creating a subshell. If you don't need them, you can retain full speed:

$ time sh -c 'n=250000; while [ $n -gt 0 ]; do [ a ] || [ a ]; n=$(( $n - 1 )); done'

real    0m0.379s
user    0m0.371s
sys 0m0.007s

Of course, you cannot always just remove the parens if there are further boolean operands, depending on the operand. You can chain like-operands infinitum without using sub-shell parens to group them (e.g., test && test && test && test or test || test || test || test). It is only when you mix operands that you have to group them (e.g., ( test || test ) && test or ( test && test ) || test), however that comes at the extreme performance hit, and so should be avoided in-general and just separate the operands using braces (example below).

Using braces instead of parens saves you performance:

dteske@nplusfreebsd.sf vzvol $ time sh -c 'n=250000; while [ $n -gt 0 ]; do { [ a ] || [ a ]; }; n=$(( $n - 1 )); done'

real    0m0.379s
user    0m0.379s
sys 0m0.000s

Unfortunately, the average user won't know about this impact and thus when encountering SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined may unintentionally introduce a performance hit if they have to group operands upon splitting a test and mistakingly opt for parens. As it stands though, SC2166 is misleading (as you can see ksh93 works fine.

koalaman commented 6 years ago

On what system is ksh93 installed as /bin/sh?

OpenBSD , Solaris and AIX all use ksh-like shells as /bin/sh. read -p fails in them.

Do the parens make a difference?

Yes it does. The fact that we even had to ask is why it can be useful to stick to well defined behavior rather than try to keep track of all the various implementations.

And yes, in all cases of such a rewrite, we're replacing test parameters with shell syntax, because the shell syntax is well defined while the test -a, -o, and anything with >4 parameters is not.

We're not talking about any/all shells, but strictly ones utilized as /bin/sh in modern (read: maintained) operating systems. I don't find many systems that use ksh93 as /bin/sh

I absolutely see the business reasons for assuming /bin/sh is Ash-compatible, because it covers all of FreeBSD, GNU, BusyBox and macOS, while giving you some extra useful features.

The rationale in that case should be "we focus our efforts on what matters the most" and not "everything supports it".

Unfortunately, the average user won't know about this impact

Detecting unnecessary parentheses around test expressions are on the roadmap for ShellCheck ^^

freebsdfrau commented 6 years ago

The fact that we even had to ask is why

I was asking rhetorhically as in the socratic method.

it can be useful to stick to well defined behavior rather than try to keep track of all the various implementations.

I think there is some confusion between [ \( ... \) ] vs ( [ ... ] ) which are completely different. When [ runs in the former instance, it sees ( and ) in its $@ versus the latter which first creates a sub-shell and then runs [ inside it. These two syntaxes are not interchangeable. They may appear to do the same thing on the surface, but it fundamentally changes the way the parser invokes the command.

And yes, in all cases of such a rewrite, we're replacing test parameters with shell syntax, because the shell syntax is well defined while the test -a, -o, and anything with >4 parameters is not.

Are you trying to say that there are /bin/sh variants that have a [ that doesn't support -a or -o? I've never met one.

Detecting unnecessary parentheses around test expressions are on the roadmap for ShellCheck ^^

That would be great