koalaman / shellcheck

ShellCheck, a static analysis tool for shell scripts
https://www.shellcheck.net
GNU General Public License v3.0

SC2096: `/usr/bin/echo` interpreter trick #2110

Open kartynnik opened 3 years ago

kartynnik commented 3 years ago

I prefer to notify users that a particular script is to be sourced, not executed on its own, via the following echo trick.

For feature requests:

Here's a snippet or screenshot that shows the problem:

#!/usr/bin/echo This script is supposed to be sourced like this: .
# shellcheck shell=sh

# variable and function definitions to be imported via sourcing

Here's what shellcheck currently says:

Line 1:
#!/usr/bin/echo This script is supposed to be sourced like this: .
^-- SC2096: On most OS, shebangs can only specify a single parameter.

Here's what I wanted or expected to see:

No errors detected!

As it can be seen, independently of the splitting behavior of the shebang arguments, echo works as expected here. I agree this is a marginal issue, but would be nice to have.
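A rough sketch of why the splitting behavior makes no difference for echo: the snippet below does not rely on the kernel at all, it simply calls echo once with the message as a single argument and once with the message split into words, and compares the results. The path ./env.sh is a made-up script name used only for illustration.

```shell
#!/bin/sh
# Hypothetical script path, used only for illustration.
script=./env.sh
msg='This script is supposed to be sourced like this: .'

# Case 1: everything after the interpreter arrives as ONE argument.
unsplit=$(echo "$msg" "$script")

# Case 2: the shebang line is split on whitespace into several arguments.
# $msg is left unquoted on purpose to simulate that split.
# shellcheck disable=SC2086
split=$(echo $msg "$script")

# echo joins its arguments with single spaces, so both cases look the same.
[ "$unsplit" = "$split" ] && echo "identical: $unsplit"
```

This only holds because the message contains single spaces and no characters that echo or the shell would treat specially.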

ChillerDragon commented 3 years ago

I do not know the echo trick but this does not throw shellcheck warnings:

#!/usr/bin/env echo
# shellcheck shell=sh

kartynnik commented 3 years ago

@ChillerDragon It doesn't because the original argument containing spaces is gone, but that undermines the whole idea. Now the script just outputs its name when executed rather than printing the helpful message (which, because the path to the script is appended as the last argument, would look like This script is supposed to be sourced like this: . script_name.sh). One could use #!/usr/bin/false as an alternative in this case, with a comment underneath, but it's less prominent and requires the user to notice the nonzero exit code.

P.S.

I do not know the echo trick

That's OK since I consider myself the inventor of this trick and wouldn't mind for it to spread to the masses :smiley:. Jokes aside, I do think it's useful to prominently inform the user when a script like activate from virtualenv is used not as intended.
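For the intended use, sourcing, the shebang line is just a comment to the shell. A minimal sketch, assuming mktemp -d is available (it is not POSIX); the file name env.sh, the variable GREETING and the function greet are invented for the example:

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1

# A file meant to be sourced, guarded by the echo shebang.
cat > "$tmp/env.sh" <<'EOF'
#!/usr/bin/echo This script is supposed to be sourced like this: .
# shellcheck shell=sh
GREETING=hello
greet() { echo "$GREETING, $1"; }
EOF

# Sourcing ignores the shebang and imports the definitions.
. "$tmp/env.sh"
result=$(greet world)
echo "$result"

rm -rf "$tmp"
```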

cmplstofB commented 3 years ago

file name is script_name.sh

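#!/bin/sh

# Name this file is expected to have when it is executed directly.
EXE_NAME='script_name.sh'
# Command line of the current shell process, e.g. "/bin/sh ./script_name.sh".
NOW_EXE=$(ps -p $$ -o args=)

case "$NOW_EXE" in
  *"$EXE_NAME"*) echo "Executed as a file." ;;
esac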

from: https://senooken.jp/post/2016/12/18/

kartynnik commented 3 years ago

@cmplstofB There is an assumption that the script is never renamed, which can be rather strong depending on the use case. Unfortunately there seems to be no portable way of performing such detection reliably on a POSIX shell without extensions, as illustrated by the discussion here: https://stackoverflow.com/questions/2683279/how-to-detect-if-a-script-is-being-sourced.

(Technically, it will also give a false positive if included, maybe even transitively, from another script_name.sh, however that probably is of marginal importance.)
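Both weaknesses can be seen without involving ps at all, since the check boils down to a substring match on a command line. In this sketch the helper names looks_executed and report are hypothetical, and the command lines are made-up stand-ins for what ps -p $$ -o args= might return:

```shell
#!/bin/sh
# looks_executed CMDLINE NAME: succeeds if NAME occurs anywhere in CMDLINE,
# which is the same test the ps-based snippet performs.
looks_executed() {
  case $1 in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}

report() {
  if looks_executed "$1" script_name.sh; then echo executed; else echo sourced; fi
}

direct=$(report '/bin/sh ./script_name.sh')           # the expected case
renamed=$(report '/bin/sh ./renamed.sh')              # same file after a rename
nested=$(report '/bin/sh /other/dir/script_name.sh')  # a different script with the same name

echo "direct=$direct renamed=$renamed nested=$nested"
```

The renamed case is reported as sourced even though it was executed, and the nested case is reported as executed even if that other script merely sources ours.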

kartynnik commented 3 years ago

(One could say that shebangs are also non-POSIX, which is true. But if shebangs don't work, they don't work for #!/bin/sh either, so there's less risk in incorrect script execution - unless you explicitly run it as an argument to the shell, that is.)
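The caveat about passing the script as an argument to the shell can be sketched as follows. The file name guarded.sh is invented, mktemp -d is assumed to exist, and the result relies on the common (though, per POSIX, unspecified) behavior of sh treating a leading #! line as a comment:

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1

cat > "$tmp/guarded.sh" <<'EOF'
#!/usr/bin/false
# The shebang above is meant to stop direct execution.
echo "body executed"
EOF

# The file is handed to sh as an operand, so the kernel never looks at the shebang.
out=$(sh "$tmp/guarded.sh")
echo "$out"

rm -rf "$tmp"
```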

senooken commented 3 years ago

Hi. Thanks for reading my blog. Let me add a comment for your information.

Also, the shebang is not part of POSIX. I think we should not use a shebang in a portable POSIX shell script. For example, Android has no /bin/sh, and Solaris 10 and Unixware use the old non-POSIX Bourne shell as /bin/sh.

Instead of a shebang, put : or any character other than # as the first character of the file (an empty line is OK). If there is no shebang and the file is not an ELF binary, the file is run as a shell script (fallback behavior). This is POSIX-defined behavior. A leading # makes an interactive csh interpret the file as a csh script; this is specified in the csh manual. That is all. Bye.

kartynnik commented 3 years ago

@senooken Right, I have mentioned that shebangs are non-POSIX. Isn't there a problem though that the fallback attempt at executing as a shell script will be performed with the user's default shell which can be whatever, like fish or even xonsh?

senooken commented 3 years ago

@kartynnik Hi. As you know, csh is non-POSIX. I am not familiar with xonsh, but fish is apparently non-POSIX too. For example, fish's command builtin has no -p option, and old versions of fish have no export builtin.

I think csh, fish and so on are like cmd.exe on Windows. They are not POSIX sh, although they share a small amount of functionality with it.

FYI, fish cannot run a script without a shebang by file name.

fish -c ./no-shebang.sh
Failed to execute process './no-shebang.sh'. Reason:
exec: Exec format error
The file './no-shebang.sh' is marked as an executable but could not be run by the operating system.

This is a fish-only problem (because fish is not a POSIX sh).

Here is no-shebang.sh:

ps | grep $$
echo $0
ps -p $$ -o args -o comm

$ awk 'BEGIN{system("./no-shebang.sh")}'
  36721 pts/1    00:00:00 sh
./no-shebang.sh
COMMAND                     COMMAND
/bin/sh ./no-shebang.sh     sh

Some shells (bash, ksh2020) run such a script with themselves. But many shells use sh, and the system (OS) uses sh.

If the shell itself is a POSIX-compatible sh, it is OK even if sh is not the one used.

csh, fish, cmd.exe and so on are not POSIX shells, and the path /bin/sh is not specified by POSIX either.

If the user uses a non-POSIX interactive shell (csh, fish, cmd.exe etc.), they should invoke a POSIX sh with the script as its first argument (like sh no-shebang.sh).

It is only by luck that the shebang #!/bin/sh works from a non-POSIX shell.

kartynnik commented 3 years ago

@senooken Turns out that non-shebang script invocation really gets handled by the shell rather than the OS (and the awk example doesn't count since system() is about invoking the user's default shell, like running $SHELL -c):

$ python -c 'import os; os.execl("./no-shebang.sh", "./no-shebang.sh")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/os.py", line 314, in execl
    execv(file, args)
OSError: [Errno 8] Exec format error

Linux's execve(2) man page corroborates this:

execve() executes the program pointed to by filename. filename must be either a binary executable, or a script starting with a line of the form: #! interpreter [optional-arg]

But my point was that the user is free to change their interactive shell to whatever they like (via chsh - say, if zsh or exotic shells like fish or xonsh are installed, they do appear in /etc/shells), it need not be a POSIX-compatible one. So relying on non-shebang scripts being run under a Bourne-like shell is dangerous. (Maybe exactly because of intended Bourne shell incompatibility, fish does not implement this functionality for non-shebang scripts since they are unlikely to be fish scripts and what is the default Bourne-like shell on a given system is unclear.)

senooken commented 3 years ago

@kartynnik Sorry, the awk system() example was not a good one, because awk's system() uses C's system(), which runs sh -c (refer to awk, system).

I referred to the following POSIX pages.

My past research: What is shebang (#!/bin/sh) in POSIX shell script – senooken.jp (Japanese).

I will not comment on Linux's execve(2), because it is outside the scope of POSIX.

According to 1.e.i.b, if a file is executed from sh, execl() is used. But if execl() fails with [ENOEXEC], sh runs the file with sh again.

Otherwise, the shell executes the utility in a separate utility environment (see Shell Execution Environment) with actions equivalent to calling the execl() function as defined in the System Interfaces volume of POSIX.1-2017 with the path argument set to the pathname resulting from the search, arg0 set to the command name, and the remaining execl() arguments set to the command arguments (if any) and the null terminator.

If the execl() function fails due to an error equivalent to the [ENOEXEC] error defined in the System Interfaces volume of POSIX.1-2017, the shell shall execute a command equivalent to having a shell invoked with the pathname resulting from the search as its first operand, with any remaining arguments passed to the new shell, except that the value of "$0" in the new shell may be set to the command name. If the executable file is not a text file, the shell may bypass this command execution. In this case, it shall write an error message, and shall return an exit status of 126.
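The fallback quoted above can be observed with a sketch like the following. It assumes mktemp -d exists and that the temporary directory allows execution; the file name no-shebang.sh matches the earlier example. Whether the direct invocation succeeds, and which shell ends up running the file, depends on the invoking shell, so only the explicit sh invocation is something to rely on:

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1

# A script with no shebang; the first character is ":" rather than "#".
printf '%s\n' ':' 'echo "ran without a shebang"' > "$tmp/no-shebang.sh"
chmod +x "$tmp/no-shebang.sh"

# Direct execution: a POSIX shell is expected to retry with sh after [ENOEXEC].
direct=$("$tmp/no-shebang.sh" 2>/dev/null) || direct='(direct execution refused)'

# Explicit invocation: works regardless of the fallback.
explicit=$(sh "$tmp/no-shebang.sh")

echo "direct:   $direct"
echo "explicit: $explicit"

rm -rf "$tmp"
```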

And if execlp and execvp execute a script from the OS directly (rather than execl), they have the same fallback behavior.

There are two distinct ways in which the contents of the process image file may cause the execution to fail, distinguished by the setting of errno to either [ENOEXEC] or [EINVAL] (see the ERRORS section). In the cases where the other members of the exec family of functions would fail and set errno to [ENOEXEC], the execlp() and execvp() functions shall execute a command interpreter and the environment of the executed command shall be as if the process invoked the sh utility using execl() as follows:

execl(<shell path>, arg0, file, arg1, ..., (char *)0);

You are right. If execl executes a no-shebang script directly, it fails. If the script has a shebang, it may avoid the failure.

But in almost all cases, the file is executed by sh. And the user can use execlp and execvp instead of calling execl directly.

I am not familiar with execl. Where is execl used directly?

Users may use any shell. That is OK. This also includes non-POSIX shells (fish, xonsh, cmd.exe, powershell).

But if a non-POSIX shell runs a file, the behavior is determined by that shell because it is non-POSIX (including the shebang behavior).

If the user uses a non-POSIX shell and runs a script by file name, I think the user should instead invoke sh with the script as its first argument.

If a file starts with "#!" and is run from sh, the results are explicitly unspecified in POSIX (Shell Command Language).

  1. The shell reads its input from a file (see sh), from the -c option or from the system() and popen() functions defined in the System Interfaces volume of IEEE Std 1003.1-2001. If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.

I think it is a trade-off among supporting non-POSIX shells, Android, Solaris, Unixware, and POSIX compliance.

I believe a shell script without a shebang is a true POSIX shell script.

kartynnik commented 3 years ago

@senooken Wow, the two things from POSIX you mentioned are new and unexpected to me: that the exec*p functions are bound to invoke sh on non-executables and that the treatment of a #! sequence by sh is unspecified even if the script is passed directly as an argument or from stdin. This is very confusing, making any shebang-containing script non-POSIX indeed. I would expect that sh treats this line as a comment - that's why # is there in the first place. Funny that POSIX doesn't guarantee shebang support while at the same time mentioning it somewhere else just to derail it 😃.

Thank you so much for the comprehensive write-up!

rdebath commented 3 years ago

The #! is an executable file format magic number. It is used to work around that lovely "feature" of the exec*p() functions calling a particular variant of the Bourne shell when your script isn't a Bourne shell script suitable for any ancient shell that may be installed as the default.

The fact that it's not run by the shell is the reason that "splitting and globbing" doesn't occur.

Nevertheless, in this instance the SC2096 error is incorrect not because it's textually wrong, but because it makes no difference to the output of the echo command. The echo command will give the same output even on old Unix kernels that split the #! line.

This also applies to the -S option of the /usr/bin/env command which will (from at least 15 years ago) split on spaces if the kernel doesn't.

lifeModder19135 commented 2 years ago

@kartynnik Just use /usr/bin/env. It is a more robust option anyway. POSIX-based terminals are supposed to require that a shell is invoked explicitly on line 1 via a shebang, although it is up to the devs to decide whether to implement this, and many choose not to do so.

env has an option whose sole purpose is to let users pass whatever they desire in the shebang, finish up with a bash invocation, and still pass the requirements of all of the major shells. It even passes ShellCheck. The option is --split-string, and it isn't a workaround by any means. env is part of the coreutils package, which is built and maintained by GNU themselves, and not only is the option included for your specific case, the entire command is. Here is what the env(1) man page says on the subject:

**-S, --split-string=S**
 process and split S into separate arguments; used to pass multiple arguments on shebang lines
 ...
**-S/--split-string usage in scripts**
 The -S option allows specifying multiple parameters in a script. Running a script named **1.pl** containing the following first line:
#!/usr/bin/env -S perl -w -T
 will execute `perl -w -T 1.pl`.

 Without the `-S` parameter, the script will likely fail with:
        ```
        /usr/bin/env: 'perl -w -T': No such file or directory
        ```
 See the full documentation for more details.
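Applied to the echo trick, the same mechanism can be sketched without writing a real shebang by handing env the single argument a non-splitting kernel would pass. This assumes an env that supports -S (for example a recent GNU coreutils); the sketch checks for that first, and ./env.sh is a made-up script name:

```shell
#!/bin/sh
# Hypothetical path of the script being "executed".
script=./env.sh
# What a non-splitting kernel would hand to env as its one shebang argument.
shebang_arg='-S echo This script is supposed to be sourced like this: .'

if env -S 'true' >/dev/null 2>&1; then
  # env splits the string itself and then runs echo with the script path appended.
  out=$(env "$shebang_arg" "$script")
else
  out='env -S unsupported'
fi
echo "$out"
```

Where -S is supported, the output should match what the plain /usr/bin/echo shebang prints; where it is not, the shebang would fail outright, which is one reason this is less portable than it first appears.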

EDIT: Am I the only one who thinks the quote syntax looks too much like code blocks? I'm in dark mode; is it the same in light mode?

Never mind about it working with ShellCheck. It worked in the past, but of course, as soon as I brag about the feature, it breaks. This is a bug.