koalaman / shellcheck

ShellCheck, a static analysis tool for shell scripts
https://www.shellcheck.net
GNU General Public License v3.0
36.1k stars 1.76k forks source link

Use file extension to detect shell #1369

Closed ingydotnet closed 5 years ago

ingydotnet commented 5 years ago

Why doesn't shellcheck consider the file extension to decide which shell is in play? I couldn't find any issues related to this. I write a lot of bash libraries that have a .bash file extension. I don't have a good reason for a shebang in those files.

jasonkarns commented 5 years ago

👍 I agree. I typically follow the convention that executable scripts have a shebang, and sourced scripts have the appropriate .sh or .bash extension.

I would prefer to not add a shebang to scripts intended to be sourced, because it implies the scripts can be executed directly, which is not the case. I would love to see file extension used as a fallback for shell detection when a shebang isn't present.

ingydotnet commented 5 years ago

Perhaps we can learn from vim here, which considers (at least):

ie It might be valuable to also support a shellcheck directive like # shellcheck shell=ksh that has the final say.

matthewpersico commented 5 years ago

Um, https://github.com/koalaman/shellcheck/wiki/Directive#shell

ingydotnet commented 5 years ago

Oh cool. Thanks @matthewpersico.

That's not in man shellcheck for my 0.4.6 (and the wiki says it was added in 0.4.5). Only disable and source are listed as valid directives. Not sure if that's been fixed...

matthewpersico commented 5 years ago

Yeah,@ingydotnet, gotta get up to at least 0.5.0; lots of fixes and improvements over 0.4.x.

ingydotnet commented 5 years ago

BTW, I use 0.4.6 (by default) because that's what apt install shellcheck installs (by default) on Ubuntu 18.04. I'm not sure who is responsible for the man page. Perhaps Debian or Ubuntu. The copyright date in the man page is 2012-2015, so obviously not up to date for 2018.

ingydotnet commented 5 years ago

@matthewpersico sorry didn't see your last comment. I do also have a local install (via cabal) of 0.5.0 that some of my build systems use. It seems that cabal does not have man pages.

ngzhian commented 5 years ago

See #1317 , looks like Debian Sid has 5.0

jabberabbe commented 5 years ago

I could work on this, but what should the precedence be here? I think this could be an option:

  1. Directive
  2. Shebang
  3. File extension

I think that the shebang should have higher precedence than the file extension, because there are lots of bash scripts with #!/bin/bash shebang but .sh extension.

jasonkarns commented 5 years ago

I think that the shebang should have higher precedence than the file extension, because there are lots of bash scripts with #!/bin/bash shebang but .sh extension

Strong agree with the precedence order @jabberabbe suggested.

ingydotnet commented 5 years ago

As long as the file extension is considered, I'm happy.

koalaman commented 5 years ago

I have a collection of 4908 scripts from GitHub's popular projects, as identified as shell scripts by their search feature.

Of them:

So looking at the extension definitely seems to be a house rule and not an established convention.

jabberabbe commented 5 years ago

I've implemented this in my fork on branch iss1369-shell-from-file-extension and for who wants this functionality here is a patch that can be applied with git am on top of master: https://gist.github.com/jabberabbe/123fead56df37761cfcd38571d0e3aae

The second commit on my branch makes ShellCheck not emit SC2148 (missing shebang) and similar warnings about the shell type not being specified if it detects the shell from the extension.

@koalaman, I didn't understand if you want this feature to be merged or not. If you do, I can open a pull request today. Let me know :-)

koalaman commented 5 years ago

I do see the appeal, but since .sh is the de-facto extension when one exists, and 75% of them are bash not sh, the existing heuristic of defaulting to bash seems more accurate than guessing based on the extension.

This functionality would therefore probably be better off in a wrapper script or similar, for the minority that do use a .sh to signify "not bash", rather than being merged as described

jasonkarns commented 5 years ago

Has behavior changed since 0.6.0? Right now, without a shebang, one gets an error:

$ shellcheck etc/nodenv.d/install/autoalias.bash 

In etc/nodenv.d/install/autoalias.bash line 1:
recommended_aliases() {
^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.

For more information:
  https://www.shellcheck.net/wiki/SC2148 -- Tips depend on target shell and y...

I don't think relying on the "default to bash" heuristic is a valid option, because, while it does still generate output for errors, it also means the check will never succeed.

I've resorted to adding the shell directive, but that's only an option in files I control. (Not to mention the general distaste of adding tool-specific comments to my source code.) Hence why I think leveraging the extension as a fallback would improve the usability of shellcheck.

And supplying the shell via --shell CLI flag means replacing shellcheck glob with 5 distinct invocations, each with a distinct glob and matching CLI flag.

ingydotnet commented 5 years ago

My ask here, is that in lieu of any other hints, a file ending with .bash be parsed as Bash by shellcheck.

I think I see the disconnect here. I'm not talking about executable files written in Bash. I would never add .bash to a file that was a bin command. I would always use a shebang line, in a file with no extension.

However I write a ton of Bash libraries/modules (whatever you want to call them) that are not executable and only meant to be sourced by other bash programs. These files have no shebang line because there is no use for one.

It might be apropos to compare to Perl. In Perl, executable scripts have a shebang line and usually no extension. Sometimes they have a .pl extension. This is the same with Bash. Scripts sometimes have a .sh extension, but usually no extension.

However, all Perl (reusable) modules have a .pm extension, and all my (reusable) Bash modules have a .bash extension.

This same analog could just as well be applied to Python or Ruby (though they don't have a separate executable extension).

Here's a recent project of mine following this pattern: https://github.com/complete-shell/complete-shell

As you can see I'm using # shellcheck shell=bash as a workaround for now.

Again, all I want is that the .bash extension be honored when there is no other clue. Obviously if there is a .bash extension and you are linting it with shellcheck, the code inside is Bash.

PS @koalaman how many of the 4908 are non-executable?

jasonkarns commented 5 years ago

^ same. All rbenv/nodenv/pyenv (and all rbenv derivatives) follow this convention for all hook scripts that are sourced by *env

Searching across my computer:

$ find / -name '*.sh' -print0 | xargs -0 grep -EL '^#!' | wc -l
138
$ find / -name '*.bash' -print0 | xargs -0 grep -EL '^#!' | wc -l
52

So roughly ~200 files on my system that have .sh or .bash extension without a shebang. (And no more than a dozen of them are authored by me.) not a huge number, but not insignificant (considering how relatively rare it is for scripts to be only sourced and not executed)

koalaman commented 5 years ago

@jasonkarns You do get a warning that shellcheck had to guess, but the file is otherwise treated as if you ran with -s bash.

@ingydotnet Are you saying that this logic should apply only to .bash files and not to .sh files?

jabberabbe commented 5 years ago

I agree with @ingydotnet and @jasonkarns. The .sh extension is very generic - and I think it should not be used to determine the shell, although using it for executable scripts is not very good practice - but the extensions .{ba,da,k}sh are used in shell libraries (e.g. oh-my-zsh and similar plugin managers) and can be considered instead of the shebang, that shouldn't be used in non-executable shell files.

jasonkarns commented 5 years ago

@koalaman it's not just a warning; that causes shellcheck to exit non-zero which breaks any ci pipeline

On Wed, Jan 16, 2019 at 1:56 AM jabberabbe notifications@github.com wrote:

I agree with @ingydotnet https://github.com/ingydotnet and @jasonkarns https://github.com/jasonkarns. The .sh extension is very generic - and I think it should not be used to determine the shell, although using it for executable scripts is not very good practice - but the extensions .{ba,da,k}sh are used in shell libraries (e.g. oh-my-zsh and similar plugin managers) and can be considered instead of the shebang, that shouldn't be used in non-executable shell files.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/koalaman/shellcheck/issues/1369#issuecomment-454672456, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHUpMF2thbu6cJVLzNtkM0dhglI-Z-5ks5vDs0GgaJpZM4XyQNq .

ingydotnet commented 5 years ago

@koalaman Yes.

(Agreeing with @jabberabbe) I consider .sh to be generic (this file contains some kind of shell code), whereas .bash (and also .dash and .ksh I suppose, although I don't deal with them much) is specific.

ingydotnet commented 5 years ago

Just tried to make sure with latest cabal install:

> (shellcheck --version; echo 'echo "$RANDOM"   # Does this work?' > foo.bash; shellcheck foo.bash; echo "Exit status = $?")
ShellCheck - shell script analysis tool
version: 0.6.0
license: GNU General Public License, version 3
website: https://www.shellcheck.net

In foo.bash line 1:
echo "$RANDOM"   # Does this work?
^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.

For more information:
  https://www.shellcheck.net/wiki/SC2148 -- Tips depend on target shell and y...
Exit status = 1
koalaman commented 5 years ago

@jasonkarns I'm surprised you're finding so many shebang-less *.bash files. Where are they coming from?

It's totally plausible that my data set is severely skewed. Maybe GitHub search relies heavily on the shebang to recognize file types?

Out of my 91 *.bash files, 83 had a bash shebang, 6 had a bats shebang, and only 2 had no shebang and therefore would benefit from this change (and they're both from pyenv).

2/4908 is obviously not much to base a convention on. In that case I would prefer a find . -name '*.bash' -exec shellcheck -s bash {} + for anyone following it to avoid others seeing issues like:

200/4908 would be more like it though. How can we get a better idea?

jasonkarns commented 5 years ago

Based on a super brief perusal based on the head contents (and some best guess heuristics based on filename and path location)

for the .sh files without shebangs:

~30 shell tab completion scripts (sourced) ~40 scripts sourced from npm run-script defined in package.json as "foo": ". ./script/foo.sh" (I can't fathom why they're sourced instead of invoked 🤷‍♂️) ~30 build pipeline scripts (some via make, some via npm, or travis config) that are invoked via bash <script> ~30 git metadata scripts (lots that are for updating author or commit info presumably via filter-branch; unsure how they are invoked but they don't have shebangs. assuming they're cat-piped to git filter-branch unless git takes the script as an arg?) ~20 scripts under openssl, mozilla, and node directories that look roughly related to build/configure/test functionality ~12 tkConfig.sh and tclConfig.sh scripts that I don't understand

also found another case that indicates the shebang+execute vs extension+source pattern: https://github.com/rspec/rspec-support/tree/master/script (found these as part of the system search)

for .bash files without shebangs

~20 test helper scripts sourced by bats test suites ~12 shell completion scripts sourced by bash_completion ~10 config scripts shipped by tools like fzf, rg, ag, etc sourced by bashrc ~10 lib scripts sourced by the main executable script ~20 hook scripts provided by various rbenv and nodenv plugins (sourced by rbenv/nodenv)

So at least the shell_completion community, rbenv/*env community, bats-using community, rspec maintainers, and a few core npm modules (acorn, emscripten, browserify, and a couple others) seem to be consistent in using file extension for sourced files distinct from extensionless executed files.

A couple guesses for why we don't see much of this on github:

since we don't want the extension to override the shebang (or the shell directive or --shell arg), but to only be a final fallback, we only care about the small pool of extensioned files that don't have shebangs.

jasonkarns commented 5 years ago

I'm not sure I follow. In what scenarios are we assuming a .bash file would be executed by python or zsh?

I could see the extension fallback still emitting the warning as it does now; but if an extension is provided, that just allows SC2148 to not cause nonzero exit. That would still indicate to humans that a shebang or directive is preferred. But would allow those who do follow this convention to leverage shellcheck in CI as desired. (I recognize the code change for this might be convoluted, and wouldn't suggest it as a main goal. but it would thread the needle for providing maximal utility for this somewhat rare scenario)

Not sure I follow this scenario either. Are we assuming that the script is sourced in CI but executed in "prod"?

I think that file extensions being important for language tooling isn't going to come as a surprise for most developers. So seeing behavior change if a file has .N as the final suffix wouldn't be too shocking (and I think this would be exceedingly rare, considering we still recognize that the vast majority of scripts have the shebang which would take precedence)

🤷‍♂️

ingydotnet commented 5 years ago

Ultimately, when does a file have a .bash suffix and not contain Bash syntax? Is there a single case in the world?

I realize that I'm probably in a minority, but I use Bash as a full-on programming language and publish full-on frameworks in it. I've also published frameworks in Perl, Python, Ruby and NodeJS. In all cases I organize the code into classes or libraries. Any executables are just small entry points into the system.

Some Bash framework examples:

So in my case (using Bash as a programming language rather than for one off automation scripts) most code is going to be in non-executable files.

Remember when JavaScript was just a "scripting language" with bits here and there and no real organization? Now it has a thriving ecosystem. I hope to see more people using Bash in this regard, and shellcheck is one of the most useful tools in the Bash ecosystem.

But again, when is a .bash file not Bash? Certainly never in this brave new world! :-)

koalaman commented 5 years ago

Ok, I'm convinced. Let me have a look at the suggested fix later today.

@jasonkarns When executing a *.bash file without a shebang, most invocations including by Python and Zsh (and C and Dash) will use sh as the interpreter. Bash is the only one to use bash as an interpreter in this case. Problems due to missing shebangs is obviously a problem with the script and not the invoking program.

It's also not uncommon to keep a script around as foo.sh and then have an install script copy it to /usr/bin/foo. However, this is arguably more in favor of this approach than against it: clearly people felt it was more useful for the rest of their tooling to have an extension during development.

jabberabbe commented 5 years ago

So, should I add a commit on my branch to not consider .sh extensions and open a pull request?

jasonkarns commented 5 years ago

@koalaman ah, thanks for explaining!

koalaman commented 5 years ago

@jabberabbe Please do! It would also be nice if the extension check was what defaulted to Bash (when no other extension was found), rather than the explicit one in determineShell fallbackShell t = fromMaybe Bash

jabberabbe commented 5 years ago

@koalaman, the problem here is that the shellTypeSpecified field of the Parameters record, should be true if the shell type was determined from the extension, but if I change the type of the asFallbackShell to Shell instead of Maybe Shell (and set it to Bash in the shellFromFilename function) then there is no way to tell if the fallback shell was the one determined from the extension or just the default (Bash) in makeParameters (AnalyzerLib.hs). But these technical details should be discussed in a pull request and not in an issue. I'll give you push access to the branch of my fork so that you can do what you think is better -- I'm quite new to Haskell.

ingydotnet commented 5 years ago

Thanks all. Go shellcheck!

steshaw commented 5 years ago

This is no longer working. Should I raise a separate issue?

$ echo >foo.bash
$ shellcheck foo.bash

In foo.bash line 1:

^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.

For more information:
  https://www.shellcheck.net/wiki/SC2148 -- Tips depend on target shell and y...
$ shellcheck --version
ShellCheck - shell script analysis tool
version: 0.6.0
license: GNU General Public License, version 3
website: https://www.shellcheck.net

Perhaps it was deprecated?

ingydotnet commented 5 years ago

@steshaw this has not been released yet. 0.6.0 went out before this fix.

Freed-Wu commented 5 months ago

Modeline

eg # vim: ft=sh:

Now shellcheck have its modeline shellcheck: shell=XXX.

Can shellcheck recognized vim modeline? Just search ft=ksh, ft=sh, (usually in last line) ... to detect shell. It should be more convenient for those scripts with vim modeline -- users don't need to add new shellcheck modeline.

ingydotnet commented 5 months ago

I'm not sure this is a good idea. I recall it being common to see bash files (with no .bash extension) using ft=sh.

I also recall there not being a bash filetype, even though it seems to work for me now.

https://superuser.com/questions/926432/apply-full-vim-colorization-for-bash-scripts-that-have-no-shebang-line#comment1253226_926438 seems to back that up in 2015.