mvdan / sh

A shell parser, formatter, and interpreter with bash support; includes shfmt
https://pkg.go.dev/mvdan.cc/sh/v3
BSD 3-Clause "New" or "Revised" License

syntax: consider a language variant for pre-POSIX shells #757

Open Julien-Elie opened 2 years ago

Julien-Elie commented 2 years ago

Hi all,

shfmt currently changes `command` into $(command). Couldn't an option be added to disable all kinds of syntax substitution, and therefore keep the script's existing syntax?

The rationale is that $(command) is not portable; for instance, it fails on Solaris 10:

$ showrev -c /bin/sh | grep version 
Command version: SunOS 5.10 Generic 142251-02 Sep 2010
$ echo $(echo test)   
syntax error: `(' unexpected

mvdan commented 2 years ago

The $() syntax is part of POSIX shell, and I imagine it has been that way for decades. The latest version I am aware of, published in 2008, actively discourages the use of backticks too.

What shell are you running? It seems to not be POSIX compliant, so then using shfmt in its POSIX mode doesn't seem to be what you want.
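
For readers comparing the two forms, here is a minimal sketch of how they relate in a POSIX shell. The variable names are illustrative only. Both forms substitute the output of a command, but nested backticks need the inner pair escaped, which is one commonly cited reason for preferring $( ).

```shell
#!/bin/sh
# Both forms run "echo test" and substitute its output.
old=`echo test`
new=$(echo test)

# Nesting: the inner backticks must be escaped with backslashes,
# while $( ) nests without any extra escaping.
nested_old=`echo \`echo inner\``
nested_new=$(echo $(echo inner))

echo "$old $new $nested_old $nested_new"
```

On a shell without $( ) support, such as the Solaris 10 /bin/sh shown earlier, only the backtick lines would be accepted.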

Julien-Elie commented 2 years ago

Indeed, all the modes remove backticks, so the POSIX mode won't be of help. I am speaking of the stock Bourne shell shipped with Solaris 10. Yes, I know it is 11 years old (2010), but it is still in use.

https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Shell-Substitutions.html Autoconf says the $(command) syntax "unfortunately is not yet universally supported", which is why I asked whether such substitutions could be disabled, when needed, by a special flag passed to shfmt. I would like my scripts to be as portable as possible, and therefore to be left unchanged. Of course, shfmt could still emit a suggestion for a syntax improvement, which would be great.
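
As an illustration of the portability concern, here is a hedged sketch of how a script might probe at run time whether the current shell accepts $( ). The variable names `probe` and `cmdsubst` are made up for this example, and the approach has not been tested on Solaris 10; it assumes that a syntax error raised by eval inside a subshell only terminates that subshell.

```shell
#!/bin/sh
# Probe in a subshell so that a syntax error cannot abort the main script.
# The command is passed to eval as a single-quoted string, so the outer
# shell never has to parse $( ) itself.
if (eval 'probe=$(echo ok); test "$probe" = ok') 2>/dev/null; then
    cmdsubst=posix
else
    cmdsubst=backticks
fi

echo "command substitution style: $cmdsubst"
```

A script that must run everywhere would more likely just use backticks throughout, which is the case a formatting-only mode would need to preserve.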

I really like all the formatting shfmt does. I'm just wishing for a mode where it does only formatting.

mvdan commented 2 years ago

I see. I guess this is then a feature request to add support for "Bourne shell" or other forms of very old shells that aren't POSIX compliant. How many of those are there, and how widely used are they?

I'm inclined to say that shouldn't be a priority, with the assumption that they are not common. The project implements POSIX at its core, anyway. I don't want the default to be keeping backticks as they are, because that goes directly against the current POSIX spec and would make the formatter less useful for the majority of users.

Julien-Elie commented 2 years ago

Sure, the default wouldn't change. The title of this feature request is "provide a way to disable all syntax substitutions", not to make that the default. I don't know whether there are other syntax changes (I've spotted only that one so far), but I thought other people might also want a formatting-only mode.

mvdan commented 2 years ago

Right, "add one more flag" is always an option, but I have to pay an increasing price for additional flags in the long term. Testing all edge cases with more flags becomes an exponentially harder problem. So, for now, I don't think that cost is worth it for a flag that just makes the formatter do less of what it's intended to do.

On the other hand, adding another language variant (AKA shell dialect) is easier, as it's already part of an existing flag, and isn't such an invasive change in terms of introducing more flag combinations to worry about. What I don't see is what we would call that dialect, other than pre-posix or sh-noposix. And whether adding that feature would be worthwhile - I believe you're the first person in years to bring up this topic.

Julien-Elie commented 2 years ago

Yes, an -ln=pre-posix option would indeed achieve that.
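
For context, a small sketch of how the existing dialect flag is used today. The helper name `check_format` is hypothetical. The `-ln` and `-d` flags are existing shfmt flags; the `pre-posix` value is only the proposal from this thread and is not assumed to exist.

```shell
#!/bin/sh
# check_format FILE: report whether FILE matches shfmt's POSIX-mode output.
# Returns 0 when the file is already formatted, or when shfmt is not installed.
check_format() {
    if ! command -v shfmt >/dev/null 2>&1; then
        echo "shfmt not found; skipping $1"
        return 0
    fi
    # -ln selects the language dialect; -d prints a diff and exits
    # non-zero if the formatter would change the file.
    shfmt -ln posix -d "$1"
}

# Hypothetical, per the proposal in this thread (not an existing value):
#   shfmt -ln pre-posix -d "$1"
```

It could be called as `check_format config.guess` on the maintainer's machine. With today's dialects, the diff would include the backtick-to-$( ) rewrite this issue is about.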

mvdan commented 2 years ago

What would we call such a variant in the Go API? PrePOSIX? Bourne? Is there some other name that accurately describes this kind of old syntax?

Are there any other old shells besides the Bourne shell which would fall under the same category?

The flag name would then follow that Go name, e.g. pre-posix or bourne.

Julien-Elie commented 2 years ago

PrePOSIX looks good to me. Better to be more generic than just Bourne. The point is that the option only reformats, without modifying any code.

Julien-Elie commented 2 years ago

Hi Daniel, FWIW I've just seen that last month the GNU Config project reverted its use of POSIX $( ) to classic backticks, in order to keep recognizing old machine types that have no POSIX shell, as well as Solaris 10. So there are still shell scripts in the wild that should be kept pre-POSIX.

https://git.savannah.gnu.org/cgit/config.git/commit/config.guess?id=d70c4fa934de164178054c3a60aaa0024ed07c91

mvdan commented 2 years ago

That's horrifying :) But good to know. I still lean towards doing this, I'm just currently busy with other open source work. Perhaps you would like to contribute a PR? Initial support would be as simple as adding a PrePOSIX variant that behaves like POSIX throughout the parser and printer, with the exception that command substitutions would use backticks.

SuperSandro2000 commented 2 years ago

dumb question: Does Go even work on that tombstone?

Julien-Elie commented 2 years ago

> dumb question: Does Go even work on that tombstone?

Yes, the shfmt source code is POSIX-compliant and builds fine. Adding support for leaving the code of a pre-POSIX shell script unchanged does not break Go (it already knows how to parse such shell scripts).

mvdan commented 2 years ago

I think Sandro meant whether a system stuck with a pre-POSIX shell can run a Go toolchain or Go programs, given that Go requires minimum kernel and libc versions, and often isn't tested on ancient systems even if they meet those requirements.

Julien-Elie commented 2 years ago

Ah, OK. The system that runs a Go toolchain or Go programs is not the same as the system that runs the pre-POSIX shell script. Take the config.guess script linked above: the GNU Config project maintains it (and can run shfmt on it if they want), but the script itself is meant to be executed on other systems.