mvdan / sh

A shell parser, formatter, and interpreter with bash support; includes shfmt
https://pkg.go.dev/mvdan.cc/sh/v3
BSD 3-Clause "New" or "Revised" License
6.99k stars 333 forks source link

Disable formatting for file or range of lines #963

Closed raxod502 closed 1 year ago

raxod502 commented 1 year ago

Hi, I've encountered a situation where shfmt breaks a script by replacing working code with non-working code. This is problematic because if a user's editor enables shfmt then it will break their code.

The code in question:

if (( $+commands[gln] )); then
    ln=gln
fi

Which is reformatted to:

if (($ + commands[gln])); then
    ln=gln
fi

As such, I'm submitting a feature request for there to be some way to indicate in a file that it should be left alone by shfmt, or that a particular region of code should be left alone. This feature was previously requested in https://github.com/mvdan/sh/issues/602, but that issue was closed. I am opening a new issue because I don't believe that the reasons for closure cover all important use cases.

Previous discussion around allowing shfmt to be disabled for a region of code have usually ended with the statement that it would be better to fix whatever problem leads to the need to disable shfmt. However, the problem here is that this is a zsh script, and zsh is not likely to be fully supported anytime soon.

I anticipate that another objection is that the user should simply refrain from running shfmt on files which it is not compatible with. However, it is not reasonable for every user to know what files might experience problems in advance. For example, zsh scripts need not declare themselves as such, e.g. if they are meant to be sourced and therefore lack a shebang line.

It is much more sensible for the file to be annotated ahead of time with this information by whoever last worked on the file with a shfmt-capable editor, rather than every contributor having to find out for themselves. This is a standard pattern that is used in many code formatters and linters. For example, even in the notoriously uncompromising Black formatter for Python, this feature is supported.

Another objection is that we should just fix this specific issue. However, it is not the only issue. Other zsh-specific constructs cause problems for shfmt, and unless the plan is to support all of them (see https://github.com/mvdan/sh/issues/120), solving an arbitrary subset does not seem very useful to me.

With this feature, it would be safe to enable shfmt by default. This is what users of my Apheleia Emacs package do, for example. If shfmt cannot be safely enabled by default, I would have to remove support for it from Apheleia, which would be a step backwards for formatting standardization.

mvdan commented 1 year ago

Thanks for the detailed report. You anticipated pretty much all answers to this issue, so there's not that much for me to add :)

However, the problem here is that this is a zsh script, and zsh is not likely to be fully supported anytime soon.

Zsh support will be added. How soon is always tricky with OSS, as I don't want to give an ETA, but I'm working on it. It's especially hard to estimate given that zsh has no spec, so the amount of work is potentially huge. However, I do intend to have an experimental, and probably incomplete, version out soon.

I anticipate that another objection is that the user should simply refrain from running shfmt on files which it is not compatible with.

That is definitely your answer in the meantime. If you have a file containing zsh code, the file should say somewhere that it is zsh and not POSIX shell or Bash. If you simply call the file foo.sh and give it no shebang, in my opinion that's a bad idea. Editors and tools will have to guess at what shell syntax you're using, and many guesses will be wrong.

You can either call your file foo.zsh, or use a shebang like #!/usr/bin/zsh. Either should make shfmt notice that it doesn't support that shell language. Perhaps I'm missing something about your use case, but this feels very reasonable to me.

For example, even in the notoriously uncompromising Black formatter for Python, this feature is supported.

Interesting that other formatters support this kind of feature, I wasn't aware. I'm still reluctant to add it if its only purpose is a temporary measure until zsh is supported. Especially because I'll be left having to maintain and document the workaround practically forever.

raxod502 commented 1 year ago

You can either call your file foo.zsh, or use a shebang like #!/usr/bin/zsh. Either should make shfmt notice that it doesn't support that shell language.

In my testing, a zsh shebang does not prevent the file from being formatted:

% head -n1 radian-link
#!/usr/bin/env zsh
% shfmt -i 4 < radian-link > radian-link.formatted
% git diff --no-index radian-link radian-link.formatted                                                          
diff --git 1/radian-link 2/radian-link.formatted
old mode 100755
new mode 100644
index 3d1401f..356a0ba
--- 1/radian-link
+++ 2/radian-link.formatted
@@ -14,7 +14,7 @@ function panic {
     die "internal error: $@"
 }

-if (( $+commands[gln] )); then
+if (($ + commands[gln])); then
     ln=gln
 else
     ln=ln

Should I open a separate issue for that, or did I misunderstand something? I notice that the shebang you mentioned was #!/usr/bin/zsh, while mine is #!/usr/bin/env zsh (since the path to zsh varies on different systems, e.g. Linux versus macOS). Does that break the detection? Or is it perhaps an issue with when the file is being read from stdin?

mvdan commented 1 year ago

The shebang should support both; see https://github.com/mvdan/sh/blob/698a9680cd27b9a574ec6834e37c21bc85fd9f6a/fileutil/file.go#L16.

What I didn't catch is that your use case is formatting a file explicitly. I should have clarified that, when I spoke about shfmt skipping over shell files it doesn't support, that only happens when walking directories. You can try that with shfmt -l -w ., for example, which should format all shell files it can find - but it should skip any files detected as zsh.

When shfmt is given a file, either via stdin or as an argument, it always formats that file - it is being passed explicitly, after all, so it can't refuse. The shebang or filename may still be used to automatically detect the shell language, which defaults to Bash. In your case, we see zsh, but we don't support it, so we fall back to Bash.

One option for stdin or explicit files could be to always error if we see a shell shebang or filename extension which we don't support. At least that would be better than potentially breaking a script due to mismatched parsing or formatting. We would have to agree on some sort of standard error, though, because I imagine tools like yours want to then understand that the file can't be formatted instead of showing the error to the user.

Another option would be for you not to call shfmt on files which it doesn't support. Some editor integrations already do this; they only call shfmt on files which they detect to be POSIX shell, Bash, or mksh. This does mean you have to implement the detection, but arguably it's not particularly hard to implement for shell. For files with an explicit extension like .bash or .zsh we stop there; for any files with a .sh extension or no extension at all, we extract the shell from the shebang with the regular expression above.

Perhaps also worth noting that this could be a problem with other shell languages that implement their own somewhat-different syntax, like fish. But at least I'm not getting any feature requests about anything other than zsh.

raxod502 commented 1 year ago

Okay, that is fair. I will make downstream changes to ensure that shfmt is only run conditionally, for supported shells.