proposal: add option to wrap lines to fit in N columns on a best-effort basis

dnwe commented 7 years ago

e.g.,

#!/bin/sh

curl -s http://cdn.example.com/path/to/file.tgz | tar zxf - -C /usr/src --stdout && cd /usr/src && ./configure --prefix=/usr && make install

=>

#!/bin/sh

curl -s http://cdn.example.com/path/to/file.tgz \
    | tar zxf - -C /usr/src --stdout \
        && cd /usr/src \
        && ./configure --prefix=/usr \
        && make install

mvdan commented 7 years ago

I don't follow what you mean. Assuming you're talking about shfmt, if what you want is to break pipes and AND/ORs you can simply do it by hand and shfmt will keep the format.

If you're talking about shfmt always enforcing newlines, that would be annoying in most cases.

dnwe commented 7 years ago

@mvdan the proposal was to have an (optional) cmdline arg which would make shfmt attempt to wrap long lines to fit within 80 columns, line breaking at natural break points such as pipes, logic, ; etc.

mvdan commented 7 years ago

80 columns

If this is fixed to 80 columns, I'm opposed. This would be a holy war like the indentation one.

As for wrapping lines after N columns, I don't think I'm behind it either. Some lines won't be possible to wrap (e.g. heredocs, long strings). Some we can't know if the developer wants to wrap at all, like commands with lots of arguments.

And in some cases, the appropriate wrapping could be subjective to the developer's taste and style, and it's likely that the tool would not get it right.

Since this is something that you would run once, and not before every commit (like the regular shfmt), I'm inclined to decline this. A human could do a better job with their judgement, and this work need only be done once.

dnwe commented 7 years ago

Cool, sounds reasonable. Closing.

coriolinus commented 6 years ago

I have a use case for this: we're using a python program at work which parametrically generates a bunch of shell scripts which collectively manage a system which can be thought of as a black box. The scripts work as advertised, but they're not exactly human-friendly: the output tends to be files which are four lines long, each line of which is 500 characters wide.

It doesn't have to be default behavior--I'd be quite happy to see it locked behind command-line flags--but it would be great if shfmt were able to at least attempt to break lines in reasonable places. It may not always be possible to stay within a strict 80 chars, but it would be great if we could at least get these scripts into a state in which viewing them in portrait orientation makes more sense than landscape.

edit: Why ask shfmt to do this? If this tool is already building an AST, it's better-positioned to know where a reasonable line break would actually be than anything simpler.

edit 2: Why not just change the Python to generate prettier code? The Python script doesn't have an AST either; it just agglomerates options and flags and variables and commands as strings, adding things front and rear, until each line is complete.

mvdan commented 6 years ago

@coriolinus apologies that I'm replying a bit late - was on a short holiday when your comment was posted.

Your use case is interesting, I hadn't considered it directly. Have you tried running the existing shfmt on said source code? It should already do certain things, like not allow multiple commands on a single line, so I assume it should do something already.

Why ask shfmt to do this? If this tool is already building an AST, it's better-positioned to know where a reasonable line break would actually be than anything simpler.

I'm going to play the devil's advocate here; it's trivial to write a small Go program that would obtain the AST. For example, parsing standard input and obtaining the AST is as simple as syntax.NewParser().Parse(os.Stdin, ""). So I don't buy the argument that shfmt is better positioned to do this.

It would certainly be slightly less code to add it as a flag to the existing tool, but I don't want to make this tool a catch-all of features and somewhat random utilities. You could say that this feature belongs in a formatter, but I think it's a grey line. I'd like to see how far we can go without adding more flags to the nice and simple tool that we have so far.

mvdan commented 6 years ago

I'm going to temporarily reopen this for a couple of weeks, to see what omes out of it.

coriolinus commented 6 years ago

No worries about the delay; everyone is allowed to take vacations.

There's a slight difference in perspective that comes with experience: while I know some go, I haven't played around at all with its syntax package. It may in fact be simple to do on my own, but I'd have to do some research in order to get there.

It sounds like the key question, to you, is: "is automatic line breaking a core component of a code formatting utility, or a random feature?" I submit that it's a core component. autopep8 does it. rustfmt does it. gofmt does not, but then go is a particularly opinionated language, and per "effective go" there is no max line length because we all have infinitely wide screens in this modern world.

Put otherwise: in a perfect world in which every single task has a tiny unixy program which does exactly that task and no more, which program would you expect to be able to automatically break lines reasonably in shell scripts? I'd personally go looking for one named something like shfmt.

Switching topics: I have indeed already run shfmt on the generated output, but I don't remember off the top of my head what the results were. I'm confident that it didn't substantially reduce the average line length, because if it did, I wouldn't have gone looking for an issue like this. If you think it'll help, I could see about getting ahold of one of the diffs it produced and then redacting all the NDA'd stuff.

On Tue, May 1, 2018 at 5:16 PM, Daniel Martí notifications@github.com wrote:

@coriolinus https://github.com/coriolinus apologies that I'm replying a bit late - was on a short holiday when your comment was posted.

Your use case is interesting, I hadn't considered it directly. Have you tried running the existing shfmt on said source code? It should already do certain things, like not allow multiple commands on a single line, so I assume it should do something already.

Why ask shfmt to do this? If this tool is already building an AST, it's better-positioned to know where a reasonable line break would actually be than anything simpler.

I'm going to play the devil's advocate here; it's trivial to write a small Go program that would obtain the AST. For example, parsing standard input and obtaining the AST is as simple as syntax.NewParser().Parse(os.Stdin, ""). So I don't buy the argument that shfmt is better positioned to do this.

It would certainly be slightly less code to add it as a flag to the existing tool, but I don't want to make this tool a catch-all of features and somewhat random utilities. You could say that this feature belongs in a formatter, but I think it's a grey line. I'd like to see how far we can go without adding more flags to the nice and simple tool that we have so far.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mvdan/sh/issues/80#issuecomment-385696742, or mute the thread https://github.com/notifications/unsubscribe-auth/AHdeTqSBIFRSCORUVkCKouvGfqAJ3VKiks5tuHxGgaJpZM4M7_D3 .

mvdan commented 6 years ago

Just to clarify; the package I meant is mvdan.cc/sh/syntax - it's the package in this repository that implements shell syntax parsing and printing (formatting), the core of shfmt.

You bring up a good point about most other formatting tools having this as a feature. How exactly do they implement this? Which works best? E.g. are they a best-effort heuristic, where you simply give it a number of columns to try to stick to? Are they smart to any degree?

The reason I was suggesting a separate tool is because, to me, this sounds like "de-minifying". shfmt has a minification flag, that basically reduces the amount of bytes a shell script takes. You could say "then the opposite should belong in the same tool", but I'd argue that de-minification is a much more complex process than minification.

Minification basically destroys data, and de-minification tries to style some code in a way that a human would. So it would likely have at least a hundred lines of code just to get that to a state that would make sense to most developers.

I realise that you're asking for something simpler than that; you just want line wrapping at a certain cutoff. But I imagine that the feature would slowly move towards said de-minification, at which point there's some benefit to not having it be part of shfmt.

coriolinus commented 6 years ago

Ah, good point. I was in fact confused about which package you meant. I still haven't worked with the syntax package at all.

I have no idea how other formatting tools handle this problem; I've been a user, not a developer. I'd imagine that it has to be a best-effort heuristic which allows for failure, because it's trivial to overwhelm any line wrapper by inserting a long enough literal.

Here's how I'd attempt to approach the problem, in pseudocode:

line_index = 1
while line_index < program.num_lines()
    if line.len() > desired_line_len:
        breakpoint = sorted([
            (breakpoint_goodness(bp), bp)
            for bp in line.breakpoints()
        ])[0][1]
        if breakpoint is not None:
            line.break_at(bp)
    line_index += 1

That loop doesn't get you much without a decent breakpoint_goodness function, so here's a first draft in pseudocode:

def breakpoint_goodness(breakpoint):
    # lower goodnesses are better
    goodness = abs(breakpoint.position - desired_line_len)
    if breakpoint.previous_token in ('&&', '||', ';'):
        # we can break after these tokens without escaping a newline char, so they're nice
        goodness -= 15 # this can be tweaked as necessary for good-looking code
    return goodness

Obviously this is first draft stuff, and there's at least one spot where (if this were Python) there would be a runtime error the first time it hit a real test suite, but I'm sure you get the point. Sure, there's plenty of room to get very fancy with this and de-minify the code into something like a human would write, but we can both agree that that's out of scope for this issue.

mvdan commented 6 years ago

@coriolinus if such a feature was added in shfmt, it would likely be a very simple heuristic - I don't want the logic to be very complex. We'd have to see.

Do you have an example of such a program with very long lines that you can share?

coriolinus commented 6 years ago

#!/bin/sh
# This file was automatically generated by gen_nodes.py.
# DO NOT EDIT

NDAUHOME="/Users/coriolinus/.multinode/chaosnode_0/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_0/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_0/docker-compose.yml -p chaosnode_0 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_1/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_1/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_1/docker-compose.yml -p chaosnode_1 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_2/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_2/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_2/docker-compose.yml -p chaosnode_2 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_3/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_3/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_3/docker-compose.yml -p chaosnode_3 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_4/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_4/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_4/docker-compose.yml -p chaosnode_4 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_5/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_5/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_5/docker-compose.yml -p chaosnode_5 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_6/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_6/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_6/docker-compose.yml -p chaosnode_6 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_7/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_7/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_7/docker-compose.yml -p chaosnode_7 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_8/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_8/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_8/docker-compose.yml -p chaosnode_8 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans
NDAUHOME="/Users/coriolinus/.multinode/chaosnode_9/ndau" TMHOME="/Users/coriolinus/.multinode/chaosnode_9/tendermint" docker-compose -f /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode/docker-compose.yml -f /Users/coriolinus/.multinode/chaosnode_9/docker-compose.yml -p chaosnode_9 --project-directory /Users/coriolinus/go/src/github.com/oneiro-ndev/chaosnode down --remove-orphans

xzhub commented 6 years ago

I vote to add such option because it saves human labor to wrap each lines to achieve better code readability. Many (if not all) linter have this option. For example autopep8(https://github.com/hhatto/autopep8) has an option "--max-line-length" default to 79. Pylint, Swiftlint, TSlint and clang-format all have this option.

mvdan commented 6 years ago

I might add this as part of the new multi-command gosh tool - see #330. This is more of a utility tool that one would run a few times rather than a formatting flag used for an entire project, so it fits that program better than shfmt.

For example, we could call it gosh truncate, since we'd be truncating lines to a maximum of N bytes or characters.

wavefancy commented 5 years ago

I have a use case for this: we're using a python program at work which parametrically generates a bunch of shell scripts which collectively manage a system which can be thought of as a black box. The scripts work as advertised, but they're not exactly human-friendly: the output tends to be files which are four lines long, each line of which is 500 characters wide.

It doesn't have to be default behavior--I'd be quite happy to see it locked behind command-line flags--but it would be great if shfmt were able to at least attempt to break lines in reasonable places. It may not always be possible to stay within a strict 80 chars, but it would be great if we could at least get these scripts into a state in which viewing them in portrait orientation makes more sense than landscape.

edit: Why ask shfmt to do this? If this tool is already building an AST, it's better-positioned to know where a reasonable line break would actually be than anything simpler.

edit 2: Why not just change the Python to generate prettier code? The Python script doesn't have an AST either; it just agglomerates options and flags and variables and commands as strings, adding things front and rear, until each line is complete.

Vote a flag to have this function. Exactly the same needs as you proposed. :)

mvdan commented 4 years ago

With EditorConfig support implemented in https://github.com/mvdan/sh/issues/393, I think this feature fits more naturally into the tool, since EditorConfig already has max_line_length. The feature is still on the radar, it's just hard to get right. I'm attempting a very similar feature for Go in https://github.com/mvdan/gofumpt/pull/70, so I hope that I'll be able to reuse some of the lessons learned there.

bboysnick5 commented 2 years ago

Can we have a revisit on this. Would love to see it implemented. Thanks.

mvdan commented 2 years ago

This is an open source side project that I do for free on my spare time. If you want me to work faster, consider becoming a sponsor.

tmillr commented 1 year ago

I'd really like to see this implemented as well, even if it is provided only as experimental and/or optin behavior, and even if it does not become apart of the default behavior at all. See my discussion #992 for some of my ideas and reasoning 😄.

I don't have any experience with Go atm; perhaps someone could open up a pr to get the ball rollin?

noperator commented 2 months ago

https://github.com/noperator/sol is a CLI tool that follows the model that @mvdan originally proposed—uses mvdan.cc/sh/v3/syntax package internally to build an AST and then inserts line breaks wherever you specify.

For example, use sol -b -w 80 to format the command provided in the opening comment of this issue:

# original one-liner
curl -s http://cdn.example.com/path/to/file.tgz | tar zxf - -C /usr/src --stdout && cd /usr/src && ./configure --prefix=/usr && make install

# break on binary commands (e.g., |, &&, etc.)
$ echo 'curl -s http://cdn.example.com/path/to/file.tgz | tar zxf - -C /usr/src --stdout && cd /usr/src && ./configure --prefix=/usr && make install' | sol -b

curl -s http://cdn.example.com/path/to/file.tgz |
    tar zxf - -C /usr/src --stdout &&
    cd /usr/src &&
    ./configure --prefix=/usr &&
    make install

# same, but only break when we reach 80 characters wide
$ echo 'curl -s http://cdn.example.com/path/to/file.tgz | tar zxf - -C /usr/src --stdout && cd /usr/src && ./configure --prefix=/usr && make install' | sol -b -w 80

curl -s http://cdn.example.com/path/to/file.tgz |
    tar zxf - -C /usr/src --stdout && cd /usr/src &&
    ./configure --prefix=/usr && make install

Thanks to those who've shared interest in this problem (@bboysnick5, @coriolinus, @dnwe, @tmillr, @wavefancy, @xzhub).

mvdan / sh

proposal: add option to wrap lines to fit in N columns on a best-effort basis #80