rust-lang / rustfmt

Format Rust code
https://rust-lang.github.io/rustfmt/
Apache License 2.0
6.03k stars 889 forks source link

Limit chain length by number of calls as well as width #2263

Open vorner opened 6 years ago

vorner commented 6 years ago

Hello

I like to split my chains when they seem „complex enough“. That doesn't necessarily mean they are long in number of characters, but number of calls is a good indicator.

Eg. this still fits the width, but it's already hard to read due to too many chained calls on the same line:

self.0.take_action(Log).result().map_err(|()| LoggerError)

However, when split into multiple lines, it is easier, because the lines naturally invoke sequential thinking about what happens next.

Is it possible to have a complexity heuristic in addition to width heuristic for small items? And consider it large if it fails either?

BurntSushi commented 6 years ago

I've run into this sort of thing as well. Instead of heuristics, is it possible to permit the human to decide here? That is, if a chain is formatted across multiple lines by a human, then rustfmt could opt not to collapse it into a single line (if it otherwise would). I don't know if this sort of thing is a viable strategy though!

nrc commented 6 years ago

if a chain is formatted across multiple lines by a human, then rustfmt could opt not to collapse it into a single line (if it otherwise would)

We've tried to avoid letting the formatting of the original file affect the output formatting. We found that it led to lots of problems - it can lead to non-idempotence of output and if the input is not formatted at all then it can lead to ugly formatting

oxalica commented 5 years ago

What's the status of this issue?

I'm writing a HTTP client using chains, and rustfmt collapses the operations into one single line, which make it not clear enough.

I think at least it need to be configurable. (something like max_chain_calls_single_line)

Before:

pub fn delete_upload_session(&self, sess: &UploadSession) -> Result<()> {
    self.client
        .delete(&sess.upload_url)
        .send()?
        .parse_empty()
}

After (current version):

pub fn delete_upload_session(&self, sess: &UploadSession) -> Result<()> {
    self.client.delete(&sess.upload_url).send()?.parse_empty()
}
iopq commented 5 years ago

Mine is also very simple:

    let n: i32 = std::env::args()
        .nth(1)
        .map(parse)
        .unwrap_or(Ok(100))?;

I think I'd go with "third call is too much"

synek317 commented 5 years ago

Any news regarding this issue? Does it wait for a decision or just an implemention? Or maybe someone already works on it?

zskamljic commented 4 years ago

I am really bothered by this, assuming the input is:

let contents = fs::read_to_string(config.filename)
    .expect("Something went wrong reading the file");

or even:

let contents = fs::read_to_string(config.filename).expect("Something went wrong reading the file");

the lines would be formatted as following:

let contents =
    fs::read_to_string(config.filename).expect("Something went wrong reading the file");
lnshi commented 2 years ago

Eh, this is 2017 issue... I just submitted a new one: https://github.com/rust-lang/rustfmt/issues/5106, due to this i am really struggling adopting this tool.

May I know what are the special considerations that rustfmt cannot follow what gofmt had done: leave the line break and deal with blank line and indention only?

ssokolow commented 2 years ago

May I know what are the special considerations that rustfmt cannot follow what gofmt had done: leave the line break and deal with blank line and indention only?

@lnshi From what I remember, rustfmt parses the code into an Abstract Syntax Tree, not a Concrete Syntax Tree (possibly even reusing the compiler's parser), so information about the presence or absence of whitespace between tokens (i.e. outside strings and comments) is discarded during the parsing phase.

lnshi commented 2 years ago

@ssokolow so the implication is that there IS NOT an easy fix for this, coz it requires a fundamental change, or even a totally new project?

can someone pls comment? @topecongiro @nrc

lnshi commented 2 years ago

and even be able to limit the chain length still cannot fix the cases i mentioned here: https://github.com/rust-lang/rustfmt/issues/5106, developer groups the values semantically in different lines, or group the calls semantically in different lines.

let m = HashMap::from([
  (a, a*), (a1, a*),
  (b, b*), (b1, b*), (b2, b*)
]);

obj.some_function()
  .a().a1().a2()
  .b()
  .c().c1()
calebcartwright commented 2 years ago

Going to weigh in as I feel like this is starting to veer off topic and/or to duplicative discussions.

Fundamentally, there's a decent size portion of our user base that wants more control over how chains are formatted. That's well understood, and is something that's been shared/reported across various mediums, and something we've already conveyed we intend to research (e.g. #4306). There's a few different knobs/toggles folks have requested/suggested, but it's important to not assume any such request/suggestion, including this one, is mutually exclusive with any other request/suggestion.

This issue was specifically about a request for a config option that's based on the number of chain items as opposed to just the existing max_width/chain_width options. As such, I'd ask that future discussion on this issue be kept focused to the request of the issue, and not conflated with discussion of other requests; even if this option were to come to fruition, it would not inherently block/inhibit any other chain-related requests.

As for this particular issue, I'm maintaining the priority as "low", just as it's been labeled for last ~3.5 years. That's because I view it as a different side of the same coin of the current underlying issue where you'd end up with significantly shorter chains getting wrapped with longer ones being collapsed (exaggerated for visualization):

abcdef
    .g
    .h

foooooooooooooooo(123456).baaaar()

Chain formatting is hard, likely the most complex construct, and given the rather limited bandwidth of the rustfmt team I want to make sure that anything we directly invest around chain formatting optionality goes towards the best/most useful option. I don't think this particular option fits that bill, hence the low priority.

As I've outlined in other chain related issues, I do think there's some routes we can and should explore that would support/utilize the relative chain item position of the input to influence the resultant formatting, and thus provide some form of preservation of the association written by the developer. It's just that it's a fairly hard problem, particularly with idempotency as Nick noted years ago, especially with the typical < 30 minutes per day I'm able to carve out for open source.

If you're interested in getting updates on the specific option requested in this issue, be sure to subscribe to it. Or if you're very keen on seeing the specific option requested in this issue, consider submitting a PR. If you're interested in updates on the other option(s) I've described, be sure to subscribe to those respective issues (e.g. #4306). Updates will be posted as and when there are updates to share. An absence of posted updates means that there are no updates to share, so there's no reason, nor value, in "update request" types of comments.

Finally,

May I know what are the special considerations that rustfmt cannot follow what gofmt had done: leave the line break and deal with blank line and indention only? so the implication is that there IS NOT an easy fix for this, coz it requires a fundamental change, or even a totally new project?

This is both been discussed at length previously (on GitHub issues, forums, etc.) and is off topic for this issue regardless. All I'll repeat here is that Rust is not Go, and rustfmt is not gofmt. There's generally two styles of automated codeformatters, and all the major formatters tend to fall into one of those two styles (rustfmt and gofmt are not in the same style). Each style has its set of tradeoffs, and different users have a different preference between those two styles and tradeoffs, but rustfmt can't just "be more like gofmt" any more than gofmt could just "be more like rustfmt"

From what I remember, rustfmt parses the code into an Abstract Syntax Tree, not a Concrete Syntax Tree (possibly even reusing the compiler's parser), so information about the presence or absence of whitespace between tokens (i.e. outside strings and comments) is discarded during the parsing phase.

Correct, rustfmt asks rustc's internal parser module to give rustfmt the AST of the input program, and rustfmt then works with that AST. However, I remain cautiously optimistic about our ability to add some kind of new option since the spans of the relevant AST nodes for chains and backing source map do contain the necessary information.

ragnese commented 2 years ago

@calebcartwright

Respectfully, I think your example shows that you are "stuck" in the mindset of character length.

abcdef
    .g
    .h

foooooooooooooooo(123456).baaaar()

As I understand it, you wrote this example as a kind of "counter example" to show why limiting by number of calls and/or character length is hard.

But it's not any kind of counter-example for me- and I suspect others requesting this feature will agree. The code you wrote above is formatted perfectly. It's exactly what I would want. The reason we are requesting knobs to chop down chains based on number of calls is precisely because we don't care about the character length.

I can't help but think that maybe you and/or other devs are over thinking this feature. I can't speak for anyone else, but I would be absolutely thrilled if there were two options: max-number-of-calls-in-chain and max-character-length-of-chain where max-number-of-calls-in-chain is applied first.

I know that others in the comments have suggested preserving intentional whitespace, which does seem like a different feature request than this one or my recently closed one.

I think a KISS approach I described above would satisfy a lot of people.

calebcartwright commented 2 years ago

@ragnese - Your comment gives me the impression you didn't actually thoroughly read what I said, nor the issues linked within.

To recap what's already been stated above:

If you feel strongly enough about this particular feature then I'd suggest you consider submitting a Pull Request. However, I will ask that you be mindful of your choice of words and try to keep the dialog/tone constructive should you decide to do so and/or continue commenting on this thread. Passively insulting me/other devs because you disagree with our explanation of priority isn't terribly productive nor will it result in us prioritizing your preferred feature.

ragnese commented 2 years ago

@calebcartwright

I didn't intend to come across as rude or insulting. Nor did I actually disagree with your explanation of the priority of this issue or other chain-related issues, so I'm not sure where that part of the response came from. I do understand how my wording could come across as critical, but it wasn't intended to criticize anybody's character or intellect. (I don't consider the suggestion that someone might be stuck in a certain mindset on a technical matter to be character assassination, but that could just be some social ineptness on my part. If this is indeed the case, then I sincerely apologize.)

In any case, most of my point was to engage with the example you supplied. You're quite correct that I haven't recently read through three years' comments on related issues, and I'm happy to admit that I'm quite ignorant of the whole project. And, perhaps I misunderstood your wording in the previous comment, but when I read this:

That's because I view it as a different side of the same coin of the current underlying issue where you'd end up with significantly shorter chains getting wrapped with longer ones being collapsed (exaggerated for visualization):

it sounded to me as though the following example was intended to be an example of bad formatting. Since that formatting is not undesirable to me as someone who had a similar/duplicate issue, I thought it was worth letting the relevant people know that.

calebcartwright commented 2 years ago

then I sincerely apologize. You're quite correct that I haven't recently read through three years' comments on related issues, and I'm happy to admit that I'm quite ignorant of the whole project

No worries, and no harm done. I don't want to dwell on this and take the thread any more off topic, however, I will note that in such a situation it would probably be best to first gather that broader context or at least ask clarifying questions.

I don't think it's fair to start with a critique of people especially without having relevant context, and especially when critiquing in that style. In my experience comments that have to open with the likes of "respectfully", "with all due respect", "no offense but", etc. are almost always followed by (or at least often reasonably interpreted as) something quite the opposite.

it sounded to me as though the following example was intended to be an example of bad formatting

Nope, and I think you may have focused in too much on the specific contents of that one snippet and in doing so are missing the overall point. As noted previously this has all been discussed at length in various places but I'll recap again here.

I love the first sentence in the description of this issue because it quite succinctly captures the fundamental problem at the (original) focus of this issue, which also happens to be the main overall challenge a portion of our user base has with chains:

I like to split my chains when they seem „complex enough“.

The Rust Style Guide, and thus rustfmt, apply the formatting rules deterministically and categorically, including whether chains are wrapped or flattened/unwrapped. That applies to both default and non-default values for configuration options. However, this doesn't work for many developers because whether or not a chain looks better wrapped/flattened to them "depends"; it's highly subjective, non-deterministic, and often context dependent based on the specifics of the particular chain and where they are writing it.

That's the biggest and broadest challenge our users have reported with chain formatting, and that is where the rustfmt team is focused. As a potential solution to that problem, some users have suggested being able to control the wrapping/flattening based on the number of elements in the chain. Perhaps a smaller number, including yourself from the sound of it, specifically want that feature regardless of the problem statement.

As I said above, we think that by-number-of-elements proposal is not a good solution to the underlying problem, and is thus not a priority. That's because it's still top-level categorical option lacking local context, and we'd anticipate that most folks that would take advantage of that hypothetical option would still run into the same underlying problem where they liked/accepted it for some chains but found it unacceptable for other chains (it may just be different chain occurrences than they had with width based)

With code formatting considerations it's important to think about small/short examples in the broader set of possibilities/potential occurrences. My previous example snippet that you've zoomed in on was shown as an example that the underlying subjectivity and contextual problem still exists; it's not about whether you or I like/dislike those specific two chains at that specific indentation level.

The real gist is that any categorical option (including by-number-of-chain-elements) will still cause plenty of cases where, to use the parlance of the issue description, "complex enough" chains the developer would subjectively prefer be wrapped across multiple lines would still get flattened to a single line, while conversely "simple" chains would still get multilined. That "simple/complex enough" perception is in the eye of the beholder regardless of whether it stems from chains consisting of nested tuple/field access, embedded block style comments, calls with more complex args (e.g. nested chains, closures, large struct lits, etc.), and my example snippet was not an attempt to enumerate a list that exhaustively covered that range subjectivity.

So it's not that we're "stuck" on anything, it's just that we're focused on providing the right solution for the problem at hand.

marziply commented 1 year ago

Just checking this issue hasn't been forgotten about... What's the status of this?

ytmimi commented 1 year ago

This is on hold for now. See the discussion in the closed PR (https://github.com/rust-lang/rustfmt/pull/5514) for more details. I believe we first want to address https://github.com/rust-lang/rustfmt/issues/4306 before revisiting this as a potential formatting option.

DavidAntliff commented 1 year ago

Probably a naive and simplistic suggestion, but could a directive, prior to the chain, hint to rustfmt the desired behaviour?

Something like:

    #[rustfmt::chain("vertical")]
    obj
        .some_function()
        .a()
        .b()
        .c()

I've used #[rustfmt::skip] to preserve vertical chains as-written, but this is a bit of a sledgehammer, as for long, complex chains with embedded closures, it's nice to let rustfmt still format the sub-sections.

EDIT: never mind, I see from the linked issues that the situation with chains is quite complicated and nuanced, and my suggestion would only work for one (albeit somewhat major) use case.

hecatia-elegua commented 1 year ago

Just a very quick question: is there a rustfmt.toml option to at least turn off chain formatting completely? If not, would it be reasonably easy to implement? I would do it with a tiny bit of guidance.

ytmimi commented 1 year ago

@hecatia-elegua currently there is no option to specifically turn off chain formatting. #[rustfmt::skip] is your best bet.