Inconsistent formating of long method call chains

RalfJung commented 5 years ago

In the following example, the first long method call chain is kept at the outer indentation level, while the second is indented by one:

fn very_very_very_very_very_long_fun_name(x: i32) -> Vec<i32> {
    vec![x]
}

fn main() {
    let very_very_very_very_very_very_very_very_very_long_var_name = 13;
    let all = very_very_very_very_very_long_fun_name(
        very_very_very_very_very_very_very_very_very_long_var_name,
    )
    .iter()
    .map(|x| x + very_very_very_very_very_very_long_var_name);
    let more = 13;
    let other = vec![1, 2, 3]
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);
}

I think the first chain is formatted quite badly; everything that's part of let all should be indented by one to indicate that it is part of the same statement.

nrc commented 5 years ago

I turned out to be really hard to get consistently great formatting for chains. Although this is not great, I think the alternatives were worse.

RalfJung commented 5 years ago

Personally I think the following is better (and it is what I use in my projects):

fn main() {
    let very_very_very_very_very_very_very_very_very_long_var_name = 13;
    let all = very_very_very_very_very_long_fun_name(
            very_very_very_very_very_very_very_very_very_long_var_name,
        )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);
    let more = 13;
    let other = vec![1, 2, 3]
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);
}

ozkriff commented 5 years ago

p-low? This looks like a really big issue to me because it mangles not-so-rare code structures like:

let some_value = 1;
let other_value = 1;
let _a = vec![
    // Nicely structured and readable:
    StructA { test_test: 0 }
        .aaa_aaa()
        .bbb_bbb()
        .do_stuff(StructB { test_test_b: 1 })
        .ccc_ccc(),
    StructA { test_test: 0 }
        .bbb_bbb()
        .do_stuff(StructB { test_test_b: 1 })
        .aaa_aaa()
        .do_stuff(StructB { test_test_b: 1 })
        .ccc_ccc(),

    // Now we make initial struct literal a little bit longer
    // and suddenly it becomes _really hard_ to read.
    // Even with this simple demo names it's hard to see when one item ends
    // and the next one starts.
    StructA {
        test_test: some_value,
    }
    .aaa_aaa()
    .bbb_bbb()
    .do_stuff(StructB {
        test_test_b: other_value,
    })
    .ccc_ccc(),
    StructA {
        test_test: some_value,
    }
    .bbb_bbb()
    .aaa_aaa()
    .do_stuff(StructB {
        test_test_b: other_value,
    })
    .ccc_ccc(),
    StructA {
        test_test: some_value,
    }
    .do_stuff(StructB {
        test_test_b: other_value,
    })
    .aaa_aaa()
    .do_stuff(StructB {
        test_test_b: other_value,
    })
    .bbb_bbb()
    .ccc_ccc(),
];

Playground

jturner314 commented 5 years ago

I just tried running rustfmt on a project that I haven't reformatted in a while. These are some more examples showcasing this issue (copying the indentation from the original source):

            // example
            let B_fac = (Lambda_sqrt_inv.dot(test_covariance).dot(&Lambda_sqrt_inv)
                + Array2::<f64>::eye(D))
            .factorize()
            .context("failed to factor B")?;

                // example
                let R_fac = (test_covariance.dot(&Array2::from_diag(
                    &(&inv_sq_len_scales_i + &inv_sq_len_scales_j),
                )) + Array2::<f64>::eye(D))
                .factorize()
                .context("failed to factor R")?;

                // example
                let L: Array2<_> = (ki_plus_kj
                    + mahalanobis_sq_dist(&ii, &(-&ij), (R_inv.dot(test_covariance) / 2.).view()))
                .mapv_into(f64::exp);

        // example
        let dCdm: Array3<_> = (tensordot(&Cdm, &Mdm.slice(s![i, ..]), 1)
            + tensordot(&Cds, &Sdm.slice(s![i, i, ..]), 2))
        .into_dimensionality()
        .unwrap();
        let dCds: Array4<_> = (tensordot(&Cdm, &Mds.slice(s![i, .., ..]), 1)
            + tensordot(&Cds, &Sds.slice(s![i, i, .., ..]), 2))
        .into_dimensionality()
        .unwrap();

            // example
            let dSdm: Array3<_> = tensordot(
                &tensordot(&PP, &pd_S2_wrt_aug_mean, 2),
                &d_aug_mean_wrt_mean,
                1,
            )
            .into_dimensionality()
            .unwrap();
            let dSdv: Array4<_> = tensordot(
                &tensordot(&PP, &pd_S2_wrt_aug_cov, 2),
                &d_aug_cov_wrt_cov,
                2,
            )
            .into_dimensionality()
            .unwrap();
            let dCdm: Array3<_> = tensordot(
                &tensordot(&PQ, &pd_C2_wrt_aug_mean, 2),
                &d_aug_mean_wrt_mean,
                1,
            )
            .into_dimensionality()
            .unwrap();
            let dCdv: Array4<_> = tensordot(
                &tensordot(&PQ, &pd_C2_wrt_aug_cov, 2),
                &d_aug_cov_wrt_cov,
                2,
            )
            .into_dimensionality()
            .unwrap();

scampi commented 5 years ago

The examples in https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-492405284 and https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-472718887 would need some better heuristics in order to have a better formatting, but I feel like the original issue with the indented calls as shown in https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-434588413 would be good to have. What do you think @topecongiro ? We could split this issue in two: address the original issue first, then review the formatting of chain calls in a separate issue.

Below would be the expected formatting:

fn main() {
    // the var name and the chain calls are indented
    let all = very_very_very_very_very_long_fun_name(
            very_very_very_very_very_very_very_very_very_long_var_name,
        )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);
    // here no need to indent
    let all = very_very_very_very_very_long_fun_name(
        very_very_very_very_very_very_very_very_very_long_var_name,
    );
}

calebcartwright commented 4 years ago

I believe I may have found a solution to this as part of the work I've been doing with chains on #3863

Still working on it, but what I've got currently is producing the suggested output for the snippet in the OP, and I've also included the output for the other snippets shared in the other comments

https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-472718887

```rust ... fn main() { let some_value = 1; let other_value = 1; let _a = vec![ // Nicely structured and readable: StructA { test_test: 0 } .aaa_aaa() .bbb_bbb() .do_stuff(StructB { test_test_b: 1 }) .ccc_ccc(), StructA { test_test: 0 } .bbb_bbb() .do_stuff(StructB { test_test_b: 1 }) .aaa_aaa() .do_stuff(StructB { test_test_b: 1 }) .ccc_ccc(), // Now we make initial struct literal a little bit longer // and suddenly it becomes _really hard_ to read. // Even with this simple demo names it's hard to see when one item ends // and the next one starts. StructA { test_test: some_value, } .aaa_aaa() .bbb_bbb() .do_stuff(StructB { test_test_b: other_value, }) .ccc_ccc(), StructA { test_test: some_value, } .bbb_bbb() .aaa_aaa() .do_stuff(StructB { test_test_b: other_value, }) .ccc_ccc(), StructA { test_test: some_value, } .do_stuff(StructB { test_test_b: other_value, }) .aaa_aaa() .do_stuff(StructB { test_test_b: other_value, }) .bbb_bbb() .ccc_ccc(), ]; } ```

https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-492405284

Still working on these, the first few remain but the latter half can now be indented ```rust ... // example let B_fac = (Lambda_sqrt_inv.dot(test_covariance).dot(&Lambda_sqrt_inv) + Array2::::eye(D)) .factorize() .context("failed to factor B")?; // example let R_fac = (test_covariance.dot(&Array2::from_diag( &(&inv_sq_len_scales_i + &inv_sq_len_scales_j), )) + Array2::::eye(D)) .factorize() .context("failed to factor R")?; // example let L: Array2<_> = (ki_plus_kj + mahalanobis_sq_dist(&ii, &(-&ij), (R_inv.dot(test_covariance) / 2.).view())) .mapv_into(f64::exp); // example let dCdm: Array3<_> = (tensordot(&Cdm, &Mdm.slice(s![i, ..]), 1) + tensordot(&Cds, &Sdm.slice(s![i, i, ..]), 2)) .into_dimensionality() .unwrap(); let dCds: Array4<_> = (tensordot(&Cdm, &Mds.slice(s![i, .., ..]), 1) + tensordot(&Cds, &Sds.slice(s![i, i, .., ..]), 2)) .into_dimensionality() .unwrap(); // example let dSdm: Array3<_> = tensordot( &tensordot(&PP, &pd_S2_wrt_aug_mean, 2), &d_aug_mean_wrt_mean, 1, ) .into_dimensionality() .unwrap(); let dSdv: Array4<_> = tensordot( &tensordot(&PP, &pd_S2_wrt_aug_cov, 2), &d_aug_cov_wrt_cov, 2, ) .into_dimensionality() .unwrap(); let dCdm: Array3<_> = tensordot( &tensordot(&PQ, &pd_C2_wrt_aug_mean, 2), &d_aug_mean_wrt_mean, 1, ) .into_dimensionality() .unwrap(); let dCdv: Array4<_> = tensordot( &tensordot(&PQ, &pd_C2_wrt_aug_cov, 2), &d_aug_cov_wrt_cov, 2, ) .into_dimensionality() .unwrap(); } ```

AFAICT the original non-indented formatting will occur whenever the parent chain item requires multiple lines. I believe it's possible to format chains that have a multi-line parent element similarly to how single-line parent chains are formatted, though I do have a few questions.

Should the formating behavior be configurable? Should users be able to choose between the current, non indented option and indented?
Examples

    let all = very_very_very_very_very_long_fun_name(
        very_very_very_very_very_very_very_very_very_long_var_name,
    )
    .iter()
    .map(|x| x + very_very_very_very_very_very_long_var_name);

or indented like

    let all = very_very_very_very_very_long_fun_name(
            very_very_very_very_very_very_very_very_very_long_var_name,
        )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);

How/when should the parent chain item contents be indented? Should it be configurable and/or conditional dependening on what the parent chain element item is/contains?
Examples

Parent never indented

    let all = very_very_very_very_very_long_fun_name(
        very_very_very_very_very_very_very_very_very_long_var_name,
    )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);

    StructA {
        test_test: some_value,
    }
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .aaa_aaa()
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .bbb_bbb()
        .ccc_ccc()

Or parent always indented

    let all = very_very_very_very_very_long_fun_name(
            very_very_very_very_very_very_very_very_very_long_var_name,
        )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);

    StructA {
            test_test: some_value,
        }
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .aaa_aaa()
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .bbb_bbb()
        .ccc_ccc()

or a mix of the two..

What I've got currently is inspecting the parent chain item to determine whether or not to indent. For example, it won't indent a multi-line parent item if the parent chain element is a call/method call-and it contains an arg that's a struct lit, string lit, or closure, but the subsequent chain items will still be indented (just like if the parent could fit on a single line)

I did so as I found those conditions avoided mangling certain args and produced what IMO was a better result. For example, when the parent chain element includes an arg that's a closure with a large many-line body

This: ```rust fn main() { foo(|x| { // imagine this is a really long closure body // and you can't see the end and .unwrap() without scrolling // .... }) .unwrap() } ``` Would be indented to this without the conditions: ```rust fn main() { foo(|x| { // imagine this is a really long closure body // and you can't see the end and .unwrap() without scrolling // .... }) .unwrap() } ``` And I found the double indentation to be confusing at first glance.

The conditional parent element indentation results in this:

fn main() {
    let all = very_very_very_very_very_long_fun_name(
            very_very_very_very_very_very_very_very_very_long_var_name,
        )
        .iter()
        .map(|x| x + very_very_very_very_very_very_long_var_name);

    StructA {
        test_test: some_value,
    }
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .aaa_aaa()
        .do_stuff(StructB {
            test_test_b: other_value,
        })
        .bbb_bbb()
        .ccc_ccc()

    foo(|x| {
        // ....
    })
        .bar()
        .baz()
        .unwrap()
}

Does that look/sound reasonable?

RalfJung commented 4 years ago

@calebcartwright thanks a lot for looking into this!

I mostly agree with you, but I think this looks odd:

    StructA {
        test_test: some_value,
    }
        .do_stuff(StructB {
            test_test_b: other_value,
        })

The closing curly brace of the StructA constructor is kind of sitting in a weird place here. IMO this looks better:

    StructA {
            test_test: some_value,
        }
        .do_stuff(StructB {
            test_test_b: other_value,
        })

scampi commented 4 years ago

@RalfJung I might be the oddball but I like the current formatting... It clearly separates the field of the struct from the method call, especially if you have many fields/calls.

calebcartwright commented 4 years ago

What I've been working on leverages a new config option to support a few different flavors of alignment (I've left the default as the current formatting) to provide flexibility.

Hope to have the proposed solution finished in the next couple days for review

fgimian commented 1 year ago

Hey there folks, I hope you're all doing well. This is causing quite poor readability on my codebase which heavily uses the builder pattern. Is there any workaround as of today to disable this behaviour in rustfmt?

Thanks heaps Fotis

madonuko commented 2 months ago

I think the current chain formatting algorithm is very very misleading:

pub fn list() -> Res<impl Iterator<Item = Self>> {
    Ok(std::io::BufReader::new(
        std::fs::File::open(PathBuf::from("/proc/mounts"))
            .map_err(|e| crate::LsblkError::ReadFile("/proc/mounts".into(), e))?,
    )
    .lines()
    .map_while(Result::ok)
    .filter_map(|l| {
        let mut parts = l.trim_end_matches(" 0 0").split(' ');
        Some(Self {
            device: parts.next()?.into(),
            mountpoint: parts.next()?.into(),
            fstype: parts.next()?.into(),
            mountopts: parts.next()?.into(),
        })
    }))
}

This leads people (some 4 to 5 people on the Discord community) to believe .lines() is a method of Ok(...). This issue should straight up be renamed to "Misleading formatting of method call chains".

Anyway, I feel like in https://github.com/rust-lang/rustfmt/issues/3157#issuecomment-549051713, both styles solve the issue, so maybe rustfmt could include both and allow tweaking it in the config file? Personally though I prefer the first one because the first ending } indicates where the "main" expression ends before the chain while the second one isn't as clear.

rust-lang / rustfmt

Inconsistent formating of long method call chains #3157