rust-lang / rustfmt

Format Rust code
https://rust-lang.github.io/rustfmt/
Apache License 2.0
6.03k stars 889 forks source link

Rustfmt silently removes comments after enum/struct field #5464

Open martin-t opened 2 years ago

martin-t commented 2 years ago

Examples:

enum Enum{
    Variant { field /*comment*/ : i32}
}

struct S {
    field /* comment */ : i32
}

Both comments get removed without any warning.

Since this seems to be a recurring issue (#5297, #4708, #2781, https://github.com/rust-lang/rust/issues/52444), would it be possible to find some (temporary) middle ground, such as emitting warnings? Can rustfmt get a list of all comments in the code before and after formatting to compare if any went missing?

ede1998 commented 1 year ago

Stumbled upon a similar issue:

❯ rustfmt --version
rustfmt 1.5.1-stable (897e375 2022-11-02)

❯ rustfmt --emit stdout <( echo 'fn test() -> (/*ads*/) {}' )
/dev/fd/63:
fn test() -> () {}
martin-t commented 1 year ago

Ran into another one of these:

trait Trait<T> // comment
where
    T: Clone,
{
}

Here, "comment" is also removed. This is not even a weird place to put a comment. I had a FIXME there which was important and after formatting my code it was gone. I noticed only by luck.

ytmimi commented 1 year ago

Ran into another one of these:

trait Trait<T> // comment
where
   T: Clone,
{
}

@martin-t I'm pretty sure this is a duplicate of #4649

NishantJoshi00 commented 3 weeks ago

Hey @ytmimi , is there any development being done fixing this issue?

ytmimi commented 3 weeks ago

@NishantJoshi00 I don't think anyone has picked this up. If you're interested in working on it you can comment @rustbot claim to assign yourself.

NishantJoshi00 commented 3 weeks ago

@rustbot claim

NishantJoshi00 commented 3 weeks ago

Hey @ytmimi,

I've been going through the codebase, familiarizing myself with how the data flows through the system. Along the way, these are some of the observations that I made:

  1. While itemizing named fields, the key and the value aren't stored separately but instead stored as a single string. While this does keep the whole [ListItem] fairly simple, it makes it harder to deal with anything that might exist in between.
  2. This issue occurs when the [ListItem] is constructed. This happens in the [Iterator] implementation of the [ListItems] struct. (This is what I am assuming, correct me if I'm wrong)

Now, looking at how we can solve this, a few ideas come to mind:

  1. We can consider changing the type definition of RewriteResult to something like: pub(crate) type RewriteResult<T = String> = Result<T, RewriteError>; This, while not affecting most of the code, could help solve the issue by having more explicit types.

    However, this comes with its own set of shortcomings: a. This type information will most likely leak into the ListItem and ListItems structs. b. Although we can have defaults for this, this level of information might not be needed in all cases. I might need to dig a bit deeper here, but this is one of the approaches I could think of.

  2. Another way to solve this would be to treat the key and value as separate items. This might cause some level of code restructuring, but this would allow us to address the issue. While not having any changes done to the data models. (Probably)
  3. Just to question the fundamentals: do we even intend to encourage inline comments between an item of format key and value? If not, then this issue boils down to more of a linting issue than a formatting issue.

I'd love to get your thoughts on this before starting any substantial code changes.

ytmimi commented 3 weeks ago
  1. We can consider changing the type definition of RewriteResult to something like: pub(crate) type RewriteResult<T = String> = Result<T, RewriteError>; This, while not affecting most of the code, could help solve the issue by having more explicit types.

Interesting idea, but I don't think we need to make such a change to tackle this issue. Could be interesting as a standalone PR.


  1. Just to question the fundamentals: do we even intend to encourage inline comments between an item of format key and value? If not, then this issue boils down to more of a linting issue than a formatting issue.

I agree that it's an odd place for a comment. I don't think we want to encourage it, but we can definitely do better than silently removing the comment. One thing we could try is simply not rewriting the struct field or enum variant if a comment would be removed using recover_comment_removed. That's what we currently do for statements and expressions. Here's the code for expressions:

https://github.com/rust-lang/rustfmt/blob/a2625bfa4b9950b496f9869b7b4b0c3e43c49022/src/expr.rs#L426-L427


  1. Another way to solve this would be to treat the key and value as separate items. This might cause some level of code restructuring, but this would allow us to address the issue. While not having any changes done to the data models. (Probably)

An alternative that I can think of is to make changes to rewrite_struct_field_prefix to try and recover the comment since that's where it's getting lost.

https://github.com/rust-lang/rustfmt/blob/a2625bfa4b9950b496f9869b7b4b0c3e43c49022/src/items.rs#L1914-L1929

One wrinkle I can think of is how should we format line comments and multi-line block comments?

enum Enum{
    Variant1 { field /* multi-
                     * line
                     * comment*/ : i32},
    Variant2 { field // A line comment
                    : i32}
}