psf / black

The uncompromising Python code formatter
https://black.readthedocs.io/en/stable/
MIT License
38.71k stars 2.43k forks source link

Reflows of inline comments can cause misalignment #379

Open quodlibetor opened 6 years ago

quodlibetor commented 6 years ago

Operating system: macOS Python version: 3.6 Black version: 18.6b4 Does also happen on master: yes

edit: moved original first example to end because it's reasonable behavior

With this (example 2), black moves everything down one so it is aligned incorrectly:

# original
revert("!" +
       one(2) +         # extension
       three(4) +       # length
       one(flapping) +  # flapping info present
       two(duration) +  # duration
       two(0))

becomes:

revert(
    "!"
    + one(2)
    + three(4)  # extension
    + one(flapping)  # length
    + two(duration)  # flapping info present
    + two(0)  # duration
)

example 1 (reasonable behavior, original text preserved):

# original
a = [1,
     2, # very important
     3, # also important
     4, # less important, but still worth reading
     5] # this is just for the line length

Black moves the last comment to a weird place, while handling most other lines correctly:

# formatted
a = [
    1,
    2,  # very important
    3,  # also important
    4,  # less important, but still worth reading
    5,
]  # this is just for the line length
carljm commented 6 years ago

Thanks for the report!

My feeling is that your first example is "working as designed" and probably can't be reasonably fixed. The last comment lives outside the brackets, and it's reasonable for Black to keep it outside the brackets; how is Black supposed to know that it "goes with" the argument on that line rather than being a comment on the whole bracketed expression? It's not hard to move after reformatting, and then Black will keep it there.

The second example might be fixable, I'm not sure without looking into it more closely.

quodlibetor commented 6 years ago

Thanks that's totally reasonable, I didn't consider that POV but I'm fine with it.

The second example seems like a genuine bug (not an enhancement), though, whether or not it's fixable.

carljm commented 6 years ago

Judgment call I guess; I call it an enhancement because all comment placement is kind of best-effort by heuristic, there will never be a solution that everyone finds optimal for all cases. Even in that case it's easy to see what Black is doing: it's keeping the comment on the same line as the same point in the AST. So the comment # extension occurs in the middle of + three(4) in the original code, and ends up on the same line as that same spot in the reformatted code. It may be possible to improve the heuristic to do a bit better in this case (e.g. privilege association with name nodes in the AST vs operators, or something), but it'll need some care to see that we don't make other scenarios worse by doing so. A bug is where Black is clearly not behaving according to documented or promised behavior (e.g. not preserving AST equivalence), not where some formatting is just less-than-ideal.

quodlibetor commented 6 years ago

Ah, yeah that makes a lot of sense. In my head this was just an off-by-one error, but operating at the level of the ast obviously makes that a bunch more complicated.

Thanks for the detailed response!

blaiseli commented 6 years ago

I think I'm observing something related:

Original:

ALI2READ_INFO = attrgetter(
    "query_name",
    "query_sequence",
    "query_qualities",  # or qual ?
    "query_length")

Reformatted:

ALI2READ_INFO = attrgetter(
    "query_name", "query_sequence", "query_qualities", "query_length"  # or qual ?
)

The # or qual ? comment is not placed in a relevant manner any more.

zsol commented 6 years ago

I think it's impossible for black to distinguish between these two:

a = [
    1,
    2]  # the list `a` is important
a = [
    1,
    2]  # 2 is important

So it's unlikely that the formatted output will satisfy readers of both code snippets. So now we're talking about trading one group of people's confusion to another's unfortunately. I think putting the comment in the right place to begin with is the right call here (i.e. after 2, not after the list)

@blaiseli your example is more similar to what was brought up here and as @ambv said there, is intentional.

blaiseli commented 6 years ago

@zsol Indeed, it is more similar. I haven't understood the reason behind the intention, though.

zsol commented 6 years ago

Because the black code style follows the "if it fits in one line, it should be one line" rule. This takes priority over comment placement.

If it didn't fit, black would've split each argument into its own line, and then it has a chance to preserve the comment location (based on where it was in the original AST).

quodlibetor commented 6 years ago

@zsol I'm more concerned about example 2 which does not have anything to do with the closing delimiter. Your comment suggests that you closed this because of the other example, which I agree is impossible for black to disambiguate.

# original
revert("!" +
       one(2) +         # extension
       three(4) +       # length
       one(flapping) +  # flapping info present
       two(duration) +  # duration
       two(0))

becomes:

revert(
    "!"
    + one(2)
    + three(4)  # extension
    + one(flapping)  # length
    + two(duration)  # flapping info present
    + two(0)  # duration
)
zsol commented 6 years ago

Ah you're right, I got confused because example 2 is the first one now :) I concur with @carljm for this one

max-sixty commented 6 years ago

This is somewhat tangential, but I wanted to solicit whether this was expected behavior and if so whether I should start a new issue.

Without a comment, formatted with black (-l 79)

    return (
        df_day.assign(date=date, fund_ticker=ticker)
        .pipe(lambda x: x.loc[x["Shares"].notnull()])
        .assign(Shares=df_day["Shares"].astype("float64"))
    )

Adding a comment, before formatting

    return (
        df_day.assign(date=date, fund_ticker=ticker)
        .pipe(lambda x: x.loc[x["Shares"].notnull()])
        # arrives as int sometimes
        .assign(Shares=df_day["Shares"].astype("float64"))
    )

With a comment, after formatting with black:

    return (
        df_day.assign(date=date, fund_ticker=ticker).pipe(
            lambda x: x.loc[x["Shares"].notnull()]
        )
        # arrives as int sometimes
        .assign(Shares=df_day["Shares"].astype("float64"))
    )

Without thinking about the more general problem, the second example here is ideal - the comment can be within a fluent expression and wouldn't change the fluent expression formatting.

Thanks again for such an awesome library and vision!

ambv commented 6 years ago

@max-sixty, this should be a separate issue. This is weird and should still be fluent. Looks like a bug to me on first read.

arjennienhuis commented 5 years ago

I find this a inconvenient too:

ports = [
    80,
    8080,  # for testing only
    443,
]

becomes:

ports = [80, 8080, 443]  # for testing only
cpennington commented 5 years ago

This issue, and https://github.com/ambv/black/issues/195 could both be fixed by having black pin the comment to the expression at the beginning of the line that the comment was on, rather than the end of the expression. That seems like it could present other pathological cases, but none are jumping to mind at the moment.

ambv commented 4 years ago

Note to self: the problem to solve here is the example in https://github.com/psf/black/issues/379#issuecomment-401044473.

ambv commented 4 years ago

Note to self: type comments have special handling now to make them more sticky. @graingert suggests we use this logic for all comments.

JelleZijlstra commented 1 year ago

I am closing a number of issues as duplicates because they all share the common theme of Black moving comments to unfortunate places. Any solution to this issue should consider the cases mentioned in the duplicates and look for ways to improve them.