Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

clang-format misformats struct pointer member refs after objc method calls #46876

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR47907
Status NEW
Importance P enhancement
Reported by Elly Fong-Jones (ellyjones@chromium.org)
Reported on 2020-10-19 14:26:11 -0700
Last modified on 2020-10-21 13:15:30 -0700
Version unspecified
Hardware PC All
CC bhamiltoncx@gmail.com, djasper@google.com, klimek@google.com, llvm-bugs@lists.llvm.org, nicolasweber@gmx.de
Fixed by commit(s)
Attachments test.mm (42 bytes, text/plain)
Blocks
Blocked by
See also
Created attachment 24078
repro case

When formatting the following code snippet:

  int main() {
    foo([obj method], a->b);
  }

clang-format incorrectly inserts spaces around the arrow:

$ cat test.mm
int main() {
  foo([obj method], a->b);
}

$ clang-format --version
clang-format version 11.0.0 (https://github.com/llvm/llvm-project
eb85e90350e93a64279139e7eca9ca40c8fbf5eb)
$ clang-format test.mm
int main() {
  foo([obj method], a -> b);
}

The test file is attached as well.
Quuxplusone commented 3 years ago

Attached test.mm (42 bytes, text/plain): repro case

Quuxplusone commented 3 years ago

Passing -debug and comparing to int main() { foo(obj, a->b); } we get

M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=43 Name=identifier L=13 PPK=2 FakeLParens= FakeRParens=2 II=0xa73768 Text='b'

here but

M=0 C=1 T=LambdaArrow S=1 F=0 B=0 BK=0 P=150 Name=arrow L=22 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='->'

in the simpler thing.

[...], a->b does look a bit like a lambda.

Looks like TT_LambdaArrow is assigned in UnwrappedLineParser::tryToParseLambda() in clang/lib/Format/UnwrappedLineParser.cpp.

That function basically skips a long list of tokens until it hits a {. Maybe it should keep track of if it's currently in a [...], a <...>, a (....), and if it sees unexpected tokens outside these parens, bail out.

There's also

case tok::arrow:
  // This might or might not actually be a lambda arrow (this could be an
  // ObjC method invocation followed by a dereferencing arrow). We might
  // reset this back to TT_Unknown in TokenAnnotator.
  FormatTok->setType(TT_LambdaArrow);
  SeenArrow = true;
  nextToken();
  break;

which was added in https://reviews.llvm.org/D57923 but in this case we should be able to not set the wrong token type in the first place.

Ben, would you be interested in looking at this?

Quuxplusone commented 3 years ago

Yeah, my workaround in D57923 was clearly too specific and we need to re-do the parser for lambda arrows entirely.