Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

lld/mac mislinks binaries (protoc, chromium, ...) on arm with --deduplicate-literals #49760

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR50791
Status NEW
Importance P enhancement
Reported by Nico Weber (nicolasweber@gmx.de)
Reported on 2021-06-21 18:55:05 -0700
Last modified on 2021-06-23 19:13:11 -0700
Version unspecified
Hardware PC All
CC gkm@fb.com, jezreel@gmail.com, llvm-bugs@lists.llvm.org, smeenai@fb.com
Fixed by commit(s)
Attachments csd.proto (52585 bytes, text/x-csrc)
cstring-alignment.s (771 bytes, text/plain)
Blocks
Blocked by
See also
Repro:
https://drive.google.com/file/d/1RJe4CRLk3_ULWolz4xZApy6XIJvxMl1R/view?usp=sharing

Link using the response file, then run like so:

./protoc csd.proto --cpp_out .

Then look at csd.pb.h in the output. Line 45 will say

  static const ::r_PROTO_NAMESPACE_ID::internal::ParseTableField entries[]

There's a \0 character in the middle of the line, and the line's all wrong.

Remove --deduplicate-literals from that line, link again, and that line looks
like

  static const ::PROTOBUF_NAMESPACE_ID::internal::ParseTableField entries[]

like it's supposed to do.
Quuxplusone commented 3 years ago

Attached csd.proto (52585 bytes, text/x-csrc): csd.proto

Quuxplusone commented 3 years ago

I haven't debugged this, but my guess is that this might be due to the alignment thing.

If so, maybe it'd be good if we did the MinAlign thing for cstring subsections too -- keep track of the __cstring section alignment, compute MinAlign(sectionAlign, stringOffset) for each string, and dedup towards larger alignment.

This will require either some surgery to StringTableBuilder, or moving off of it.

Jez, want to take a look at this?

Quuxplusone commented 3 years ago

Yup, I'll take a look at it

Quuxplusone commented 3 years ago

Extending this[1] alignment hack to arm64 fixes the problem, so it's definitely an alignment issue.

https://github.com/llvm/llvm-project/blob/main/lld/MachO/SyntheticSections.cpp#L1180

If so, maybe it'd be good if we did the MinAlign thing for cstring subsections too -- keep track of the __cstring section alignment, compute MinAlign(sectionAlign, stringOffset) for each string, and dedup towards larger alignment.

Yeah, I guess we should do that.

Quuxplusone commented 3 years ago

Attached cstring-alignment.s (771 bytes, text/plain): Test case

Quuxplusone commented 3 years ago

I have no strong preference here, so feel free to ignore, but abstractly I would expect that sorting by alignment has close to no effect on output size. So the least surprising behavior to me would be to keep strings in the order they are in the input and just omit dupes that appear later and slide strings after omitted strings left until the hole is gone (but leave enough of a hole so that the slid string is on MinAlign(sectionAlign, stringInputAlign)).

Quuxplusone commented 3 years ago

Well the sorting was so that dedup'ing toward the larger alignment would be easier to implement.

However, that scheme doesn't seem right. I am running into segfaults while running LLD-linked cppcheck. Input file order doesn't seem to be the right thing either. I'm a bit stumped as to how ld64 is handling this...

Quuxplusone commented 3 years ago

I am tempted to extend the 16-byte-alignment-hack to arm64 as well and just leave it that way for now. We should find a proper fix eventually, but this doesn't seem super high-pri.

Will have another look later tonight before deciding.

Quuxplusone commented 3 years ago

https://drive.google.com/file/d/1vH1Loe_y3AgaXvx1X-pH0Xf2n7ruryqQ/view?usp=sharing

This is the cppcheck binary that demonstrates the issue. Patching https://reviews.llvm.org/D104833, linking the repro, and running the testrunner binary will cause a segfault due to a misaligned string access.