Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

lld/mac doesn't dedupe CFSTR()s #50906

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR51939
Status NEW
Importance P enhancement
Reported by Nico Weber (nicolasweber@gmx.de)
Reported on 2021-09-22 08:39:27 -0700
Last modified on 2021-09-22 14:11:40 -0700
Version unspecified
Hardware PC All
CC gkm@fb.com, jezreel@gmail.com, llvm-bugs@lists.llvm.org, smeenai@fb.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
% cat cf1.cc
#include <CoreFoundation/CoreFoundation.h>
void f();
int main() {
  f();
  CFShow(CFSTR("Hello"));
}

% cat cf2.cc
#include <CoreFoundation/CoreFoundation.h>

void f() {
  CFShow(CFSTR("Hello"));
}

% clang -c cf1.cc cf2.cc

CFSTR() gets transformed into __builtin___CFStringMakeConstantString(), which
gets compiled into something like:

    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "Hello"

    .section    __DATA,__cfstring
    .p2align    3               ## @_unnamed_cfstring_
L__unnamed_cfstring_:
    .quad   ___CFConstantStringClassReference
    .long   1992                    ## 0x7c8
    .space  4
    .quad   L_.str
    .quad   5                       ## 0x5

lld keeps two copies of that (it manages to dedup the "Hello" C string, but not
the CFStr):

% out/gn/bin/clang cf1.o cf2.o -framework CoreFoundation -fuse-ld=lld -Wl,--
deduplicate-literals -Wl,--icf=all
% xxd a.out | grep c807
00002010: c807 0000 0000 0000 ea06 0000 0100 0000  ................
00002030: c807 0000 0000 0000 ea06 0000 0100 0000  ................

ld64 gets it right:

% out/gn/bin/clang cf1.o cf2.o -framework CoreFoundation
% xxd a.out | grep c807
00004010: c807 0000 0000 0000 a63f 0000 0100 0000  .........?......

I vaguely remember Jez mentioning somewhere that he's aware of this (but I
might be making this up, not sure), but I didn't find an existing bug for it.
Quuxplusone commented 3 years ago

(Noticed in https://crbug.com/1251763 -- in the end, turned out to be incorrect code, but incorrect code that happens to never fail with ld64. Also, this is a binary size opportunity.)

Quuxplusone commented 3 years ago

Hmm in theory https://reviews.llvm.org/D105045 should have fixed this, but I guess something's missing...

Quuxplusone commented 3 years ago

Oh, I remember now. D106213 is needed too for the dedup to work here, since right now ICF is hashing the addresses of two different cstrings when trying to dedup the CFStrings.