llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Finish / clean up MapFile implementation #50033

Open int3 opened 3 years ago

int3 commented 3 years ago
Bugzilla Link 50689
Version unspecified
OS All
CC @gkmhub,@int3,@thevinster,@nico,@carbon-steel,@smeenai,@TH3CHARLie

Extended Description

These are probably good tasks for someone looking to get familiarized with the LLD codebase, since the mapfile isn't critical to what we are working on at the moment. (Please ping me first before starting on these.)

  1. Dump dead-stripped symbols, as per this TODO: https://github.com/llvm/llvm-project/blob/main/lld/MachO/MapFile.cpp#L153. Essentially, symbols for which isLive() returns false.

  2. getSectionSyms() puts all the symbols into a map of section -> symbols, but this seems unnecessary. This was likely copied from the ELF port, which prints a section header before the list of symbols it contains. But the Mach-O map file doesn't print these headers.

  3. Parallelize the symbol sort. We can use LLVM's parallel_sort for this. If two symbols have the same address, we can use their symbol name to tie-break; the end result should be deterministic.

  4. Dump the cstring / fixed-width literals (from CStringInputSection and WordLiteralInputSection respectively) into the map file.
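For task 3, the tie-break could look roughly like the sketch below. `MapSymbol`, `mapSymbolLess` and `sortMapSymbols` are hypothetical names for illustration, not LLD's actual types, and `std::sort` stands in for a parallel sort so the example stays self-contained. The point is only the comparator: ordering by (address, name) gives the same result whether or not the underlying sort is stable, as long as no two symbols share both fields.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a map-file symbol entry; not LLD's real type.
struct MapSymbol {
  uint64_t address;
  std::string name;
};

// Orders by address first, then by name, so that symbols sharing an
// address always come out in the same relative order.
inline bool mapSymbolLess(const MapSymbol &a, const MapSymbol &b) {
  return std::tie(a.address, a.name) < std::tie(b.address, b.name);
}

// std::sort is used here only to keep the sketch self-contained (an
// assumption of this example); a parallel sort could take the same
// comparator unchanged.
inline void sortMapSymbols(std::vector<MapSymbol> &syms) {
  std::sort(syms.begin(), syms.end(), mapSymbolLess);
}
```

With this comparator, two symbols at 0x1000 named `_z` and `_a` are always emitted as `_a` then `_z`, regardless of their input order.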

To see the expected map file output, download the tar attachment from llvm/llvm-project#48001. Unpack and link it like so: ld -map mapfile @response.txt. This will generate a "mapfile" file in the CWD. This mapfile will contain examples of dead symbols and literals (grep for "literal string" in the file).

That said, this file is pretty large (since Chromium is pretty large), so you may want to construct smaller test programs for a better understanding.

nico commented 3 years ago

They shouldn't be in the "normal" output but in a separate section at the end:

% cat test.c
void dead() {}
int main() {}
% clang test.c -Wl,-map,test.txt -Wl,-dead_strip
% cat test.txt

# Path: a.out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] /var/folders/qt/hxckwtm545l643cnk200wzt00000gn/T/test-18287b.o
# Sections:
# Address       Size            Segment Section
0x100003FB0     0x00000008      __TEXT  __text
0x100003FB8     0x00000048      __TEXT  __unwind_info
# Symbols:
# Address       Size            File  Name
0x100003FB0     0x00000008      [  1] _main
0x100003FB8     0x00000048      [  0] compact unwind info

# Dead Stripped Symbols:
#               Size            File  Name
<<dead>>        0x00000010      [  1] _dead
<<dead>>        0x00000018      [  1] CIE
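The layout above, with live symbols first and dead-stripped ones in their own trailing section, can be sketched as follows. This is a minimal illustration under stated assumptions: `Sym` and `renderSymbolSections` are hypothetical names, not LLD's real classes, and the column formatting and the `<<dead>>` marker are this sketch's approximation of the ld64 output rather than a confirmed specification.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical symbol record for illustration; not LLD's real Symbol class.
struct Sym {
  std::string name;
  uint64_t address;
  uint64_t size;
  unsigned fileIndex;
  bool live; // false means the symbol was dead-stripped
};

// Renders a "# Symbols:" section holding only live symbols, followed by a
// separate "# Dead Stripped Symbols:" section for everything else.
inline std::string renderSymbolSections(const std::vector<Sym> &syms) {
  std::string out = "# Symbols:\n# Address\tSize    \tFile  Name\n";
  char buf[64];
  for (const Sym &s : syms) {
    if (!s.live)
      continue;
    // Format only the fixed-width columns; append the name separately so
    // long symbol names are never truncated.
    std::snprintf(buf, sizeof(buf), "0x%08llX\t0x%08llX\t[%3u] ",
                  static_cast<unsigned long long>(s.address),
                  static_cast<unsigned long long>(s.size), s.fileIndex);
    out += buf;
    out += s.name + "\n";
  }
  out += "\n# Dead Stripped Symbols:\n#        \tSize    \tFile  Name\n";
  for (const Sym &s : syms) {
    if (s.live)
      continue;
    // Dead symbols have no final address, so a marker takes its place
    // (assumed marker text).
    std::snprintf(buf, sizeof(buf), "<<dead>>\t0x%08llX\t[%3u] ",
                  static_cast<unsigned long long>(s.size), s.fileIndex);
    out += buf;
    out += s.name + "\n";
  }
  return out;
}
```

Keeping the two loops separate means the "normal" symbol list stays untouched by dead symbols, which is the behavior described in this comment.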

TH3CHARLie commented 3 years ago

I find this in the test-suite: https://github.com/llvm/llvm-project/blob/eb237ffca821839374574b2195c865765ebf5d09/lld/test/MachO/dead-strip.s#L10

Does this imply that we should not output any dead-stripped symbols when the -dead_strip option is given?

int3 commented 3 years ago

I think we can just always dump the dead-stripped symbols. I would check ld64's output to make sure we match, but I'm not aware of any CLI options that would affect this behavior

TH3CHARLie commented 3 years ago

Hi Jez! As for the first TODO item, are we going to dump dead-stripped symbols unconditionally, or is there a need to check some kind of CLI option?

TH3CHARLie commented 3 years ago

done! It now lives at https://reviews.llvm.org/D104346

int3 commented 3 years ago

Sure, put your diff up and I'll have a look at it :)

TH3CHARLie commented 3 years ago

Hi Jez! I want to work on No. 3 and I've done the implementation locally, though I am not sure if the current test case is sufficient. If you are OK with it, I can send a patch with my code changes and we can discuss what kind of test case we need.

int3 commented 3 years ago

assigned to @carbon-steel

carbon-steel commented 2 years ago

https://reviews.llvm.org/D114737 added the dead stripped symbols section like Nico suggested.

carbon-steel commented 2 years ago

https://reviews.llvm.org/D114735 finished task 2 of this bug.

carbon-steel commented 2 years ago

With respect to the word literals, both lld and ld64 appear to be doing the same thing:

test.s

  .comm _number, 1
  .globl _main
  _main:
    ret

ld64.lld

$ bin/ld64.lld -platform_version macos 10 11 -arch x86_64 /Users/rgr/local/llvm-project/build/Debug/tools/lld/test/MachO/Output/map-file.s.tmp/test.o -map /tmp/lld-map -o /dev/null

# Symbols:
# Address           File  Name
0x100000318     [  1] _main
0x100001000     [  1] _number

ld64

$ ld -map /tmp/ld64-map /tmp/test.o -o /dev/null -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem

# Symbols:
# Address       Size            File  Name
0x100003FB7     0x00000001      [  1] _main
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000001      [  1] _number

So, I won't make any changes to how lld deals with word literals.

llvmbot commented 1 year ago

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

1) Assign the issue to you.
2) Fix the issue locally.
3) Run the test suite locally.
3.1) Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
4) Create a git commit.
5) Run git clang-format HEAD~1 to format your changes.
6) Submit the patch to Phabricator.
6.1) Detailed instructions can be found here

For more instructions on how to submit a patch to LLVM, see our documentation.

If you have any further questions about this issue, don't hesitate to ask via a comment on this GitHub issue.

@llvm/issue-subscribers-good-first-issue