llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.34k stars 11.7k forks source link

lld is using a buggy way to insert lto.tmp object into objectfile list in addCombinedLTOObject #38738

Open llvmbot opened 5 years ago

llvmbot commented 5 years ago
Bugzilla Link 39390
Version unspecified
OS Windows NT
Reporter LLVM Bugzilla Contributor
CC @isanbard,@rui314,@smithp35,@stephenhines

Extended Description

The current way is to push_back all generated lto.tmp objects into object file list. This way will create wrong order in init_array. The common way is to insert all lto.tmp objects right after the position where the first bitcode object appears in object file list.

smithp35 commented 5 years ago

Yes, current LTO does indeed prevent object name based selectors based on the original file name. There isn't an easy solution to using just the linker as the original mapping of file to section (or even symbol definition) is lost when going through LTO.

The RFC (presentation) proposed that the linker encode the output section name from the linker script in the IR passed to the LTO code-generator.

isanbard commented 5 years ago

This also seems to break linker scripts that use file names as part of the output. For instance, if I have two LTO files with these symbols:

ip6_tables.o: __initcall_3_ip6_tables_init6

x_tables.o __initcall_3_xt_init6

All in the ".initcall6.init" section, and if I use this linker script:

SECTIONS { .initcall6.init : { x_tables.o (.initcall6.init) ip6_tables.o (.initcall6.init) } }

This command line doesn't work:

$ ld.lld -O2 -plugin-opt=thinlto -plugin-opt=-code-model=kernel -plugin-opt=jobs=6 -plugin-opt=-stack-alignment=8 -m elf_x86_64 -T foo.lds -r -o bork.o ip6_tables.o x_tables.o $ objdump -t -j .initcall6.init bork.o

bork.o: file format elf64-x86-64

SYMBOL TABLE: 0000000000000000 l O .initcall6.init 0000000000000008 initcall_3_ip6_tables_init6 0000000000000000 l d .initcall6.init 0000000000000000 .initcall6.init 0000000000000008 l O .initcall6.init 0000000000000008 initcall_3_xt_init6

isanbard commented 5 years ago

I'm assuming the expectation is that this must be equivalent to:

  • Input order of all ELF files as if LTO was not used?

Yes, this is what I would like to see, at least with ThinLTO (I'm not sure if it can be done with full LTO unless all files are LTO'ed). I'll look at the RFC more closely, but the first goal is basically what I need. :-)

I can continue with the workaround I created to ensure the ordering of initialization calls. However, I need to make it faster.

smithp35 commented 5 years ago

I've taken a look at the reproducer.

I think that this is related but separate from the original report. To summarise I think that it supports the suggestion to insert the LTO object at a certain position as this is unrelated to crtbegin and crtend, however I think it will only be of limited help.

Just to make sure I understand: the reproducer is a relocatable link with a linker script, but without a SECTIONS command. Hence the .init.data output section will contain all the .init.data input sections in the order that LLD sees them. This order will be disturbed with LTO, in effect we'll see:

I'm assuming the expectation is that this must be equivalent to:

I don't think that this will be easily fixable in just LLD. Inserting the combined LTO ELF file into the place that would be occupied by the first bitcode could work but only if the following conditions held:

The first condition is possible, although would be difficult to enforce if there were both ELF and Bitcode static libraries. I've no idea about what LTO guarantees.

As in your presentation: https://fosdem.org/2019/schedule/event/llvm_kernel/ I think your linker script solution is probably the best that can be done.

There was a presentation about better support for linker scripts in LTO https://llvm.org/devmtg/2017-10/slides/LTOLinkerScriptsEdlerVonKoch.pdf there was an RFC http://lists.llvm.org/pipermail/llvm-dev/2018-May/123252.html about upstreaming, but I've not seen much traffic on that. It may be that the linux kernel LTO needs some of the features mentioned there.

Note: for anyone else that wants to try the reproducer, the bitcode is incompatible with current trunk. I get: Callsite was not defined with variable arguments! i64 (i8, i1, i1) @​llvm.objectsize.i64.p0i8 LLVM ERROR: Broken module found, compilation aborted!

There is a file called version.txt that gives a svn revision number that you can build a LLD/thin-lto that will succeed.

smithp35 commented 5 years ago

Thanks for the reproducer. I hope to take a look at this after I've finished with pr40277.

isanbard commented 5 years ago

In particular, it seems to get the order of these init functions wrong:

initcall_3_ip6_tables_init6 initcall_3_xt_init6

initcall_3_ip6_tables_init6 uses a symbol that's defined in initcall_3_xt_init6, but the order is switched. (Personally, I think that xt_init should be at a different level than ip6_tables_init.)

isanbard commented 5 years ago

Peter, Linux relies upon the symbol order in .init.data in order to ensure initialization calls are called correctly. LTO appears to break this if ELF files are thrown into the mix.

Here's the command line I'm using:

$ ld.lld -plugin-opt=thinlto -plugin-opt=-code-model=kernel -plugin-opt=jobs=6 -plugin-opt=-stack-alignment=8 -m elf_x86_64 -T .tmp_symversions --reproduce llvm/llvm-project#38738 .tar -r -o vmlinux.o --whole-archive built-in.o

Reproduction file:

https://drive.google.com/file/d/12snDWGGluK9WgfAw2dssYOn4_PwI-EsI/view?usp=sharing

llvmbot commented 5 years ago

Testing crtend file name is one method, however I think it is not generic. It only good for android and linux. which depends on crtend. It is why I was proposing to insert to the first seen bitcode. This strategy works so far the best in our local LTO solution.

There is no perfect solution in my mind because even the bitcode a.o b.o c.o can be in any order or any position.

smithp35 commented 5 years ago

I think that this is more a case that LLD is missing out some custom ordering of .init_array that is either provided via the default linker script or hard-coded. I think that depending on the order of input objects is likely to be too fragile.

Strictly speaking there is no specification that I'm aware of that guarantees that the order of the .init_array matches the order of input objects. Nor is there any specification of where LLD should place the outputs of LTO in the

Having said that it seems like ld.bfd via the default linker script and gold via special case code (when linker script not used) have special case code for crt*.o files in layout.cc.

The default linker script for ld.bfd is: .init_array : { PROVIDE_HIDDEN (init_array_start = .); KEEP ((SORT_BY_INIT_PRIORITY(.init_array.) SORT_BY_INIT_PRIORITY(.ctors.))) KEEP ((.init_array EXCLUDE_FILE (crtbegin.o crtbegin?.o crtend.o crtend?.o ) .ctors)) PROVIDE_HIDDEN (init_array_end = .); }

When LLD isn't given a linker script it does have some code to order by init priority but it does not have any custom code for crtbegin and crtend.

I think that the best solution would be to add these conventions so that the .init_array for crtend is always last regardless of where it is in the object list.

llvmbot commented 5 years ago

for example, ndk crtend_android.o will push 0x0 entry as the last entry in .init_array section. LLD LTO puts lto.tmp as the last object file so the initialization function from lto.tmp in its .init_array will be put after 0x0 entry from crtend_android.o in the final .exe. It will create crash in .exe because the initialization function from lto.tmp will not be executed at all.