llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.53k stars 11.79k forks source link

lld-link LTO fails on symbol references to symbols within inline assembly within functions #76046

Open mstorsjo opened 10 months ago

mstorsjo commented 10 months ago

LTO builds of FFmpeg on Windows on x86 fail, because LLD believes that the link has unresolved symbols, so it errors out before proceeding to the actual LTO compile (where those symbols would appear). This has been reported to FFmpeg at https://trac.ffmpeg.org/ticket/10548. The original, problematic piece of code can be viewed at https://github.com/FFmpeg/FFmpeg/blob/n6.1/libavcodec/x86/mlpdsp_init.c.

The issue can be reproduced with a very minimal reduced case that looks like this:

extern char func_pos;

static const void *const jumptable[] = { &func_pos };

void func(void) {
  __asm__ volatile(
    "func_pos:\n"
  :: "r"(jumptable)
  );
}

Attempting to build and link it with LTO:

$ clang -target x86_64-windows-msvc -c lto-repro.c -flto
$ lld-link lto-repro.o -out:lto-repro.exe -entry:func -subsystem:console
lld-link: error: undefined symbol: func_pos
>>> referenced by lto-repro.c
>>>               lto-repro.o

Without the option -flto, linking succeeds.

The same experiment also succeeds with Clang when targeting Linux:

$ clang -target x86_64-linux-gnu -c lto-repro.c  -flto
$ ld.lld lto-repro.o -o lto-repro.exe -e func

The same experiment also succeeds with GCC, for both Linux and Windows targets.

The root issue is that the inline assembly block, within the function func, defines a symbol func_pos, which the C code outside of it refers to, but the LTO object file does not indicate that the symbol func_pos will be defined by the inline assembly:

$ clang -target x86_64-windows-msvc -c lto-repro.c -flto
$ llvm-nm lto-repro.o
---------------- T func
                 U func_pos
---------------- d jumptable

This looks exactly the same when Clang targets Linux as well.

The reason why this fails on Windows, is that before starting the actual LTO compilation, lld-link checks whether there are unresolved symbols, to be able to print errors to the developer immediately if it looks like the link is going to fail, instead of having to wait for the full LTO compilation to complete, just to print errors: https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/lld/COFF/Driver.cpp#L2374-L2379 In the ELF version of LLD, there's nothing corresponding (probably because most linked ELF objects are allowed to have undefined symbols?).

These links do succeed if just the pre-LTO symbol availability check is skipped.

In order to facilitate symbol discovery from LTO objects, the LLVM LTO object file implementation does actually peek into the inline assembly statements, but only for module level inline assembly, not for function internal inline assembly:

__asm__(
  "static_func:\n"
  ".globl extern_func\n"
  "extern_func:\n"
  "  call undefined_sym\n"
);

void func(void) {
  __asm__(
    "inline_static_func:\n"
    ".globl inline_extern_func\n"
    "inline_extern_func:\n"
    "  call inline_undefined_sym\n"
  );
}
$ clang -target x86_64-linux-gnu -c lto-symbols.c -flto
$ llvm-nm lto-symbols.o
---------------- T extern_func
---------------- T func
---------------- t static_func
                 U undefined_sym

Here, both the defined and undefined symbols referenced in the module level inline assembly are picked up, but not the ones from the function level inline assembly.

CC @MaskRay

llvmbot commented 10 months ago

@llvm/issue-subscribers-lld-coff

Author: Martin Storsjö (mstorsjo)

LTO builds of FFmpeg on Windows on x86 fail, because LLD believes that the link has unresolved symbols, so it errors out before proceeding to the actual LTO compile (where those symbols would appear). This has been reported to FFmpeg at https://trac.ffmpeg.org/ticket/10548. The original, problematic piece of code can be viewed at https://github.com/FFmpeg/FFmpeg/blob/n6.1/libavcodec/x86/mlpdsp_init.c. The issue can be reproduced with a very minimal reduced case that looks like this: ```c extern char func_pos; static const void *const jumptable[] = { &func_pos }; void func(void) { __asm__ volatile( "func_pos:\n" :: "r"(jumptable) ); } ``` Attempting to build and link it with LTO: ``` $ clang -target x86_64-windows-msvc -c lto-repro.c -flto $ lld-link lto-repro.o -out:lto-repro.exe -entry:func -subsystem:console lld-link: error: undefined symbol: func_pos >>> referenced by lto-repro.c >>> lto-repro.o ``` Without the option `-flto`, linking succeeds. The same experiment also succeeds with Clang when targeting Linux: ``` $ clang -target x86_64-linux-gnu -c lto-repro.c -flto $ ld.lld lto-repro.o -o lto-repro.exe -e func ``` The same experiment also succeeds with GCC, for both Linux and Windows targets. The root issue is that the inline assembly block, within the function `func`, defines a symbol `func_pos`, which the C code outside of it refers to, but the LTO object file does not indicate that the symbol `func_pos` will be defined by the inline assembly: ``` $ clang -target x86_64-windows-msvc -c lto-repro.c -flto $ llvm-nm lto-repro.o ---------------- T func U func_pos ---------------- d jumptable ``` This looks exactly the same when Clang targets Linux as well. The reason why this fails on Windows, is that before starting the actual LTO compilation, lld-link checks whether there are unresolved symbols, to be able to print errors to the developer immediately if it looks like the link is going to fail, instead of having to wait for the full LTO compilation to complete, just to print errors: https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/lld/COFF/Driver.cpp#L2374-L2379 In the ELF version of LLD, there's nothing corresponding (probably because most linked ELF objects are allowed to have undefined symbols?). These links do succeed if just the pre-LTO symbol availability check is skipped. In order to facilitate symbol discovery from LTO objects, the LLVM LTO object file implementation does actually peek into the inline assembly statements, but only for module level inline assembly, not for function internal inline assembly: ```c __asm__( "static_func:\n" ".globl extern_func\n" "extern_func:\n" " call undefined_sym\n" ); void func(void) { __asm__( "inline_static_func:\n" ".globl inline_extern_func\n" "inline_extern_func:\n" " call inline_undefined_sym\n" ); } ``` ``` $ clang -target x86_64-linux-gnu -c lto-symbols.c -flto $ llvm-nm lto-symbols.o ---------------- T extern_func ---------------- T func ---------------- t static_func U undefined_sym ``` Here, both the defined and undefined symbols referenced in the module level inline assembly are picked up, but not the ones from the function level inline assembly. CC @MaskRay
kasper93 commented 10 months ago

I have also another concern about this address of label from asm exercises. In mlpdsp_init.c case the function is called by pointer, initialized in init function. But in general case we have to be careful about inlining and cloning of function in question. Sorry if it is too far away from the topic of this issue, but I wanted to highlight that.

For example if we don't use asm and use && https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html compiler knows what is happening and can add additional constraints. Otherwise it is not so clear.

The &&foo expressions for the same label might have different values if the containing function is inlined or cloned. If a program relies on them being always the same, attribute((noinline,noclone)) should be used to prevent inlining and cloning. If &&foo is used in a static variable initializer, inlining and cloning is forbidden.

rnk commented 9 months ago

I don't know what ffmpeg thinks they are doing, but jumping into labels mid inline asm block is definitely not something that LLVM can support in the general case. I'm sure with enough attributes and glue ffmpeg can make things work for now, but it's super fragile, and I don't think we should expend any effort trying to support this use case.

mstorsjo commented 9 months ago

I don't know what ffmpeg thinks they are doing, but jumping into labels mid inline asm block is definitely not something that LLVM can support in the general case. I'm sure with enough attributes and glue ffmpeg can make things work for now, but it's super fragile, and I don't think we should expend any effort trying to support this use case.

Yeah, overall I also agree that this construct is extremely fragile and nothing I would condone as such.

In a more general sense though - for LTO, we do scan module level assembly snippets for potential symbols, to expose them as symbols on the IR object file, but we don't, for assembly snippets within a function. I wonder if it would make sense to do that still.

For symbols defined within the inline assembly, like this case, I don't think there's any well defined case where that would be valuable - but in the example above, exposing inline_undefined_sym as undefined on the IR object could make sense. Perhaps not for function calls within the inline assembly, but for accessing data it could make sense. (Then again, one could access the symbol on the C level and expose it to the inline assembly - and I don't have a concrete case where this is needed.)

amyspark commented 4 months ago

Hi @mstorsjo -- isn't this the exact issue we at GStreamer had found in https://github.com/llvm/llvm-project/issues/64127#issuecomment-1694765228 ?

mstorsjo commented 4 months ago

Hi @mstorsjo -- isn't this the exact issue we at GStreamer had found in #64127 (comment) ?

Yes, that's the exact same issue - this issue is a reduced version of that one.

See the upstream fix in https://github.com/ffmpeg/FFmpeg/commit/52bfee90187183be4a3c5fe17433699b45370a5e which avoids using the code with inline asm labels with LTO on windows targets.

lhmouse commented 2 days ago

This bites me today: https://github.com/lhmouse/mcfgthread/blob/ba324f7f1307a0576597c6e9e64de1c7fd663671/mcfgthread/gthr_aux.c#L14-L137

However this is not a label defined with in a function; the label is defined in a module-level assembly block. In other words it's a function defined in assembly and is meant to be called from C.

mstorsjo commented 2 days ago

This bites me today: https://github.com/lhmouse/mcfgthread/blob/ba324f7f1307a0576597c6e9e64de1c7fd663671/mcfgthread/gthr_aux.c#L14-L137

However this is not a label defined with in a function; the label is defined in a module-level assembly block. In other words it's a function defined in assembly and is meant to be called from C.

As mentioned in the original post here, LLVM does seem to peek into module level assembly and detect symbols from there. Can you have a look with llvm-nm on the LTO object file, to see which labels are visible and which aren't?

lhmouse commented 2 days ago

@mstorsjo The function that is defined in assembly is __MCF_gthr_call_once_seh_take_over. It appears twice, being both defined and undefined 😂 :

$ llvm-nm ./libmcfgthread-minimal-1.dll.p/mcfgthread_gthr_aux.c.obj
                 U _MCF_mutex_lock_slow
                 U _MCF_mutex_unlock_slow
                 U _MCF_once_abort
                 U _MCF_once_release
                 U _MCF_once_wait_slow
                 U _MCF_shared_mutex_lock_exclusive_slow
                 U _MCF_shared_mutex_lock_shared_slow
                 U _MCF_shared_mutex_unlock_slow
---------------- T __MCF_gthr_call_once_seh
                 U __MCF_gthr_call_once_seh_take_over
---------------- T __MCF_gthr_call_once_seh_take_over
---------------- T __MCF_gthr_mutex_relock_callback
---------------- T __MCF_gthr_mutex_unlock_callback
---------------- T __MCF_gthr_rc_mutex_init
---------------- T __MCF_gthr_rc_mutex_recurse
---------------- T __MCF_gthr_rc_mutex_release
---------------- T __MCF_gthr_rc_mutex_wait
---------------- T __MCF_gthr_recursive_mutex_relock_callback
---------------- T __MCF_gthr_recursive_mutex_unlock_callback
---------------- T __MCF_gthr_shared_mutex_relock_exclusive_callback
---------------- T __MCF_gthr_shared_mutex_relock_shared_callback
---------------- T __MCF_gthr_shared_mutex_unlock_callback
---------------- T __MCF_gthr_thread_thunk_v2
---------------- T __MCF_gthr_timeout_from_timespec
---------------- T __MCF_gthr_unonce
---------------- t do_call_once_seh_uhandler

And more importantly, ".section \".drectve\" \n\t.ascii \" -export:\\\"__MCF_gthr_call_once_seh_take_over\\\"\" \n\t" seems lost. The symbol is not exported from the DLL, causing linker errors for test programs.