llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.83k stars 11.91k forks source link

[lld][coff] spurious duplicate symbols with delayload on x86 #107371

Open glandium opened 1 month ago

glandium commented 1 month ago

STR:

Expected result: success

Actual result:

lld-link: error: duplicate symbol: __declspec(dllimport) __load_timeBeginPeriod
lld-link: error: duplicate symbol: __declspec(dllimport) __load_timeEndPeriod
lld-link: error: duplicate symbol: __declspec(dllimport) __load_CoInitializeEx
lld-link: error: duplicate symbol: __declspec(dllimport) __load_GetClientRect

Removing -DELAYLOAD:ole32.dll, -DELAYLOAD:user32.dll, and -DELAYLOAD:winmm.dll from the response file makes the issue go away.

What's peculiar in this link is that there are COFF import files in gkrust.lib that repeats symbols from other .libs. Those come from rust's raw-dylib feature. It somehow makes the linker trip with the delayload, but only on x86, not x86_64 or aarch64.

Cc: @mstorsjo

llvmbot commented 1 month ago

@llvm/issue-subscribers-lld-coff

Author: Mike Hommey (glandium)

STR: - Download https://drive.google.com/file/d/1AESRw2pf8lbMwfKl4S238UlOBqf1YHCR/view?usp=sharing - Unpack the archive and enter the `repro` directory. - Run `lld-link @response.txt` Expected result: success Actual result: ``` lld-link: error: duplicate symbol: __declspec(dllimport) __load_timeBeginPeriod lld-link: error: duplicate symbol: __declspec(dllimport) __load_timeEndPeriod lld-link: error: duplicate symbol: __declspec(dllimport) __load_CoInitializeEx lld-link: error: duplicate symbol: __declspec(dllimport) __load_GetClientRect ``` Removing `-DELAYLOAD:ole32.dll`, `-DELAYLOAD:user32.dll`, and `-DELAYLOAD:winmm.dll` from the response file makes the issue go away. What's peculiar in this link is that there are COFF import files in gkrust.lib that repeats symbols from other `.lib`s. Those come from rust's raw-dylib feature. It somehow makes the linker trip with the delayload, but only on x86, not x86_64 or aarch64. Cc: @mstorsjo
mstorsjo commented 1 month ago

Thanks, reproed.

Regarding rust and embedded import libraries, there are also other known issues around that, when some build systems want to link such a library with --whole-archive, which doesn't really work for the COFF short import library files - see https://github.com/msys2/MINGW-packages/issues/21017#issuecomment-2178912819 for a discussion on that.

glandium commented 1 month ago

So it turns out there's something fishy in the windows-targets crate, which makes rust generate symbols that are different from the symbols in the real import library, and that confuses lld because both symbols have the same external name, leading the the duplicate symbol error, which is, in fact, confusing in itself considering what's going on.

So, for instance, the real ole32.lib has __imp__CoInitializeEx@8 but gkrust.lib has __imp_CoInitializeEx. The external name for both is CoInitializeEx, which is what DelayLoadContent::create uses when it creates the synthetic symbol, which is does for both __imp__CoInitializeEx@8 and __imp_CoInitializeEx, leading to a conflict in the synthetic symbol name.

This is 32-bits only because on 64-bits windows, both ole32.lib and gkrust.lib have __imp_CoInitializeEx.

If I remove the import_name_type = "undecorated" from https://github.com/microsoft/windows-rs/blob/ff6f56df6c9d727c348b3c4cd6d45db59f76fa1b/crates/libs/targets/src/lib.rs#L9, gkrust.lib gets __imp__CoInitializeEx@8 and the error goes away.

I'm not sure whether what lld does to name the synthetic symbol should be considered a bug or not, but at the very least, the error could be less confusing.

mstorsjo commented 1 month ago

I'm not sure whether what lld does to name the synthetic symbol should be considered a bug or not, but at the very least, the error could be less confusing.

If you'd manage to produce a minimal, lld testcase sized reproduction of the issue, it'd be interesting to study where the duplicate symbol name comes from. (It's been a few years since I touched delayloading last time, so I'm not entirely up to date with what symbols are created there.)

So it turns out there's something fishy in the windows-targets crate, which makes rust generate symbols that are different from the symbols in the real import library, and that confuses lld because both symbols have the same external name, leading the the duplicate symbol error, which is, in fact, confusing in itself considering what's going on.

So, for instance, the real ole32.lib has __imp__CoInitializeEx@8 but gkrust.lib has __imp_CoInitializeEx. The external name for both is CoInitializeEx, which is what DelayLoadContent::create uses when it creates the synthetic symbol, which is does for both __imp__CoInitializeEx@8 and __imp_CoInitializeEx, leading to a conflict in the synthetic symbol name.

This is 32-bits only because on 64-bits windows, both ole32.lib and gkrust.lib have __imp_CoInitializeEx.

If I remove the import_name_type = "undecorated" from https://github.com/microsoft/windows-rs/blob/ff6f56df6c9d727c348b3c4cd6d45db59f76fa1b/crates/libs/targets/src/lib.rs#L9, gkrust.lib gets __imp__CoInitializeEx@8 and the error goes away.

Hmm, I don't think removing the import name type undecorated is the right thing to do here - but i think that whatever tool rust uses to produce those import libraries is wrong.

Let's have a closer look at the import library entry in the original WinSDK ole32.lib for this function:

$ llvm-nm ole32.lib | grep -A1 -B1 CoInitializeEx
ole32.dll:
00000000 T _CoInitializeEx@8
00000000 T __imp__CoInitializeEx@8

$ llvm-readobj ole32.lib | grep -A1 -B5 CoInitializeEx

File: ole32.dll
Format: COFF-import-file-i386
Type: code
Name type: undecorate
Export name: CoInitializeEx
Symbol: __imp__CoInitializeEx@8
Symbol: _CoInitializeEx@8

So on the object file level, the import library has the symbols __imp__CoInitializeEx@8 and _CoInitializeEx@8, but the entry is marked with the name type undecorate. This means that when you link against it, it should refer to it with the decorations stripped, i.e. Export name: CoInitializeEx which is what gets imported from the other DLL in the end.

Now if we do the same on the rust provided import library, we see this:

$ llvm-nm gkrust.lib | grep -A1 -B1 CoInitializeEx
ole32.dll:
00000000 T CoInitializeEx
00000000 T __imp_CoInitializeEx
$ llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx

File: ole32.dll
Format: COFF-import-file-i386
Type: code
Name type: name
Export name: CoInitializeEx
Symbol: __imp_CoInitializeEx
Symbol: CoInitializeEx

So this doesn't really do the right thing at all. Other object file references to this symbol would be referring to _CoInitializeEx@8 and __imp__CoInitializeEx@8, while this only provides the symbols CoInitializeEx and __imp_CoInitializeEx. Removing the undecorate probably isn't right either, because them you probably retain the @8, but then you'd also end up trying to import a symbol named CoInitializeEx@8 from the DLL, which doesn't exist.

So I think the import name type undecorate here is correct, but rust's import library generation for such types is broken.

glandium commented 1 month ago

Well, removing the undecorate from windows-targets seems to make rust do the right thing.

Here's how a C++ file refers to CoInitializeEx:

         U __imp__CoInitializeEx@8

Here's how gkrust.lib refers to CoInitializeEx:

         U _CoInitializeEx@8
         U __imp__CoInitializeEx@8

Here's how gkrust.lib refers to it with undecorate:

         U _CoInitializeEx@8
         U __imp_CoInitializeEx

(I'm not sure what's up with the non-_imp_ symbols, they're from a different .o)

glandium commented 1 month ago

(I'm not sure what's up with the non-imp symbols, they're from a different .o)

Oh, they come from a crate that uses winapi, so which doesn't use raw-dylib.

mstorsjo commented 1 month ago

Well, removing the undecorate from windows-targets seems to make rust do the right thing.

For link time behaviour, possibly yes, but I doubt that your executable will start at runtime.

What does llvm-readobj say about the import library entry for __imp__CoInitializeEx@8, in particular the Export name field, after that change?

glandium commented 1 month ago

What does llvm-readobj say about the import library entry for impCoInitializeEx@8, in particular the Export name field, after that change?

DelayImport {
(snip)
  Import {
    Symbol: _CoInitializeEx@8 (0)
    Address: 0x15BF43CC
  }

There isn't an Export name field in llvm-readobj --coff-imports output.

glandium commented 1 month ago

I guess the most important part is this: it's indeed not the same as when not using raw-dylibs at all:

  Import {
    Symbol: CoInitializeEx (0)
    Address: 0x15C035C9
  }
mstorsjo commented 1 month ago

What does llvm-readobj say about the import library entry for impCoInitializeEx@8, in particular the Export name field, after that change?

DelayImport {
(snip)
  Import {
    Symbol: _CoInitializeEx@8 (0)
    Address: 0x15BF43CC
  }

I presume this is for the linked binary? Yeah this won't load at runtime, but as this is a delayed import, you won't notice until you actually try to call the function. There's no exported _CoInitializeEx@8 in ole32.dll.

There isn't an Export name field in llvm-readobj --coff-imports output.

I meant to just run llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx, which you can do directly when gkrust.lib exists, without needing to link the final executable. It should show the same in any case, that you end up with Export name: _CoInitializeEx@8 rather than Export name: CoInitializeEx (which is the right thing). So while linking with the import library may seem to work for you, it won't work at runtime. (You can try this by doing a minimal dummy executable that just tries to call CoInitializeEx(), and link that against the gkrust.lib import library, and try to execute the binary as well.)

mstorsjo commented 1 month ago

I guess the most important part is this: it's indeed not the same as when not using raw-dylibs at all:

Yeah I don't really know about the rust/raw-dylibs aspects, but if the import lib entry in gkrust.lib, as shown with llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx, differs in any way from the same command prints for WinSDK's ole32.lib, then it is wrong, which may manifest at either link time or runtime.

glandium commented 1 month ago

At this point, I think the only thing I'd consider a bug on the LLVM side is lld-link's duplicate symbol message is lacking an indication of origin of the duplicate symbol in this specific corner case.