Open glandium opened 1 month ago
@llvm/issue-subscribers-lld-coff
Author: Mike Hommey (glandium)
Thanks, reproed.
Regarding rust and embedded import libraries, there are also other known issues around that, when some build systems want to link such a library with --whole-archive
, which doesn't really work for the COFF short import library files - see https://github.com/msys2/MINGW-packages/issues/21017#issuecomment-2178912819 for a discussion on that.
So it turns out there's something fishy in the windows-targets crate, which makes rust generate symbols that are different from the symbols in the real import library, and that confuses lld because both symbols have the same external name, leading the the duplicate symbol error, which is, in fact, confusing in itself considering what's going on.
So, for instance, the real ole32.lib has __imp__CoInitializeEx@8
but gkrust.lib has __imp_CoInitializeEx
. The external name for both is CoInitializeEx
, which is what DelayLoadContent::create
uses when it creates the synthetic symbol, which is does for both __imp__CoInitializeEx@8
and __imp_CoInitializeEx
, leading to a conflict in the synthetic symbol name.
This is 32-bits only because on 64-bits windows, both ole32.lib and gkrust.lib have __imp_CoInitializeEx
.
If I remove the import_name_type = "undecorated"
from https://github.com/microsoft/windows-rs/blob/ff6f56df6c9d727c348b3c4cd6d45db59f76fa1b/crates/libs/targets/src/lib.rs#L9, gkrust.lib gets __imp__CoInitializeEx@8
and the error goes away.
I'm not sure whether what lld does to name the synthetic symbol should be considered a bug or not, but at the very least, the error could be less confusing.
I'm not sure whether what lld does to name the synthetic symbol should be considered a bug or not, but at the very least, the error could be less confusing.
If you'd manage to produce a minimal, lld testcase sized reproduction of the issue, it'd be interesting to study where the duplicate symbol name comes from. (It's been a few years since I touched delayloading last time, so I'm not entirely up to date with what symbols are created there.)
So it turns out there's something fishy in the windows-targets crate, which makes rust generate symbols that are different from the symbols in the real import library, and that confuses lld because both symbols have the same external name, leading the the duplicate symbol error, which is, in fact, confusing in itself considering what's going on.
So, for instance, the real ole32.lib has
__imp__CoInitializeEx@8
but gkrust.lib has__imp_CoInitializeEx
. The external name for both isCoInitializeEx
, which is whatDelayLoadContent::create
uses when it creates the synthetic symbol, which is does for both__imp__CoInitializeEx@8
and__imp_CoInitializeEx
, leading to a conflict in the synthetic symbol name.This is 32-bits only because on 64-bits windows, both ole32.lib and gkrust.lib have
__imp_CoInitializeEx
.If I remove the
import_name_type = "undecorated"
from https://github.com/microsoft/windows-rs/blob/ff6f56df6c9d727c348b3c4cd6d45db59f76fa1b/crates/libs/targets/src/lib.rs#L9, gkrust.lib gets__imp__CoInitializeEx@8
and the error goes away.
Hmm, I don't think removing the import name type undecorated
is the right thing to do here - but i think that whatever tool rust uses to produce those import libraries is wrong.
Let's have a closer look at the import library entry in the original WinSDK ole32.lib for this function:
$ llvm-nm ole32.lib | grep -A1 -B1 CoInitializeEx
ole32.dll:
00000000 T _CoInitializeEx@8
00000000 T __imp__CoInitializeEx@8
$ llvm-readobj ole32.lib | grep -A1 -B5 CoInitializeEx
File: ole32.dll
Format: COFF-import-file-i386
Type: code
Name type: undecorate
Export name: CoInitializeEx
Symbol: __imp__CoInitializeEx@8
Symbol: _CoInitializeEx@8
So on the object file level, the import library has the symbols __imp__CoInitializeEx@8
and _CoInitializeEx@8
, but the entry is marked with the name type undecorate
. This means that when you link against it, it should refer to it with the decorations stripped, i.e. Export name: CoInitializeEx
which is what gets imported from the other DLL in the end.
Now if we do the same on the rust provided import library, we see this:
$ llvm-nm gkrust.lib | grep -A1 -B1 CoInitializeEx
ole32.dll:
00000000 T CoInitializeEx
00000000 T __imp_CoInitializeEx
$ llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx
File: ole32.dll
Format: COFF-import-file-i386
Type: code
Name type: name
Export name: CoInitializeEx
Symbol: __imp_CoInitializeEx
Symbol: CoInitializeEx
So this doesn't really do the right thing at all. Other object file references to this symbol would be referring to _CoInitializeEx@8
and __imp__CoInitializeEx@8
, while this only provides the symbols CoInitializeEx
and __imp_CoInitializeEx
. Removing the undecorate
probably isn't right either, because them you probably retain the @8
, but then you'd also end up trying to import a symbol named CoInitializeEx@8
from the DLL, which doesn't exist.
So I think the import name type undecorate
here is correct, but rust's import library generation for such types is broken.
Well, removing the undecorate
from windows-targets seems to make rust do the right thing.
Here's how a C++ file refers to CoInitializeEx:
U __imp__CoInitializeEx@8
Here's how gkrust.lib refers to CoInitializeEx:
U _CoInitializeEx@8
U __imp__CoInitializeEx@8
Here's how gkrust.lib refers to it with undecorate
:
U _CoInitializeEx@8
U __imp_CoInitializeEx
(I'm not sure what's up with the non-_imp_
symbols, they're from a different .o)
(I'm not sure what's up with the non-imp symbols, they're from a different .o)
Oh, they come from a crate that uses winapi, so which doesn't use raw-dylib.
Well, removing the
undecorate
from windows-targets seems to make rust do the right thing.
For link time behaviour, possibly yes, but I doubt that your executable will start at runtime.
What does llvm-readobj
say about the import library entry for __imp__CoInitializeEx@8
, in particular the Export name
field, after that change?
What does llvm-readobj say about the import library entry for impCoInitializeEx@8, in particular the Export name field, after that change?
DelayImport {
(snip)
Import {
Symbol: _CoInitializeEx@8 (0)
Address: 0x15BF43CC
}
There isn't an Export name
field in llvm-readobj --coff-imports
output.
I guess the most important part is this: it's indeed not the same as when not using raw-dylibs at all:
Import {
Symbol: CoInitializeEx (0)
Address: 0x15C035C9
}
What does llvm-readobj say about the import library entry for impCoInitializeEx@8, in particular the Export name field, after that change?
DelayImport { (snip) Import { Symbol: _CoInitializeEx@8 (0) Address: 0x15BF43CC }
I presume this is for the linked binary? Yeah this won't load at runtime, but as this is a delayed import, you won't notice until you actually try to call the function. There's no exported _CoInitializeEx@8
in ole32.dll.
There isn't an
Export name
field inllvm-readobj --coff-imports
output.
I meant to just run llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx
, which you can do directly when gkrust.lib
exists, without needing to link the final executable. It should show the same in any case, that you end up with Export name: _CoInitializeEx@8
rather than Export name: CoInitializeEx
(which is the right thing). So while linking with the import library may seem to work for you, it won't work at runtime. (You can try this by doing a minimal dummy executable that just tries to call CoInitializeEx()
, and link that against the gkrust.lib
import library, and try to execute the binary as well.)
I guess the most important part is this: it's indeed not the same as when not using raw-dylibs at all:
Yeah I don't really know about the rust/raw-dylibs aspects, but if the import lib entry in gkrust.lib
, as shown with llvm-readobj gkrust.lib | grep -A1 -B5 CoInitializeEx
, differs in any way from the same command prints for WinSDK's ole32.lib, then it is wrong, which may manifest at either link time or runtime.
At this point, I think the only thing I'd consider a bug on the LLVM side is lld-link's duplicate symbol message is lacking an indication of origin of the duplicate symbol in this specific corner case.
STR:
repro
directory.lld-link @response.txt
Expected result: success
Actual result:
Removing
-DELAYLOAD:ole32.dll
,-DELAYLOAD:user32.dll
, and-DELAYLOAD:winmm.dll
from the response file makes the issue go away.What's peculiar in this link is that there are COFF import files in gkrust.lib that repeats symbols from other
.lib
s. Those come from rust's raw-dylib feature. It somehow makes the linker trip with the delayload, but only on x86, not x86_64 or aarch64.Cc: @mstorsjo