skeeto / w64devkit

Portable C and C++ Development Kit for x64 (and x86) Windows
The Unlicense

Import library incompatibility between MSVC and Binutils (link.exe bug) #135

Open skeeto opened 3 weeks ago

skeeto commented 3 weeks ago

I've discovered a subtle import library incompatibility between MSVC and Binutils which goes back at least a quarter century. It's definitely a bug in MSVC link.exe (at least x86 and x64), but possibly also a bug in Binutils. I haven't gotten to the bottom of it, and probably will not any time soon, but I want to document it publicly for the future, in case anyone else wants to keep digging.


Summary: Linking exclusively against Binutils-generated import libraries with link.exe produces a PE image whose import directory has no terminator. The image may crash on load, on both Windows and Wine, depending on how the loader was written and on what happens to follow the directory in the image. I can reproduce the issue at least as far back as VS6 (1998). It has likely gone unnoticed for so long because virtually nobody does this except for the occasional weirdo like me.

The setup is trivial. First a library definition, lower.def:

LIBRARY lower
EXPORTS
func

Then a program that uses it, upper.c:

__declspec(dllimport) void func(void);
void mainCRTStartup(void) { func(); }

If I use MSVC lib.exe to generate the import library, everything is fine:

$ lib /nologo /def:lower.def
$ cl /nologo upper.c /link /subsystem:console lower.lib

However, if I use Binutils dlltool instead:

$ dlltool -d lower.def -l lower.lib
$ cl /nologo upper.c /link /subsystem:console lower.lib

Then upper.exe is an invalid PE image. dumpbin /imports notices:

Section contains the following imports: ...

<Invalid RVA>
            402130 Import Address Table
            40214C Import Name Table
                 0 time date stamp
          75660001 Index of first forwarder reference

Note the "Invalid RVA". It's observable with objdump -p as well, though it doesn't notice enough to produce an error message:

The Import Tables (interpreted .rdata section contents)
 vma:            Hint    Time      Forward  DLL       First
                 Table   Stamp     Chain    Name      Thunk
 00002104   00002118 00000000 00000000 0000212c 00002000

    DLL Name: lower.dll
    vma:  Hint/Ord Member-Name Bound-To
    2120        1  func

 00002118   00002120 00000000 75660001 0000636e 00002104

The five fields in the last line should have been all zero. Per the PE specification:

The last directory entry is empty (filled with null values), which indicates the end of the directory table.

The values shown are instead pieces of the first import lookup table, and then the idata string table, which immediately follow the import directory table. objdump notices garbage and stops reading the directory table.
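To make the terminator rule concrete, here is a minimal sketch of the check a careful loader or dumper could apply to an import directory table. The struct and function names are hypothetical, chosen for illustration; the five 32-bit fields follow the PE spec's import directory table layout, and the caller is assumed to have already located the table and bounded its size.

```c
#include <stddef.h>
#include <stdint.h>

/* One import directory table entry: five 32-bit fields, 20 bytes.
 * Names are illustrative, not taken from any particular header. */
struct import_descriptor {
    uint32_t lookup_table_rva;   /* import lookup (name) table RVA */
    uint32_t time_date_stamp;
    uint32_t forwarder_chain;
    uint32_t name_rva;           /* RVA of the DLL name string */
    uint32_t address_table_rva;  /* import address table RVA */
};

/* An entry terminates the table only if every field is zero. */
int is_null_entry(const struct import_descriptor *d)
{
    return !(d->lookup_table_rva | d->time_date_stamp | d->forwarder_chain |
             d->name_rva | d->address_table_rva);
}

/* Returns the index of the terminating null entry, or -1 if none of the
 * first `count` entries is null (the table runs into whatever follows). */
ptrdiff_t find_terminator(const struct import_descriptor *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (is_null_entry(&table[i])) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

Fed the two rows from the objdump output above, find_terminator finds no null entry, because the second row has non-zero fields. A well-formed table for the same single import would have an all-zero second entry.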

It's not just dlltool: ld --out-implib=... has the same result.

Linking at least one non-Binutils import library, in any position, fixes the problem. With all the implicitly-linked libraries present, this is nearly always the case — except for toolchain-hacking weirdos — so the bug almost never manifests in practice.

I suspect Binutils' import libraries are not quite formatted properly, which confuses link.exe. It's still a linker bug, because it shouldn't quietly link a broken PE image. Since it's unlikely I could get Microsoft to fix link.exe, perhaps it could be addressed in Binutils. This is where things get more complicated. The import library format is undocumented (so who knows what's "correct" or not), and Binutils' import libraries are especially messy, making them difficult to pick through by hand. Figuring this out would be the next step.

Peter0x44 commented 2 weeks ago

It gets even more complicated - binutils and msvc are apparently creating different import library formats. I had this discussion in the msys2 discord server.

There kind of is, Binutils generates legacy (sometimes named long) import library while LLVM and MSVC generate modern short import libraries.

I also noticed a related discussion on the mingw-w64 irc channel:

<ovf> is it expected that mingw doesn't like msvc-produced import libs? surely ones that dlltool spits out are different, but i can't so far figure out in what way
<wbs> ovf: now there are many aspects to this
first off, if you mean C++ interfaces, they're fundamentally incompatible
for the actual import libs, yes, MSVC (and llvm based mingw tools as well!) use a different format of import libraries than what GNU tools produce
however, LLVM (and LLVM's linker lld) can work with both.  GNU ld also can link against the MSVC/LLVM import libraries. there have been a couple of bugs in that support, but I've tried to get them fixed

Probably not entirely related, but I add it for some context. This was news to me when I learned about it.

skeeto commented 2 weeks ago

Thanks for the followup! I hadn't realized there were "short" and "long" formats, which sheds light on the situation. As noted in your findings, these toolchains can typically consume each other's import libraries. This is just an odd edge case I noticed in link.exe.

Taking another look, I noticed the PE spec incompletely documents both formats, but the output of neither MSVC lib.exe nor Binutils quite matches the PE spec for either. When I tried to craft a "short" import library (writing a pseudo-COFF by hand, then creating an archive using a standard archive tool) I found two more toolchain bugs: an MSVC lib.exe "internal error" crash that produces a corrupt archive, and Binutils ar memory corruption that also produces a corrupt archive. It seems neither archiver likes consuming pseudo-COFFs. To keep trying, I'll also need to craft the archive itself.


For the record, to reproduce the two archiver bugs, first write a pseudo-COFF for a symbol foo in module foo.dll:

$ printf '\x00\x00\xff\xff\x00\x00\x64\x86\x00\x00\x00\x00\x0c\x00\x00\x00\x0a\x00\x00\x00foo\x00foo.dll\x00' >foo.dll
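For anyone picking through those 32 bytes, here is a rough sketch of a decoder for the "short" import header, assuming the layout in the PE spec's Import Header table: two signature words (0x0000, 0xFFFF), version, machine, timestamp, size of data, ordinal/hint, a 16-bit type field, then two NUL-terminated strings. The struct and function names are made up for this example and are not part of any toolchain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded view of a short import member (illustrative names). */
struct short_import {
    uint16_t version;
    uint16_t machine;
    uint32_t time_date_stamp;
    uint32_t size_of_data;     /* bytes of string data after the header */
    uint16_t ordinal_or_hint;
    unsigned type;             /* low 2 bits of the final 16-bit field */
    unsigned name_type;        /* next 3 bits */
    const char *symbol;        /* points into the caller's buffer */
    const char *dll;           /* points into the caller's buffer */
};

static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 and fills *out if buf holds a well-formed short import
 * member, 0 otherwise. */
int parse_short_import(const unsigned char *buf, size_t len,
                       struct short_import *out)
{
    if (len < 20) return 0;                       /* fixed header size */
    if (le16(buf) != 0x0000 || le16(buf + 2) != 0xffff) return 0;

    out->version         = le16(buf + 4);
    out->machine         = le16(buf + 6);
    out->time_date_stamp = le32(buf + 8);
    out->size_of_data    = le32(buf + 12);
    out->ordinal_or_hint = le16(buf + 16);
    unsigned flags       = le16(buf + 18);
    out->type            = flags & 3;
    out->name_type       = (flags >> 2) & 7;

    if (out->size_of_data > len - 20) return 0;   /* data must fit */

    /* The data must contain two NUL-terminated strings. */
    const unsigned char *data = buf + 20;
    const unsigned char *nul = memchr(data, 0, out->size_of_data);
    if (!nul) return 0;
    size_t rest = out->size_of_data - (size_t)(nul + 1 - data);
    if (!memchr(nul + 1, 0, rest)) return 0;

    out->symbol = (const char *)data;
    out->dll    = (const char *)(nul + 1);
    return 1;
}
```

Under that reading of the layout, the bytes in the printf command decode to machine 0x8664, a data size of 12 ("foo" and "foo.dll" with their terminators), and the two strings foo and foo.dll.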

Try to archive it with lib.exe:

$ lib /nologo /out:foo.lib foo.dll

LINK : fatal error LNK1000: Internal error during LibrarianMain

The resulting foo.lib is a 2MiB file of all zeros. Try again with ar:

$ rm foo.lib
$ ar Dr foo.lib foo.dll

So far everything seems fine, but all three linkers reject it as invalid. If I manually examine the contents I see the foo.dll member has been corrupted. Pulling it back out I just get garbage:

$ ar x foo.lib
$ hd foo.dll
00000000  d0 8f a1 fe 20 01 00 00  d4 36 90 fe 20 01 00 00  |.... ....6.. ...|
00000010  00 00 00 00 00 00 00 00  03 00 00 00 00 00 00 00  |................|
00000020

If I archive with SDr to disable the ranlib index, the corruption disappears, but the archive doesn't quite match the (goofy) format required by the PE spec, so that's still no good.

(It's pretty frustrating that the more I dig, the more bugs I turn up.)