Open joshtriplett opened 5 years ago
I don't think I have the expertise to implement this feature, but I could write some (currently failing) tests for use by the implementor?
(Background: I ran into this exact problem linking mscorlib.dll into executables, and had to write a C++-level wrapper to get it to work.)
A very first PR has landed; the `link_ordinal` attribute is now added. Want to say thank you to @Centril for helping me on this~
There's now a bounty for this issue to incentivize someone to work on implementing this feature.
https://www.bountysource.com/issues/70533351-tracking-issue-for-rfc-2627-link-kind-raw-dylib
Some information I've gathered:

- The import information goes into the `.idata` section. Despite the syntax suggesting the opposite, the information is purely additive and doesn't change ANYTHING about the import itself (additional attribute in LLVM IR, etc). Roughly, the .lib files that this feature wants to make unnecessary are condensed idata sections, and the linker extracts the relevant ones for us. This is good for us because we don't need LLVM to support any custom attributes. I have not tried it, but emitting additional sections should be enough, and there is LLVM API support for adding sections.
- There is not just one `.idata` section but rather multiple ones: `.idata$2`, `.idata$3`, `.idata$4`, `.idata$6`, `.idata$7`. See this part of the spec that explains the rationale.
- The relevant code is in the `librustc_codegen_llvm` crate. One probably has to invoke the `llvm::LLVMSetSection` function in some fashion (just giving you something to grep for).

To my best knowledge, these are the meanings of the `.idata$*` sections. See the spec/example for further understanding:
| section | name in the MS spec | description |
|---|---|---|
| `.idata$2` | Import directory table | Main list containing pointers into the other tables. Each entry in the list concerns one DLL and contains pointers to the import address/lookup table and the DLL name table. |
| `.idata$3` | Import Address Table | At compile time the same as the import lookup table. Just duplicate the output! Not sure whether it's actually `.idata$5`. |
| `.idata$4` | Import Lookup Table | Has pointers to the function names. |
| `.idata$6` | Hint/Name Table | Contains the names of the functions we want to import. Also contains export name table hints, which are NOT the same as ordinals and which we'll probably have to set to 0. |
| `.idata$7` | No name in the MS spec, let's call it DLL name table | Contains the names of the DLL files. |
The main table is the import directory table which contains pointers into the other tables. Linkers are smart enough to assemble all the relative pointers.
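To make the table more concrete, here is a minimal sketch of how one import directory entry and one hint/name entry could be serialized. It assumes the on-disk layout from the PE/COFF spec (five little-endian 32-bit fields per directory entry, and a 2-byte hint followed by a NUL-terminated name padded to an even length). The type and function names are hypothetical, and the RVAs are left for the linker to fill in, as described above.

```rust
/// One entry of the import directory table; 20 bytes on disk.
/// Field names are descriptive stand-ins, not the official ones.
#[derive(Default, Clone, Copy)]
struct ImportDirectoryEntry {
    import_lookup_table_rva: u32,
    time_date_stamp: u32,
    forwarder_chain: u32,
    dll_name_rva: u32,
    import_address_table_rva: u32,
}

impl ImportDirectoryEntry {
    /// Serialize the five fields as little-endian 32-bit values.
    fn to_bytes(&self) -> [u8; 20] {
        let fields = [
            self.import_lookup_table_rva,
            self.time_date_stamp,
            self.forwarder_chain,
            self.dll_name_rva,
            self.import_address_table_rva,
        ];
        let mut out = [0u8; 20];
        for (i, field) in fields.iter().enumerate() {
            out[i * 4..i * 4 + 4].copy_from_slice(&field.to_le_bytes());
        }
        out
    }
}

/// One hint/name table entry: 2-byte hint, NUL-terminated ASCII name,
/// plus one padding byte if needed to keep the entry length even.
fn hint_name_entry(hint: u16, name: &str) -> Vec<u8> {
    let mut out = hint.to_le_bytes().to_vec();
    out.extend_from_slice(name.as_bytes());
    out.push(0);
    if out.len() % 2 != 0 {
        out.push(0);
    }
    out
}
```

An all-zero entry (`ImportDirectoryEntry::default().to_bytes()`) is the null descriptor that terminates the directory table.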
As for who's implementing it: I stopped contributing patches to the compiler years ago, so it's better that someone else does it. I hope that with my instructions and @retep998's bounty there will be some movement on this. Of course I'm around to answer questions.
that has been invaluable so far, thank you
Just as a note about prior art, Apple stopped shipping .dylibs in their SDKs a while ago in favor of "TBD files" which are just YAML descriptions of shared libraries with enough information to link against them. Unfortunately they have zero actual documentation about them available, but the Swift compiler seems to support both generating and linking against them: https://github.com/apple/swift/blob/5d8af8ccf55f3d3dd7ad55dd8fbdc97d15c5b6bf/test/TBD/linking-with-tbd.swift https://github.com/apple/swift/blob/5d8af8ccf55f3d3dd7ad55dd8fbdc97d15c5b6bf/lib/FrontendTool/TBD.cpp
Those "TBD files" are just the YAML equivalent of import libraries on Windows and are still separate files from the source code, so it isn't that novel.
For more prior art, Go uses comments. Windows example, Mac OS example.
More prior art: C# has attributes similar to Rust.
The issue description lists an item "Implementation for Linux and other platforms". The RFC doesn't really talk about it. Linux also doesn't support mapping undefined symbols to specific shared objects. Is the intention to let you specify a symbol version on Linux? Or to add a DT_NEEDED item to the dynamic section without needing the corresponding shared object on your filesystem?
> Or to add a DT_NEEDED item to the dynamic section without needing the corresponding shared object on your filesystem?
This would actually be useful to me, I'm now generating a fake shared object to link against: https://github.com/fortanix/rust-sgx/commit/b9c29351b1291eb133e2a54d179784445b9ce665#diff-f52ac431696817278bbfebda2cffffd3
@jethrogb The latter; `raw-dylib` on Linux should allow linking to a specified shared library without having it available at build time. That would simplify building some kinds of -sys crates.
This is unrelated to symbol versioning. In the future, we should have better ways of handling versioned symbols (both exporting and importing), but that doesn't need to happen right away, and it shouldn't block this.
One other benefit of raw-dylib on Linux: it would enable linking to the vDSO, which works like a shared library but doesn't exist on the filesystem to link against.
Yes that's exactly what I'm talking about in https://github.com/rust-lang/rust/issues/58713#issuecomment-703095092.
@jethrogb Ah, sorry, I hadn't followed that link. Yes, that's the exact case I was thinking of. :)
Hello, @joshtriplett and @retep998 . I'm a Windows developer (as in, working on Windows itself, at Microsoft), and I'd like to help advance this RFC.
Is there an existing prototype implementation?
The spec seems to cover all of our requirements. The main thing that is missing is how (or whether) to support delay-binding imports. PE/COFF has tables that describe how to represent delay load import tables (https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#delay-load-import-tables-image-only). It would be great to cover that in the RFC, even if it isn't implemented in a first version of the support for this. I'd be happy to write up text for handling delay-loads, and submit a PR for the RFC.
hey @sivadeilra - i have a draft pr here, but we were unsure that writing the object files directly was the way to go - i was thinking of revamping the pr to generate "short import" .lib files and passing those to the linker back in august, but i've never got around to it
there's more discussion on this over on the zulip topic:
Since this issue is referencing my (old) example, I figure I should post a long-overdue correction to one of my statements, which is repeated here.
In my example, and related communication, I implied that the idata section is part of the PE format, and is the only thing necessary to define DLL imports. This is inaccurate. Windows itself does not, in fact, attribute any special meaning to the idata section. Rather, a pointer within the PE header points at the start of the import table. In practice, this is almost always the start of the idata section, but there's no particular reason why it couldn't be in the middle of the code section.
The real importance of the idata section, is that both MSVC's and MinGW's linkers will detect its presence in object files, and automatically set the PE header's import table pointer to the start of that section, if it is present. This is the reason why that pointer almost universally points at the start of the idata section. The (very) unfortunate (and ironic) exception to this, however, is LLD.
Disclaimer: The following information was gathered years ago and could very well be outdated. More recent testing is necessary.
LLD does not give special treatment to idata sections inside object files. Instead, it has a special code path for handling files with a "lib" extension. When such a file is provided as input, LLD checks a few markers to see if it's an import library. If it passes the test, LLD actually decompiles the idata section inside it as an import table, into its own internal data structure. After going over all the files, LLD recompiles the import data it gathered into its own original idata section, and sets the import table pointer to that section. If LLD's internal import data structure is empty, it simply sets the import table pointer to NULL, even if an idata section exists in the object code. When I looked at its code (several years ago), this was the only way to set the import table pointer in the PE header.
Obviously, this method is horribly convoluted, and hopefully someone on the LLD dev team has realized this and corrected it in the years that have passed since I last checked. If not, better late than never.
In the meantime, a usable workaround is to produce a lib file instead of an obj file, essentially generating custom import libraries, similar to the dll tools available in the MSVC toolchain. I haven't tested this, and it's been years since I've looked at LLD's code, but it might be as simple as renaming the "obj" file to "lib". Do note that if LLD does detect that a "lib" file is an import library, it will discard any code within it, so the import table and code have to be separated into different files.
This makes me wonder: how does gollvm do it? Do they generate custom idata sections? They support linking with various linkers iirc.
Also. The Zig programming language can cross-compile. It might be worth looking at that.
I'm looking into implementing this support, but I'm still very much investigating and learning about the problem. Do folks mind if I assign this issue to myself while I do, or should I wait until I'm closer to implementation?
@rustbot claim
@ricobbe ive been working on a lot of it, and id love to help out if i can - ive tried a bunch of different approaches, and never quite got a working solution
id love to chat about it if possible
Just failing with a compile error if on non windows platforms seems best for now – adding support for whatever “raw_dylib” means on Linux / mac can happen later
On Tue, Mar 16, 2021 at 04:27:40PM -0700, Richard Cobbe wrote:
> I think I'm closing in on an implementation for the simple case (Windows-only, no support for `#[link_name]` or `#[link_ordinal]`), and I don't see significant obstacles to supporting those two attributes.
>
> I do have one question that I don't see addressed in the RFC, however: should rustc allow `#[link(.., kind = "raw-dylib")]` on non-Windows platforms? Or is the assumption that we'll also implement Linux support as described in the "Future possibilities" section of the RFC, making this question moot? (And if so, what about MacOS?) I'd like to get support for this feature on Windows complete without gating it behind Linux and MacOS (as indicated by the RFC), as this unblocks other things we're working on, but I'm willing to discuss this point.
Implementing Windows first seems fine; that's the platform that needs it most.
Current state of this work:

- `windows` crate under a feature flag

We believe the work has progressed far enough for the `winapi` crate to begin using it under a feature flag.
As the work so far provides a lot of functionality for Rust devs trying to interoperate with the Windows API, we'd like to split this off as a separate feature and begin stabilization. There's obviously some additional work that needs to happen on our end before this is possible, primarily ensuring good test coverage and documenting the work completed so far in the unstable book. I'll start working on those items today/early next week.
the major missing feature is ordinal imports, right?
> the major missing feature is ordinal imports, right?
That's correct, yes.
It works marvelously. Ship it! 😎
@ricobbe I remember the PR which added initial support only added support for the proprietary msvc toolchain. Does it work with the GNU toolchain yet? Does it work with LLD?
@est31 it works for `windows-gnu` when using LLD but doesn't work with the default BFD linker.
PR for addition to unstable book: #87315
@retep998
Not sure if you've been following this issue, so I just wanted to make sure you were aware that "raw-dylib" should be usable in the winapi crate, to get some real-world experience with the feature. I'd love to hear about any issues you encounter!
I have a question about one of the requirements in the RFC, in the second paragraph of the "Motivation" section. Specifically, which of the two options below is the use case we care about?

1. A crate needs to call function `f` from library A.DLL and function `g` from library B.DLL. However, B.DLL _also_ exports a function named `f`, and we need to avoid confusion during linking.
2. A crate needs to call function `f` from library A.DLL _and_ function `f` from library B.DLL.

IIUC, the MSVC toolchain doesn't support use case 2 even when working entirely in C, although I haven't verified that yet myself. Figured I'd clarify the requirements before digging into that in more detail.
@ricobbe This RFC supports both situations via the `#[link_ordinal]` and `#[link_name]` attributes. So, to answer your question,

    mod all_imports_together {
        #[link(name = "a.dll", kind = "raw-dylib")]
        extern "system" {
            // Imported as `f` from a.dll, visible to Rust code as `a_f`.
            #[link_name = "f"]
            fn a_f() -> i32;
        }
        #[link(name = "b.dll", kind = "raw-dylib")]
        extern "system" {
            // Imported as `f` from b.dll, visible to Rust code as `b_f`.
            #[link_name = "f"]
            fn b_f() -> i32;
        }
    }
However, it will be more realistic to put different imports into different modules ¯\_(ツ)_/¯
    mod a {
        #[link(name = "a.dll", kind = "raw-dylib")]
        extern "system" {
            pub fn f() -> i32;
        }
    }
    mod b {
        #[link(name = "b.dll", kind = "raw-dylib")]
        extern "system" {
            pub fn f() -> i32;
            pub fn g() -> i32;
        }
    }
    fn main() {
        // Calling foreign functions is unsafe.
        unsafe {
            a::f();
            b::f();
        }
    }
Similarly, in Delphi it is separated on a language level as well:

    function FunctionName(): Result; stdcall; external 'name.dll' index 1;
    function MessageBox(
      hWnd: HWND;
      lpText: PWideChar;
      lpCaption: PWideChar;
      uType: UINT
    ): Integer; stdcall; external 'user32.dll' name 'MessageBoxW';

As you see, they support both ordinal and (re)named imports.
> I have a question about one of the requirements in the RFC, in the second paragraph of the "Motivation" section. Specifically, which of the two options below is the use case we care about?
>
> 1. A crate needs to call function `f` from library A.DLL and function `g` from library B.DLL. However, B.DLL _also_ exports a function named `f`, and we need to avoid confusion during linking.
> 2. A crate needs to call function `f` from library A.DLL _and_ function `f` from library B.DLL.
>
> IIUC, the MSVC toolchain doesn't support use case 2 even when working entirely in C, although I haven't verified that yet myself. Figured I'd clarify the requirements before digging into that in more detail.
Use case 2 should be supportable with this feature as specified in the RFC. Import libraries are a mapping from a symbol name to a symbol name or ordinal plus a dll (`HashMap<Symbol, (SymbolOrOrdinal, Dll)>`). Normally in the C world this would look like `"_foo" -> ("foo", "bar.dll")` or similar, however in Rust we can make the key a mangled Rust symbol! As a result we can have two unique keys for two symbols, that resolve to symbols with the same name but different dlls!
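The mapping described here can be modelled directly. This is only an illustrative sketch: the type names mirror the ones in the comment, while the mangled-looking keys, the ordinal value and the DLL names are made up for the example and are not what rustc actually emits.

```rust
use std::collections::HashMap;

/// What the import resolves to inside the DLL: a name or an ordinal.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SymbolOrOrdinal {
    Name(String),
    Ordinal(u16),
}

/// Key: the symbol the object code refers to. Value: (import, DLL).
type ImportMap = HashMap<String, (SymbolOrOrdinal, String)>;

fn example_import_map() -> ImportMap {
    let mut map = ImportMap::new();
    // Two distinct (hypothetical) mangled keys that both resolve to `f`,
    // but in different DLLs.
    map.insert(
        "_ZN1a1fE".to_string(),
        (SymbolOrOrdinal::Name("f".to_string()), "a.dll".to_string()),
    );
    map.insert(
        "_ZN1b1fE".to_string(),
        (SymbolOrOrdinal::Name("f".to_string()), "b.dll".to_string()),
    );
    // An import by (hypothetical) ordinal fits the same shape.
    map.insert(
        "_ZN1b1gE".to_string(),
        (SymbolOrOrdinal::Ordinal(7), "b.dll".to_string()),
    );
    map
}
```

Because the keys are unique, nothing in this model prevents two entries from naming the same export in different DLLs; whether a given linker honours that is a separate question.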
> Use case 2 should be supportable with this feature as specified in the RFC. Import libraries are a mapping from a symbol name to a symbol name or ordinal plus a dll (`HashMap<Symbol, (SymbolOrOrdinal, Dll)>`). Normally in the C world this would look like `"_foo" -> ("foo", "bar.dll")` or similar, however in Rust we can make the key a mangled Rust symbol! As a result we can have two unique keys for two symbols, that resolve to symbols with the same name but different dlls!
It's not clear that the MSVC linker supports this in practice. I tried something very close to the second example from @pravic's most recent comment (omitting only the declaration of `g`); the implementation of the two functions `f` in the various DLLs basically just printed `"f from A.DLL"` and `"f from B.DLL"`. While I expected to see A.DLL's f and B.DLL's f called once each, in that order, I actually saw A.DLL's f called twice.
I'd like to try an equivalent example written entirely in C, to determine if the error is in rustc or MSVC. However, I don't know of a way to declare two separate functions, both named `f`, that come from different libraries, without getting a "function redefined" error. I don't think MSVC has anything equivalent to the `#[link_name]` attribute, but please let me know if I'm wrong.
If you'd like to take a look at what I tried, in case I omitted anything, you can find it here on my github fork.
It might be possible to achieve the desired results by using `#[link_ordinal]`, but we haven't implemented that functionality yet.
@ricobbe I believe if you create the import tables in LLVM similar to the example I gave in #30027, then link with MSVC's linker, it should work. The example can easily be extended to two DLLs simply by adding another descriptor to the `.idata$2` section, before the null descriptor. It should also work with MinGW's linker. However, I would repeat the above note that, based on my previous test, it does not work with LLVM's native linker. What's more, assuming LLVM's native linker still treats libraries the same way, it probably couldn't work, as it wouldn't allow custom name mangling.
P.S. It might be worthwhile to use a hex editor, or perhaps a small utility script, to read the PE header on the final executable, and see that the import table looks right. That's how I figured out LLVM's linker was leaving the import table pointer as null, instead of pointing it to the `.idata` section.
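The check from the P.S. can be done without a hex editor. What follows is a minimal sketch, assuming the image has already been read into memory; `import_directory` is a hypothetical helper name. The offsets follow the PE/COFF layout: `e_lfanew` at 0x3C, a 4-byte signature, a 20-byte COFF header, then data directories at offset 96 (PE32) or 112 (PE32+) into the optional header, with the import table at index 1.

```rust
/// Returns (RVA, size) of the import table data directory,
/// or None if the buffer does not look like a PE image.
fn import_directory(image: &[u8]) -> Option<(u32, u32)> {
    // Bounds-checked little-endian readers.
    let le16 = |off: usize| -> Option<u16> {
        let b = image.get(off..off + 2)?;
        Some(u16::from_le_bytes([b[0], b[1]]))
    };
    let le32 = |off: usize| -> Option<u32> {
        let b = image.get(off..off + 4)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    };

    // DOS header: "MZ" magic, offset of the PE signature at 0x3C.
    if !image.starts_with(b"MZ") {
        return None;
    }
    let pe = le32(0x3C)? as usize;
    if !image.get(pe..)?.starts_with(b"PE\0\0") {
        return None;
    }

    // Optional header follows the signature and the 20-byte COFF header.
    let opt = pe + 4 + 20;
    let dirs = match le16(opt)? {
        0x10b => opt + 96,  // PE32
        0x20b => opt + 112, // PE32+
        _ => return None,
    };

    // Each data directory is 8 bytes; index 1 is the import table.
    let entry = dirs + 8;
    Some((le32(entry)?, le32(entry + 4)?))
}
```

Feeding it the bytes of a linked executable (for example from `std::fs::read`) and getting `Some((0, 0))` back would correspond to the null import table pointer described above.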
Is this functionality (multiple libraries that export the same symbol) necessary in order to stabilize the work as already completed?
I'd argue that, even without this functionality, current support for raw-dylib enables a large number of important use cases, such as interoperability with the Windows API. Also, since it doesn't appear that C and C++ even allow programmers to write programs that attempt to use symbols of the same name from different libraries, wouldn't the current implementation be sufficient in many cases when interoperating with DLLs intended for use by C and C++ programs?
The conversation around stabilization seems to have stalled. Is there anything outstanding that would prevent moving the functionality that's already been implemented into stabilization?
@joshtriplett
As a side note: it doesn't look like I'm going to be able to devote much time or energy towards addressing the remaining functionality, as we believe that the work done so far handles our use case (interoperability between Rust and the Windows APIs). Would anyone mind if I removed myself as the person assigned to this issue?
And to be explicit about our commitment: we do expect issues with the completed work to arise during stabilization as we get real-world experience working with the feature, and I certainly intend to take the lead on helping to resolve those. It's just a question of who will drive implementing the remaining functionality in the RFC.
Right now, this feature is only partially supported on the `*-windows-gnu` target family, which has several targets listed as tier 1. In order to use the feature, users of those targets have to use the LLD linker, which is not the default. LLD is really awesome, but the manual step needed to use LLD would make the setup story for users of those targets harder. Maybe this can be resolved by progressing on the LLD rollout plan #39915 for `*-windows-gnu` targets?
There is also a lack of `*-windows-gnu` based tests. I can't find a test that checks that raw-dylib works with the `*-windows-gnu` target together with LLD. For example, this test still has `only-windows-msvc`. There is a test for a warning for the gnu targets that says the feature doesn't work there. Is it turned off when LLD is in use?
If I had to name an issue blocking stabilization, it would be better `*-windows-gnu` support:

- Tests that `*-windows-gnu` works with the feature when using LLD
- LLD as the default `*-windows-gnu` linker (#39915) OR better support in the BFD linker (unlikely, but mentioning that this is also an option). At the very least there should be good docs for how to set LLD as the linker.

@est31
> Right now, this feature is only partially supported on the `*-windows-gnu` target family, which has several targets listed as tier 1. In order to use the feature, users of those targets have use the LLD linker, which is not the default.
The raw-dylib feature as currently implemented deliberately does not support the `*-windows-gnu` targets, and so the only relevant test is a UI test to ensure that we issue a diagnostic upon encountering raw-dylib on those targets.
I did explore the possibility of using LLD, but @mati865 informed me (in a private communication on April 29) that LLD was not supported in automated testing for rustc. I'm deeply reluctant to claim to support a feature that cannot be tested in automation. If that restriction is no longer in effect, then I'm willing to investigate using it to support this functionality.
I agree that officially supporting something without tests isn't good. As for LLD support in the test suite, I wonder how it's solved for wasm targets, because IIRC there, only LLD is being used. Eventually as rustc migrates to LLD, this problem has to be solved anyways, no?
> Eventually as rustc migrates to LLD, this problem has to be solved anyways, no?
I assume so, yeah. I think the important question for now, however, is whether solving the LLD-in-automation problem should be required in order to stabilize this work for the windows-msvc platforms.
LLD as the default windows-gnu linker has a big question mark next to it: BFD. LLD works fine with all kinds of import libraries but emits only one type, the one that BFD has issues with. That means we cannot guarantee that dynamic libraries will be linkable by the platform's default linker when defaulting to LLD.
BTW I hope to add purely LLVM based mingw-w64 targets to Rust but haven't got the time to work on upstreaming it (x86_64 has been already well tested within MSYS2). It could act as a confirmation that LLD works fine for this feature.
Has anyone already looked into this:

> Implementation of a pure Rust target for Windows (no libc, no msvc, no mingw). This may require another RFC
I would imagine that we have to create a new target, e.g. `x86_64-pc-windows-raw`, and for that target link against all of the Win32 functions in std with `kind="raw-dylib"`.
We then probably also have to default to rust-lld and implement everything that happens before main, e.g. `exe_common.inl` and the rest of the vcruntime, ourselves.
Is there something I forgot?
I did parts of this, while building a toy no_std raw-dylib Hello World which compiles without problems under linux (without mingw or msvc, just pure Rust).
Status update: as before, we'd love to see this feature taken up by the winapi and windows-rs crates, to get some real-world experience with it. Since my last message, I've learned that many users of these crates run the Rust toolchain on Linux to produce Windows binaries, and that the windows-gnu target is crucial for these users. So I'm trying to understand exactly why the current implementation fails for windows-gnu builds, and what it would take to fix this.
What I know so far:

- `src/test/run-make/raw-dylib-c` (tweaked slightly to run on windows-gnu) causes an access violation upon trying to call one of the functions imported from a raw-dylib DLL.

I'm fairly new to dealing with libraries and linking at this low level of abstraction, so figuring out exactly what's going on here is likely to take me some time, but I am continuing to work on it.
I'm a little confused by the last comment, as the big reason you want windows-gnu to build on Linux is to avoid a dependency on the non-redistributable Windows SDK import libraries, the very thing this would resolve - if you copy those files over to Linux and set linker flavor to lld-link you can build windows-msvc fine on Linux. Sure, you could also have issues with native code, but that's not something Rust can fix.
That said, it's not like I'm against having it working on windows-gnu!
This is the tracking issue for RFC 2627, `#[link(kind="raw-dylib")]`.

Note: `raw-dylib` and `link_ordinal` are now stabilized in 1.65 on Windows for x86_64, aarch64, and thumbv7a (not 32-bit x86) via #99916.

Opens:

- `link_ordinal` attribute (#89025)
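For reference, a minimal use of the stabilized attribute might look like the sketch below. It assumes a Windows target covered by the stabilization note and imports `GetCurrentProcessId` from `kernel32`; the `current_pid` wrapper and its non-Windows fallback are only there so the example also builds on other platforms and are not part of the feature.

```rust
// Only compiled on Windows; no import library is needed at link time.
#[cfg(windows)]
#[link(name = "kernel32", kind = "raw-dylib")]
extern "system" {
    fn GetCurrentProcessId() -> u32;
}

/// Hypothetical wrapper: uses the raw-dylib import on Windows.
#[cfg(windows)]
fn current_pid() -> u32 {
    // Calling a foreign function is unsafe.
    unsafe { GetCurrentProcessId() }
}

/// Fallback so the sketch also builds on non-Windows targets.
#[cfg(not(windows))]
fn current_pid() -> u32 {
    std::process::id()
}
```

Gating the extern block with `#[cfg(windows)]` keeps the crate buildable on targets where `raw-dylib` is not supported.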