rust-lang / rust


Tracking issue for RFC 2627: #[link(kind="raw-dylib")] #58713

Open joshtriplett opened 5 years ago

joshtriplett commented 5 years ago

This is the tracking issue for RFC 2627, #[link(kind="raw-dylib")].

Note: raw-dylib and link_ordinal are now stabilized in 1.65 on Windows for x86_64, aarch64, and thumbv7a (not 32-bit x86) via #99916.
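For readers landing here, the stabilized form can be sketched roughly as follows. This is a minimal illustration, not code from the RFC or the stabilization PR: `current_process_id` is a hypothetical helper, and the non-Windows fallback exists only so the snippet compiles on any host.

```rust
// Sketch: import one kernel32 function without an import library.

#[cfg(windows)]
mod sys {
    // No kernel32.lib is needed at build time; rustc generates the import
    // information itself. In the 2024 edition this block must be written
    // `unsafe extern "system"`.
    #[link(name = "kernel32", kind = "raw-dylib")]
    extern "system" {
        pub fn GetCurrentProcessId() -> u32;
    }
}

/// Returns the current process id via the raw-dylib import (Windows only).
#[cfg(windows)]
pub fn current_process_id() -> u32 {
    // SAFETY: GetCurrentProcessId takes no arguments and cannot fail.
    unsafe { sys::GetCurrentProcessId() }
}

/// Fallback (an assumption of this sketch) so the example builds elsewhere.
#[cfg(not(windows))]
pub fn current_process_id() -> u32 {
    std::process::id()
}
```

Calling `current_process_id()` from `main` and printing the result is enough to confirm that the import resolves at load time.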

Opens:

crlf0710 commented 5 years ago

cc https://github.com/rust-lang/rfcs/issues/1061

ZerothLaw commented 5 years ago

I don't think I have the expertise to implement this feature, but I could write some (currently failing) tests for use by the implementor?

(Background: I ran into this exact problem linking mscorlib.dll into executables, and had to write a C++-level wrapper to get it to work.)

crlf0710 commented 5 years ago

A very first PR has landed, now the link_ordinal attribute is added. Want to say thank you to @Centril for helping me on this~

retep998 commented 4 years ago

There's now a bounty for this issue to incentivize someone to work on implementing this feature.

https://www.bountysource.com/issues/70533351-tracking-issue-for-rfc-2627-link-kind-raw-dylib

est31 commented 4 years ago

Some information I've gathered:

To the best of my knowledge, these are the meanings of the .idata$* sections. See the spec/example for further understanding:

section | name in the MS spec | description
.idata$2 | Import directory table | Main list containing pointers into the other tables. Each entry in the list concerns one dll and contains pointers to the import address/lookup table and the dll name table
.idata$3 | Import Address Table | At compile time same as import lookup table. Just duplicate the output! Not sure whether it's actually .idata$5
.idata$4 | Import Lookup Table | Has pointers to the function names
.idata$6 | Hint/Name Table | Contains the names of the functions we want to import. Also contains export name table hints, which are NOT the same as ordinals and which we'll probably have to set to 0
.idata$7 | No name in the MS spec, let's call it DLL name table | Contains the names for the dll files

The main table is the import directory table which contains pointers into the other tables. Linkers are smart enough to assemble all the relative pointers.
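To make the layout a bit more concrete, here is a rough sketch of two of these records as described in the PE/COFF spec: one 20-byte import directory entry, and one hint/name table entry. The type and function names are made up for illustration, and real RVAs would be filled in by the linker, not by hand.

```rust
/// One entry of the import directory table (20 bytes, one per DLL).
/// Field order follows the PE/COFF spec; the Rust names are illustrative.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct ImportDirectoryEntry {
    pub import_lookup_table_rva: u32,
    pub time_date_stamp: u32,
    pub forwarder_chain: u32,
    pub name_rva: u32,
    pub import_address_table_rva: u32,
}

impl ImportDirectoryEntry {
    /// Serializes the entry as five little-endian u32 values.
    pub fn to_bytes(&self) -> [u8; 20] {
        let fields = [
            self.import_lookup_table_rva,
            self.time_date_stamp,
            self.forwarder_chain,
            self.name_rva,
            self.import_address_table_rva,
        ];
        let mut out = [0u8; 20];
        for (i, field) in fields.iter().enumerate() {
            out[i * 4..i * 4 + 4].copy_from_slice(&field.to_le_bytes());
        }
        out
    }
}

/// Encodes one hint/name table entry: a u16 hint, the NUL-terminated
/// ASCII name, and a padding byte if needed to reach an even length.
pub fn hint_name_entry(hint: u16, name: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(2 + name.len() + 2);
    out.extend_from_slice(&hint.to_le_bytes());
    out.extend_from_slice(name.as_bytes());
    out.push(0);
    if out.len() % 2 != 0 {
        out.push(0);
    }
    out
}
```

An all-zero `ImportDirectoryEntry::default()` doubles as the null entry that terminates the directory table.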

As for who's implementing it, I stopped contributing patches to the compiler years ago. Better that someone else does it. I hope that with my instructions and @retep998's bounty there will be some movement on this. Of course I'm around to answer questions.

tinaun commented 4 years ago

that has been invaluable so far, thank you

luser commented 4 years ago

Just as a note about prior art, Apple stopped shipping .dylibs in their SDKs a while ago in favor of "TBD files" which are just YAML descriptions of shared libraries with enough information to link against them. Unfortunately they have zero actual documentation about them available, but the Swift compiler seems to support both generating and linking against them: https://github.com/apple/swift/blob/5d8af8ccf55f3d3dd7ad55dd8fbdc97d15c5b6bf/test/TBD/linking-with-tbd.swift https://github.com/apple/swift/blob/5d8af8ccf55f3d3dd7ad55dd8fbdc97d15c5b6bf/lib/FrontendTool/TBD.cpp

retep998 commented 4 years ago

Those "TBD files" are just the YAML equivalent of import libraries on Windows and are still separate files from the source code, so it isn't that novel.

est31 commented 4 years ago

For more prior art, Go uses comments. Windows example, Mac OS example.

est31 commented 4 years ago

More prior art: C# has attributes similar to Rust.

jethrogb commented 4 years ago

The issue description lists an item "Implementation for Linux and other platforms". The RFC doesn't really talk about it. Linux also doesn't support mapping undefined symbols to specific shared objects. Is the intention to let you specify a symbol version on Linux? Or to add a DT_NEEDED item to the dynamic section without needing the corresponding shared object on your filesystem?

jethrogb commented 4 years ago

> Or to add a DT_NEEDED item to the dynamic section without needing the corresponding shared object on your filesystem?

This would actually be useful to me, I'm now generating a fake shared object to link against: https://github.com/fortanix/rust-sgx/commit/b9c29351b1291eb133e2a54d179784445b9ce665#diff-f52ac431696817278bbfebda2cffffd3

joshtriplett commented 4 years ago

@jethrogb The latter; raw-dylib on Linux should allow linking to a specified shared library without having it available at build time. That would simplify building some kinds of -sys crates.

This is unrelated to symbol versioning. In the future, we should have better ways of handling versioned symbols (both exporting and importing), but that doesn't need to happen right away, and it shouldn't block this.

joshtriplett commented 4 years ago

One other benefit of raw-dylib on Linux: it would enable linking to the vDSO, which works like a shared library but doesn't exist on the filesystem to link against.

jethrogb commented 4 years ago

Yes that's exactly what I'm talking about in https://github.com/rust-lang/rust/issues/58713#issuecomment-703095092.

joshtriplett commented 4 years ago

@jethrogb Ah, sorry, I hadn't followed that link. Yes, that's the exact case I was thinking of. :)

sivadeilra commented 4 years ago

Hello, @joshtriplett and @retep998 . I'm a Windows developer (as in, working on Windows itself, at Microsoft), and I'd like to help advance this RFC.

Is there an existing prototype implementation?

The spec seems to cover all of our requirements. The main thing that is missing is how (or whether) to support delay-binding imports. PE/COFF has tables that describe how to represent delay load import tables (https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#delay-load-import-tables-image-only). It would be great to cover that in the RFC, even if it isn't implemented in a first version of the support for this. I'd be happy to write up text for handling delay-loads, and submit a PR for the RFC.

tinaun commented 4 years ago

hey @sivadeilra - i have a draft pr here, but we were unsure that writing the object files directly was the way to go - i was thinking of revamping the pr to generate "short import" .lib files and passing those to the linker back in august, but i've never got around to it

tinaun commented 4 years ago

there's more discussion on this over on the zulip topic:

https://rust-lang.zulipchat.com/#narrow/stream/242869-t-compiler.2Fwindows/topic/Tracking.20raw_dylib.20progress.2E

SlugFiller commented 4 years ago

Since this issue is referencing my (old) example, I figure I should post a long-overdue correction to one of my statements, which is repeated here.

In my example, and related communication, I implied that the idata section is part of the PE format, and is the only thing necessary to define DLL imports. This is inaccurate. Windows itself does not, in fact, attribute any special meaning to the idata section. Rather, a pointer within the PE header points at the start of the import table. In practice, this is almost always the start of the idata section, but there's no particular reason why it couldn't be in the middle of the code section.

The real importance of the idata section is that both MSVC's and MinGW's linkers will detect its presence in object files and automatically set the PE header's import table pointer to the start of that section. This is the reason why that pointer almost universally points at the start of the idata section. The (very) unfortunate (and ironic) exception to this, however, is LLD.

Disclaimer: This following information was gathered years ago, and could very well be outdated. More recent testing is necessary.

LLD does not give special treatment to idata sections inside object files. Instead, it has a special code path for handling files with a "lib" extension. When such a file is provided as input, LLD checks a few markers to see if it's an import library. If it passes the test, LLD actually decompiles the idata section inside it as an import table, into its own internal data structure. After going over all the files, LLD recompiles the import data it gathered into its own original idata section, and sets the import table pointer to that section. If LLD's internal import data structure is empty, it simply sets the import table pointer to NULL, even if an idata section exists in the object code. At the time I looked at its code (several years ago), this was the only way to set the import table pointer in the PE header.

Obviously, this method is horribly convoluted, and hopefully, someone at the LLD dev team has realized this and corrected it in the years that have passed since I last checked. If not, better late than never.

In the meantime, a usable workaround is to produce a lib file instead of an obj file, essentially generating custom import libraries, similar to the dll tools available in the MSVC toolchain. I haven't tested this, and it's been years since I've looked at LLD's code, but it might be as simple as renaming the "obj" file to "lib". Do note that if LLD does detect that a "lib" file is an import library, it will discard any code within it, so the import table and code have to be separated into different files.

est31 commented 4 years ago

This makes me wonder: how does gollvm do it? Do they generate custom idata sections? They support linking with various linkers iirc.

Keithcat1 commented 3 years ago

Also, the Zig programming language can cross-compile. It might be worth looking at how it handles this.

ricobbe commented 3 years ago

I'm looking into implementing this support, but I'm still very much investigating and learning about the problem. Do folks mind if I assign this issue to myself while I do, or should I wait until I'm closer to implementation?

ricobbe commented 3 years ago

@rustbot claim

tinaun commented 3 years ago

@ricobbe ive been working on a lot of it, and id love to help out if i can - ive tried a bunch of different approaches, and never quite got a working solution

id love to chat about it if possible

tinaun commented 3 years ago

Just failing with a compile error if on non windows platforms seems best for now – adding support for whatever “raw_dylib” means on Linux / mac can happen later

joshtriplett commented 3 years ago

On Tue, Mar 16, 2021 at 04:27:40PM -0700, Richard Cobbe wrote:

> I think I'm closing in on an implementation for the simple case (Windows-only, no support for #[link_name] or #[link_ordinal]), and I don't see significant obstacles to supporting those two attributes.
>
> I do have one question that I don't see addressed in the RFC, however: should rustc allow #[link(.., kind = "raw-dylib")] on non-Windows platforms? Or is the assumption that we'll also implement Linux support as described in the "Future possibilities" section of the RFC, making this question moot? (And if so, what about MacOS?) I'd like to get support for this feature on Windows complete without gating it behind Linux and MacOS (as indicated by the RFC), as this unblocks other things we're working on, but I'm willing to discuss this point.

Implementing Windows first seems fine; that's the platform that needs it most.

ricobbe commented 3 years ago

Current state of this work:

We believe the work has progressed far enough for the winapi crate to begin using it under a feature flag.

As the work so far provides a lot of functionality for Rust devs trying to interoperate with the Windows API, we'd like to split this off as a separate feature and begin stabilization. There's obviously some additional work that needs to happen on our end before this is possible, primarily ensuring good test coverage and documenting the work completed so far in the unstable book. I'll start working on those items today/early next week.

tinaun commented 3 years ago

the major missing feature is ordinal imports, right?

ricobbe commented 3 years ago

> the major missing feature is ordinal imports, right?

That's correct, yes.

kennykerr commented 3 years ago

It works marvelously. Ship it! 😎

est31 commented 3 years ago

@ricobbe I remember the PR which added initial support only added support for the proprietary msvc toolchain. Does it work with the GNU toolchain yet? Does it work with LLD?

mati865 commented 3 years ago

@est31 it works for windows-gnu when using LLD but doesn't work with the default BFD linker.

ricobbe commented 3 years ago

PR for addition to unstable book: #87315

ricobbe commented 3 years ago

@retep998

Not sure if you've been following this issue, so I just wanted to make sure you were aware that "raw-dylib" should be usable in the winapi crate, to get some real-world experience with the feature. I'd love to hear about any issues you encounter!

ricobbe commented 3 years ago

I have a question about one of the requirements in the RFC, in the second paragraph of the "Motivation" section. Specifically, which of the two options below is the use case we care about?

  1. A crate needs to call function f from library A.DLL and function g from library B.DLL. However, B.DLL also exports a function named f, and we need to avoid confusion during linking.
  2. A crate needs to call function f from library A.DLL and function f from library B.DLL.

IIUC, the MSVC toolchain doesn't support use case 2 even when working entirely in C, although I haven't verified that yet myself. Figured I'd clarify the requirements before digging into that in more detail.

pravic commented 3 years ago

@ricobbe This RFC supports both situations via #[link_ordinal] and #[link_name] attributes. So, to answer your question,

mod all_imports_together {
    #[link(name = "a.dll", kind = "raw-dylib")]
    extern "system" {
        #[link_name = "f"]
        fn a_f() -> i32;
    }

    #[link(name = "b.dll", kind = "raw-dylib")]
    extern "system" {
        #[link_name = "f"]
        fn b_f() -> i32;
    }
}

However, it will be more realistic to put different imports into different modules ¯\_(ツ)_/¯

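mod a {
    #[link(name = "a.dll", kind = "raw-dylib")]
    extern "system" {
        // `pub` so that `main` can reach the import from outside the module.
        pub fn f() -> i32;
    }
}

mod b {
    #[link(name = "b.dll", kind = "raw-dylib")]
    extern "system" {
        pub fn f() -> i32;
        pub fn g() -> i32;
    }
}

fn main() {
    // Calling foreign functions is unsafe.
    unsafe {
        a::f();
        b::f();
    }
}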

Similar, in Delphi it is separated on a language level as well:

function FunctionName(): Result; stdcall; external 'name.dll' index 1;

function MessageBox(
    hWnd: HWND; 
    lpText: PWideChar;
    lpCaption: PWideChar; 
    uType: UINT
 ): Integer; stdcall; external 'user32.dll' name 'MessageBoxW';

As you see, they support both ordinal and (re)named imports.

retep998 commented 3 years ago

> I have a question about one of the requirements in the RFC, in the second paragraph of the "Motivation" section. Specifically, which of the two options below is the use case we care about?
>
> 1. A crate needs to call function `f` from library A.DLL and function `g` from library B.DLL. However, B.DLL _also_ exports a function named `f`, and we need to avoid confusion during linking.
> 2. A crate needs to call function `f` from library A.DLL _and_ function `f` from library B.DLL.
>
> IIUC, the MSVC toolchain doesn't support use case 2 even when working entirely in C, although I haven't verified that yet myself. Figured I'd clarify the requirements before digging into that in more detail.

Use case 2 should be supportable with this feature as specified in the RFC. Import libraries are a mapping from a symbol name to a symbol name or ordinal plus a dll (HashMap<Symbol, (SymbolOrOrdinal, Dll)>). Normally in the C world this would look like "_foo", -> ("foo", "bar.dll") or similar, however in Rust we can make the key a mangled Rust symbol! As a result we can have two unique keys for two symbols, that resolve to symbols with the same name but different dlls!
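As a rough sketch of that mapping (purely conceptual, not how rustc or any linker actually stores it), the two same-named imports simply live under different keys. The mangled names below are invented placeholders.

```rust
use std::collections::HashMap;

/// What an import resolves to inside the DLL: a name or an ordinal.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SymbolOrOrdinal {
    Name(String),
    Ordinal(u16),
}

/// Conceptual view of an import library:
/// linker-visible symbol -> (name or ordinal in the DLL, DLL file name).
pub type ImportMap = HashMap<String, (SymbolOrOrdinal, String)>;

/// Builds a map where two distinct keys both resolve to an export named `f`,
/// but in different DLLs. The keys are hypothetical mangled Rust symbols.
pub fn example_imports() -> ImportMap {
    let mut map = ImportMap::new();
    map.insert(
        "_ZN1a1fE".to_string(),
        (SymbolOrOrdinal::Name("f".to_string()), "a.dll".to_string()),
    );
    map.insert(
        "_ZN1b1fE".to_string(),
        (SymbolOrOrdinal::Name("f".to_string()), "b.dll".to_string()),
    );
    map
}
```

Because the keys differ, nothing in the map itself forces the two `f` imports to collide; whether a given linker preserves that distinction is a separate question.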

ricobbe commented 3 years ago

> Use case 2 should be supportable with this feature as specified in the RFC. Import libraries are a mapping from a symbol name to a symbol name or ordinal plus a dll (HashMap<Symbol, (SymbolOrOrdinal, Dll)>). Normally in the C world this would look like "_foo", -> ("foo", "bar.dll") or similar, however in Rust we can make the key a mangled Rust symbol! As a result we can have two unique keys for two symbols, that resolve to symbols with the same name but different dlls!

It's not clear that the MSVC linker supports this in practice. I tried something very close to the second example from @pravic's most recent comment (omitting only the declaration of g); the implementation of the two functions f in the various DLLs basically just printed "f from A.DLL" and "f from B.DLL". While I expected to see A.DLL's f and B.DLL's f called once each, in that order, I actually saw A.DLL's f called twice.

I'd like to try an equivalent example written entirely in C, to determine if the error is in rustc or MSVC. However, I don't know of a way to declare two separate functions, both named f, that come from different libraries, without getting a "function redefined" error. I don't think MSVC has anything equivalent to the #[link_name] attribute, but please let me know if I'm wrong.

If you'd like to take a look at what I tried, in case I omitted anything, you can find it here on my github fork.

It might be possible to achieve the desired results by using #[link_ordinal], but we haven't implemented that functionality yet.

SlugFiller commented 3 years ago

@ricobbe I believe if you create the import tables in LLVM similar to the example I gave in #30027, then link with MSVC's linker, it should work. The example can easily be extended to two DLLs simply by adding another descriptor to the .idata$2 section, before the null descriptor. It should also work with MinGW's linker. However, I would repeat the above note that, based on my previous test, it does not work with LLVM's native linker. What's more, assuming LLVM's native linker still treats libraries the same way, it probably couldn't work, as it wouldn't allow custom name mangling.

P.S. It might be worthwhile to use a hex editor, or perhaps a small utility script, to read the PE header on the final executable, and see that the import table looks right. That's how I figured out LLVM's linker was leaving the import table pointer as null, instead of pointing it to the .idata section.
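A small utility along those lines could look roughly like the sketch below. It only reads the import table entry (data directory index 1) from the optional header, using the offsets in the PE/COFF spec; `import_directory` is a hypothetical name, and the function does not walk the table itself.

```rust
/// Returns (RVA, size) of the import directory recorded in a PE image's
/// optional header, or None if the bytes do not look like a PE32/PE32+ file.
pub fn import_directory(image: &[u8]) -> Option<(u32, u32)> {
    let u16_at = |off: usize| -> Option<u16> {
        let b = image.get(off..off.checked_add(2)?)?;
        Some(u16::from_le_bytes([b[0], b[1]]))
    };
    let u32_at = |off: usize| -> Option<u32> {
        let b = image.get(off..off.checked_add(4)?)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    };

    // DOS header: "MZ" magic, then e_lfanew at offset 0x3C.
    if image.get(..2)? != &b"MZ"[..] {
        return None;
    }
    let pe = u32_at(0x3C)? as usize;
    // This check also proves `pe` is within the buffer, so the small
    // additions below cannot overflow.
    if image.get(pe..pe.checked_add(4)?)? != &b"PE\0\0"[..] {
        return None;
    }

    // Optional header starts after the 4-byte signature and 20-byte COFF header.
    let opt = pe + 24;
    let (count_off, dirs_off) = match u16_at(opt)? {
        0x10B => (opt + 92, opt + 96),   // PE32
        0x20B => (opt + 108, opt + 112), // PE32+
        _ => return None,
    };
    // The import table is data directory index 1, so at least 2 must exist.
    if u32_at(count_off)? < 2 {
        return None;
    }
    let entry = dirs_off + 8;
    Some((u32_at(entry)?, u32_at(entry + 4)?))
}
```

Feeding it the bytes from `std::fs::read` on the linked executable and getting `Some((0, 0))` back would correspond to the null import table pointer described above.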

ricobbe commented 3 years ago

Is this functionality (multiple libraries that export the same symbol) necessary in order to stabilize the work as already completed?

I'd argue that, even without this functionality, current support for raw-dylib enables a large number of important use cases, such as interoperability with the Windows API. Also, since it doesn't appear that C and C++ even allow programmers to write programs that attempt to use symbols of the same name from different libraries, wouldn't the current implementation be sufficient in many cases when interoperating with DLLs intended for use by C and C++ programs?

ricobbe commented 3 years ago

The conversation around stabilization seems to have stalled. Is there anything outstanding that would prevent moving the functionality that's already been implemented into stabilization?

@joshtriplett

As a side note: it doesn't look like I'm going to be able to devote much time or energy towards addressing the remaining functionality, as we believe that the work done so far handles our use case (interoperability between Rust and the Windows APIs). Would anyone mind if I removed myself as the person assigned to this issue?

And to be explicit about our commitment: we do expect issues with the completed work to arise during stabilization as we get real-world experience working with the feature, and I certainly intend to take the lead on helping to resolve those. It's just a question of who will drive implementing the remaining functionality in the RFC.

est31 commented 3 years ago

Right now, this feature is only partially supported on the *-windows-gnu target family, which has several targets listed as tier 1. In order to use the feature, users of those targets have to use the LLD linker, which is not the default. LLD is really awesome, but the manual step needed to use LLD would make the setup story for users of those targets harder. Maybe this can be resolved by progressing on the LLD rollout plan #39915 for *-windows-gnu targets?

There is also a lack of *-windows-gnu based tests. I can't find a test that checks that raw-dylib works with the *-windows-gnu target together with LLD. For example, this test still has only-windows-msvc. There is a test for a warning for the gnu targets that says the feature doesn't work there. Is it turned off when LLD is in use?

If I had to name an issue blocking stabilization, it would be better *-windows-gnu support:

ricobbe commented 3 years ago

@est31

> Right now, this feature is only partially supported on the *-windows-gnu target family, which has several targets listed as tier 1. In order to use the feature, users of those targets have to use the LLD linker, which is not the default.

The raw-dylib feature as currently implemented deliberately does not support the *-windows-gnu targets, and so the only relevant test is a UI test to ensure that we issue a diagnostic upon encountering raw-dylib on those targets.

I did explore the possibility of using LLD, but @mati865 informed me (in a private communication on April 29) that LLD was not supported in automated testing for rustc. I'm deeply reluctant to claim to support a feature that cannot be tested in automation. If that restriction is no longer in effect, then I'm willing to investigate using it to support this functionality.

est31 commented 3 years ago

I agree that officially supporting something without tests isn't good. As for LLD support in the test suite, I wonder how it's solved for wasm targets, because IIRC there, only LLD is being used. Eventually as rustc migrates to LLD, this problem has to be solved anyways, no?

ricobbe commented 3 years ago

> Eventually as rustc migrates to LLD, this problem has to be solved anyways, no?

I assume so, yeah. I think the important question for now, however, is whether solving the LLD-in-automation problem should be required in order to stabilize this work for the windows-msvc platforms.

mati865 commented 3 years ago

LLD as the default windows-gnu linker has a big question mark next to it: BFD. LLD works fine with all kinds of import libraries but emits only one type, the one that BFD has issues with. That means we cannot guarantee that dynamic libraries will be linkable by the platform's default linker when defaulting to LLD.

BTW I hope to add purely LLVM based mingw-w64 targets to Rust but haven't got the time to work on upstreaming it (x86_64 has been already well tested within MSYS2). It could act as a confirmation that LLD works fine for this feature.

clemenswasser commented 3 years ago

Has anyone already looked into this:

Implementation of a pure Rust target for Windows (no libc, no msvc, no mingw). This may require another RFC

I would imagine that we have to create a new target, e.g. x86_64-pc-windows-raw, and for that target link against all of the Win32 functions in std with kind="raw-dylib". We then probably also have to default to rust-lld and implement everything that happens before main, e.g. exe_common.inl and the rest of the vcruntime, ourselves. Is there something I forgot? I did parts of this while building a toy no_std raw-dylib Hello World, which compiles without problems under Linux (without mingw or msvc, just pure Rust).

ricobbe commented 3 years ago

Status update: as before, we'd love to see this feature taken up by the winapi and windows-rs crates, to get some real-world experience with it. Since my last message, I've learned that many users of these crates run the Rust toolchain on Linux to produce Windows binaries, and that the windows-gnu target is crucial for these users. So I'm trying to understand exactly why the current implementation fails for windows-gnu builds, and what it would take to fix this.

What I know so far:

I'm fairly new to dealing with libraries and linking at this low level of abstraction, so figuring out exactly what's going on here is likely to take me some time, but I am continuing to work on it.

simonbuchan commented 3 years ago

I'm a little confused by the last comment, as the big reason you want windows-gnu to build on Linux is to avoid a dependency on the non-redistributable Windows SDK import libraries, the very thing this would resolve - if you copy those files over to Linux and set linker flavor to lld-link you can build windows-msvc fine on Linux. Sure, you could also have issues with native code, but that's not something Rust can fix.

That said, it's not like I'm against having it working on windows-gnu!