rust-lang / rust

Allow export of Rust symbols from a C shared object via a staticlib #73295

Open adetaylor opened 4 years ago

adetaylor commented 4 years ago

clang offers the ability to mark a symbol as exported or non-exported from a dynamic shared object (DSO) using __attribute__((visibility ("default"))) and __attribute__ ((visibility ("hidden"))). This issue requests the same in Rust, though it may be a bit more complex in the Rust case.

Another way of looking at it is to split the dual purpose of #[no_mangle], which currently both sets visibility and alters the exported name.

The rest of this issue explains why this would be useful.

I am working in a pre-existing complex C++ codebase. It produces multiple DSOs, let’s say:

We want to build Rust code into each of those DSOs. Internally, the C++ and Rust code is intermixed freely (for example, the C++ JSON APIs within libbase.so call into a serde-based parser, which calls back into C++ to instantiate objects, etc.) There is no possibility of splitting these DSOs into separate DSOs for Rust and C++ code.

We build Rust code into these DSOs in the approved way, which is to aggregate a bunch of Rust libraries (rlibs) into a separate Rust staticlib for each of the DSOs. (For example, libbase_rust_deps.a and libservices_rust_deps.a). The final C++ linker links exactly one of these staticlibs together with the C++ .a and .o in the final construction of the DSO. That should be fine, since the whole point of a staticlib Rust target is to contain the Rust code and all its dependencies.

libservices.so depends on libbase.so. In C++, libservices.so uses symbols exposed from libbase.so. For example, there may be a CPP::Log function used by C++ code within both libbase.so and libservices.so. That’ll be exported from libbase.so using an __attribute__((visibility ("default"))) annotation in the source code for CPP::Log.

So far so good.

But now we want to put a Rust wrapper around that CPP::Log function, to make, say, rs_log. Ideally, there would be just a single implementation of rs_log in libbase.so. We want to be able to call that Rust wrapper from within the Rust parts of libbase.so or libservices.so.

This seems to be impossible.

Our options are:

(this latter case gives

error[E0463]: can't find crate for `rust_base`
  --> ./rust_static_libs/libservices_rust_deps/mod.rs:20:1
   |
20 | extern crate rust_base;
   | ^^^^^^^^^^^^^^^^^^^^^^^ can't find crate

)

Solution?

It’s a problem that #[no_mangle] has twin effects: specify a particular name for C linkage purposes, and mark the symbol for DSO export. We want to do the latter but not the former. https://github.com/rust-lang/rust/issues/54135#issuecomment-607935953 wants to do the former but not the latter. Can we separate those?
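A minimal sketch of the coupling described above, assuming a toy `rs_log` that only clamps a severity value (the function bodies and the `rs_log_internal` name are illustrative and not taken from the codebase under discussion). It uses the `#[unsafe(no_mangle)]` spelling accepted since Rust 1.82; on older toolchains the same attribute is written `#[no_mangle]`.

```rust
/// Ordinary Rust function: the compiler picks a mangled symbol name.
/// Whether it is exported from a DSO depends on the crate type and linker setup.
pub fn rs_log_internal(severity: u32) -> u32 {
    // Toy behaviour: clamp the severity to a maximum of 3.
    severity.min(3)
}

/// `no_mangle` pins the symbol name to exactly `rs_log` and, as the issue
/// describes, also marks the symbol for DSO export. There is currently no
/// attribute that requests only one of those two effects.
#[unsafe(no_mangle)]
pub extern "C" fn rs_log(severity: u32) -> u32 {
    rs_log_internal(severity)
}
```

The request in this issue amounts to being able to export `rs_log_internal` from the DSO while keeping its mangled name.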

For our needs, ideally we’d mark rs_log with something like an #[dso_export] annotation. This would:

Related work

https://github.com/rust-lang/rust/issues/54135 requests something similar, but using #[used] on static variables, so I think it’s not quite the same thing. https://github.com/dtolnay/cxx/issues/219 proposes a workaround that we might need to use in the meantime, but it isn’t ideal because, in the above example, RS::Log would be duplicated between two DSOs.

dtolnay commented 4 years ago

FYI @japaric @alexcrichton @retep998 in case you have a suggestion as the linkage experts.

Bottom line:

> It’s a problem that #[no_mangle] has twin effects: specify a particular name for C linkage purposes, and mark the symbol for DSO export. We want to do the latter but not the former. https://github.com/rust-lang/rust/issues/54135#issuecomment-607935953 wants to do the former but not the latter. Can we separate those?

retep998 commented 4 years ago

If you want a shared library to be usable as a Rust crate, it has to be built using the dylib crate type, because there is a lot of Rust-specific metadata that gets bundled into the dylib. This is fundamentally at odds with compiling the Rust code into a staticlib and then linking that into a shared library. The dylib crate type is intended solely for Rust and its plugin model, so it is unlikely to be expanded in this manner. Because linking against your shared library as a Rust crate is out of the question, there's also no need to preserve Rust name mangling, so your concerns with #[no_mangle] are irrelevant.

The supported method is to export a function using the normal stable C ABI and then link against it as you would any other normal C ABI function. Using Rust types in that function signature will work if and only if you ensure that both sides are Rust and that the Rust types come from the exact same crate (as the Rust ABI is unstable, even a simple rebuild can change things).
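The approach described here can be sketched as follows, with two modules standing in for the Rust code in libbase.so and libservices.so. The names `rs_base_log`, `base`, and `services` are hypothetical, and in a real build the two halves would live in separate DSOs rather than one crate. Only C-compatible scalar types cross the boundary. The `#[unsafe(no_mangle)]` and `unsafe extern` spellings need Rust 1.82 or later; older toolchains use `#[no_mangle]` and a plain `extern "C"` block.

```rust
/// Stand-in for the Rust code linked into libbase.so.
pub mod base {
    /// Exported with the C ABI under the fixed symbol name `rs_base_log`.
    /// Toy behaviour: return `code` if logging is enabled, otherwise 0.
    #[unsafe(no_mangle)]
    pub extern "C" fn rs_base_log(level: u32, code: i32) -> i32 {
        if level > 0 { code } else { 0 }
    }
}

/// Stand-in for the Rust code linked into libservices.so.
pub mod services {
    // Declare the function exactly as any other C ABI import would be declared.
    // The signature must match the definition; the compiler cannot check that
    // across DSO boundaries.
    unsafe extern "C" {
        fn rs_base_log(level: u32, code: i32) -> i32;
    }

    /// Safe wrapper so the rest of the services code never touches `unsafe`.
    pub fn log(level: u32, code: i32) -> i32 {
        // SAFETY: the declaration above matches the definition in `base`,
        // and only plain scalars are passed.
        unsafe { rs_base_log(level, code) }
    }
}
```

Because the symbol is resolved by name at link time, this works whether the definition sits in the same binary or in another shared object that the final link step can see.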

alexcrichton commented 4 years ago

Controlling the visibility of a symbol is definitely something that's desired in Rust, but AFAIK it's just something no one's had enough time to sit down and design/RFC/etc. You might be able to play around on nightly with the #[linkage] attribute in one way or another.

The real problem here is that third bullet point, which is pretty nontrivial. The only thing I'd add is that in general it would only work for very simple datatypes. Even if the symbols were all hooked up you'd still have two Rust standard libraries linked in, and you can't, even with the same matching versions and all, pass datatypes between those two Rust standard libraries. There's some discussion of this on https://github.com/rust-lang/rust/pull/63338 and related issues.

Theoretically, to get this working you'll need to make sure all the Rust dependencies are shared. There are Rust crates (including libstd) used by the Rust code in both libbase.so and libservices.so. All those shared Rust crates need to be included exactly once and exported from libbase.so, which libservices.so would then use. That's just how this could work linkage-wise, though; I'm not really sure how this could be set up build-wise beyond hand-crafting all the rustc invocations with various flags.

adetaylor commented 4 years ago

Thank you very much for the comments, and yes I feared this would be difficult around that third bullet.

In our build system (which is not cargo-based) we could in theory ensure there's exactly one copy of each crate — in the example given this would mean ensuring that libstd is within libbase.so and only within libbase.so. But, to do this we would need to:

Even if we worked through the first three points, the binary size growth from the final point feels like it would be more than just duplicating our own Rust code into lots of different cdylibs. As such, I'm going to see if I can work around it instead.

Thanks again for the comments!

I'm happy to leave this issue open or close it, whichever's preferred.

hsivonen commented 4 years ago

For Gecko, we'd need the opposite: a way, from the top level of Rust compilation and without changing the Rust code, to make FFI symbols visible to the static link step that combines Rust and C++ into libxul, without leaving those symbols visible outside libxul.

Should this issue be expected to result in that case getting addressed, or should I open a new issue for that?

adetaylor commented 4 years ago

@hsivonen I think you want the same as @RReverser here: https://github.com/rust-lang/rust/issues/54135#issuecomment-607935953. (I'm not the best person to answer whether that should be in a new issue, I just wanted to make sure it was noted that you're not the only one to request that.)

hsivonen commented 4 years ago

Thanks. I filed a new issue.

RReverser commented 4 years ago

I believe both are essentially the same issue (a way to untangle linker visibility from no_mangle), but I guess no harm in tracking separately. Thanks!

adetaylor commented 2 years ago

Related: https://github.com/rust-lang/rust/issues/47384 https://github.com/rust-lang/rust/pull/91504 https://github.com/rust-lang/rust/issues/40289

ghost commented 2 years ago

For people looking for a way to export symbols from a staticlib: ld has --export-dynamic-symbol-list on Linux and -exported_symbols_list on macOS. The command I use on macOS is:

clang -bundle -exported_symbols_list symbols.list -o libmyproject.so libmyproject.a

This works with -shared too. I haven't tested on Linux, though.
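As a small illustration of what the `symbols.list` file in the command above might contain, here is a sketch of a helper that generates it. The helper name and the example symbol names are assumptions. The one platform detail it relies on is that C symbols on macOS carry a leading underscore at the linker level, so a function named `rs_log` is listed as `_rs_log`.

```rust
/// Build the text of a macOS `-exported_symbols_list` file:
/// one linker-level symbol name per line.
/// (Hypothetical helper, not part of any existing tool.)
pub fn exported_symbols_list(c_names: &[&str]) -> String {
    let mut out = String::new();
    for name in c_names {
        // Mach-O prefixes C symbol names with an underscore.
        out.push('_');
        out.push_str(name);
        out.push('\n');
    }
    out
}
```

A build script could write the returned string to `symbols.list` before invoking the link command.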

danakj commented 2 years ago

For Chrome's purposes, we will be linking rlibs (in the future, --emit=obj or similar via https://github.com/rust-lang/rust/issues/73632) together into a .so with clang. Thus no use of crate-type=staticlib. And marking a symbol as pub in Rust does appear to make the symbol visible in that .so file. It needs to be tested and confirmed on Windows though. Work on this is happening in https://crbug.com/1296156

danakj commented 1 year ago

Also related to controlling symbol visibility: https://github.com/rust-lang/rust/issues/33221 and https://github.com/rust-lang/rust/issues/37530

We're now at a point in Chromium where we have the opposite problem too, which the two issues above cover: all Rust symbols are exported when we don't want them to be (no_mangle or not).

We really need some symbol visibility control for shared libs that is independent of pub and no_mangle.

bjorn3 commented 1 year ago

Are you using the staticlib, dylib or cdylib crate type? Cdylib should only export #[no_mangle] symbols, dylib is expected to export all functions as it is meant to be used as regular dependency by other rust code. If you use staticlib see https://github.com/rust-lang/rust/issues/104707.

danakj commented 1 year ago

> Are you using the staticlib, dylib or cdylib crate type?

None of those: we are building rlibs and linking them all with C++ in the final link step to produce an exe. However, on Mac the final link step produces a Framework DLL, which then has too many symbols (all the Rust symbols are exported).

Aside: We also have a component build that splits the project into DLLs, and there are potentially Rust rlibs linked into each of those, which presents many other problems (essentially using Rust across cdylibs, but linked externally to Rust), to be addressed separately in the future.

bjorn3 commented 1 year ago

Would a version script solve the issue for frameworks? That is what rustc uses to limit exported symbols. Using symbol visibility hidden would break linking rlibs into rust dylibs, so we can't do that for the standard library.
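For readers unfamiliar with version scripts, the sketch below generates a minimal anonymous one for GNU-style linkers: the listed names stay global and everything else is made local. The helper name is hypothetical, and this is not a reproduction of what rustc itself emits, only an illustration of the file format. Such a file is typically passed to the linker with `-Wl,--version-script=FILE`.

```rust
/// Build a minimal anonymous linker version script that exports only the
/// given symbols and hides all others.
/// (Hypothetical helper for illustration.)
pub fn version_script(exported: &[&str]) -> String {
    let mut out = String::from("{\n  global:\n");
    for name in exported {
        out.push_str("    ");
        out.push_str(name);
        out.push_str(";\n");
    }
    // The wildcard under `local:` hides every symbol not listed above.
    out.push_str("  local:\n    *;\n};\n");
    out
}
```

Note that the list has to name symbols as they appear in the object files, so mangled Rust symbols would need their mangled names.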

For the component builds you can use -Cprefer-dynamic to make rustc prefer dynamically linking. This will cause libstd.so to be linked dynamically. In addition if there are any crates shared between the components you should probably either link those into a single rust dylib (you need to let rustc do the linking for this to get the required crate metadata that makes rustc recognize which crates are linked into the dylib) or compile every crate as rust dylib. Be aware however that when libstd.so is compiled with -Cpanic=unwind, all crates depending on it need to be panic=unwind too, so you may have to recompile the standard library with panic=abort if you don't already do this.

danakj commented 1 year ago

> Would a version script solve the issue for frameworks? That is what rustc uses to limit exported symbols. Using symbol visibility hidden would break linking rlibs into rust dylibs, so we can't do that for the standard library.

We can give this a try.

For what it's worth, I think we would prefer to never link rust dylibs since we'll have small bits of Rust appearing amongst C++ all over the codebase for some time, and we'd be fine to not export anything from the stdlib as a result. That said we don't care about binary size in this build mode, it's for developers.

> For the component builds you can use -Cprefer-dynamic to make rustc prefer dynamically linking. This will cause libstd.so to be linked dynamically. In addition if there are any crates shared between the components you should probably either link those into a single rust dylib (you need to let rustc do the linking for this to get the required crate metadata that makes rustc recognize which crates are linked into the dylib) or compile every crate as rust dylib. Be aware however that when libstd.so is compiled with -Cpanic=unwind, all crates depending on it need to be panic=unwind too, so you may have to recompile the standard library with panic=abort if you don't already do this.

We are already building our stdlib (with panic=abort). It's not uncommon for us to have some of our C++ code end up in multiple DLLs in our component build, and I would have expected Rust to be similar. Requiring every Rust library to be built as a dylib in this mode is an option I hadn't really considered as a result, since it's so dissimilar. But I don't see anything particularly wrong with that idea off the top of my head.

Thanks, I'll try things and come back.

danakj commented 1 year ago

A version script can work for Linux, an export list can work for Mac, but a DEF file does not work for Windows as it can only add exported symbols. It is possible that a much more complicated linking setup may work that generates EXP files for linking against, but fundamentally this is kinda broken, and we should not need to maintain a list of exported symbols on each platform to make this work.

After more investigation, there are different behaviours seen on Linux/Mac and Windows.

For Windows, we are only seeing #[no_mangle] symbols get exported, and this root cause is well-discussed already in https://github.com/rust-lang/rust/issues/73958 as well: That #[no_mangle] is a core tool for C/C++ FFI that also implies exported.

For Mac/Linux we are seeing all Rust symbols get exported. I don't know why Windows does not do this. I think that is https://github.com/rust-lang/rust/issues/37530?

danakj commented 11 months ago

This issue continues to be a problem for us in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1471542

Here the issue is another DLL that links some part of Chromium's C++ code, and now also links in some Rust code. The C++ code is compiled with -fvisibility=hidden on POSIX, so no symbols are exported unless explicitly requested. The Rust compiler has no such option currently, so everything is exported. This results in ASAN reporting ODR violations when the shared library is dlopen()ed.

On MSVC-Windows, the platform default is hidden, so symbols aren't exported there.

We would really like:

cc: @bridiver

danakj commented 11 months ago

https://github.com/rust-lang/compiler-team/issues/656 should address the first bullet here, but has yet to be implemented.

danakj commented 1 week ago

https://github.com/rust-lang/rust/pull/131519 has fixed default-hidden-visibility so that's great!

Applying it in Chromium we notice that we're still getting unexpected public symbols from Rust, but they are coming from symbols that are avoiding mangling (for reasons other than wanting to be public). But currently #[no_mangle] conflates the two things: naming and visibility. I am going to add this to the list above.

retep998 commented 1 week ago

I strongly agree that we should have a way to separate the notions of controlling mangling and controlling visibility.