rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.4k stars 12.73k forks source link

Ability from the top-level of the compilation not to mark #[no_mangle] items exported from shared library #73958

Open hsivonen opened 4 years ago

hsivonen commented 4 years ago

Firefox currently builds all its Rust code into a static artifact called gkrust which is then statically linked, with cross-language LTO, with C++ artifacts to form the shippable libxul.so/dylib/dll.

Upon examining the exported symbols of libxul, the FFI functions that are meant for internal glue between Rust and C++ show up.

This is a problem for two reasons:

  1. It's a binary size issue, because cross-language LTO is supposed to inline the FFI functions into their callers. However, having them exported means also keeping those copies around. Also, unused FFI functions can't be eliminated as dead code.
  2. It gives problematic third-party software more opportunities to hook into libxul in unsupported ways.

Since there's a variety of third-party crates that go into libxul, it wouldn't be practical to reannotate each FFI function. Therefore, please add a way from the top level of the Rust compilation to turn off the shared library export metadata generation for #[no_mangle] items.

RReverser commented 4 years ago

Ran into this today again, because I defined my own malloc-like allocator function and added extern "C" for the calling convention (it's meant to be passed as a callback over FFI), but since I called it malloc, it ended up being a linker error due to a clash with an actual "malloc".

For now, I can try to prefix it with something reasonably unique, but I'd really rather have private functions stay private and not have to worry about mangling their names manually.

bjorn3 commented 4 years ago

#[no_mangle] is for when your functions are not private.

RReverser commented 4 years ago

@bjorn3 Right, the situation is a bit different from this issue / my original comment it refers to, but the point is still the same in that #[no_mangle] shouldn't affect privacy but only name generation.

bjorn3 commented 4 years ago

Mangling is the way to achieve privacy. While ELF and other object formats technically have different visibility modes, those are less useful for Rust than for C. Rust splits up a single crate into multiple codegen units, so any function used in a different codegen unit from the one defining it has to have an external linkage, making it available to all object files used in the linker invocation, thus your malloc would affect all crates not just your own. In addition there is the concept of visbility in ELF, which allows you to hide symbols from the linked library, but this is still not invisible to other crates. If you want to override the memory allocator for other crates, then you should use #[global_allocator]. If you only override malloc for Rust code, that will cause inconsistencies where an explicit malloc is used in C/Rust and then free on the other end of ffi.

RReverser commented 4 years ago

which allows you to hide symbols from the linked library, but this is still not invisible to other crates

This is all I want (as well as what @hsivonen describes). I want to be able to expose functions to the C library that is statically linked in as part of build.rs, but I don't want them to end up being publicly visible out of my final artifact.

bjorn3 commented 4 years ago

If you only override malloc for Rust code, that will cause inconsistencies where an explicit malloc is used in C/Rust and then free on the other end of ffi.

This problem extends to overriding it for both Rust code and statically linked C code without also overriding it for dynamically linked C code like libc.

luser commented 4 years ago

To speak to what we did in C/C++ for Firefox, we used the -fvisibility=hidden compiler flag to default all symbols to hidden visibility, and then explicitly annotated public symbols with __attribute__ ((visibility ("default"))). We also implemented a system to generate wrappers for system headers whose declarations do not contain visibility annotations. The wrapper headers boil down to basically:

#pragma GCC visibility push(hidden)
#include_next <real_header>
#pragma GCC visibility pop

To @hsivonen's point—the Rust code in Firefox needs to provide a C API to the C/C++ code for FFI purposes, and #[no_mangle] is a part of that. Separating the concerns of name mangling and symbol visibility sounds useful so that in situations like this where the Rust compiler is only producing an intermediate static library that gets linked into the final binary, developers can fine-tune the behavior they need. Additionally, providing solutions that allow overriding behavior is useful because otherwise it's one more thing you need to audit in your dependency graph, and crate authors can't all be expected to understand the nuance here.

bjorn3 commented 4 years ago

I think this is basically what #[rustc_std_internal_symbol] does.

Manishearth commented 2 years ago

It seems like a useful path forward would be to split #[no_mangle] into #[no_mangle(private)] and #[export_symbol] (alternatively, make #[used] apply to functions), and have #[no_mangle] imply both.

Something similar can be done with #[export_name], it feels strange that symbol renaming is only available via methods that export the symbol.

But also I think there should be a codegen flag for this.

bjorn3 commented 2 years ago

Isn't #[no_mangle(private)] basically #[rustc_std_internal_symbol]? It doesn't export the symbol from cdylib's while still disabling symbol mangling.

By the way why are you using #[no_mangle] if you don't want the symbol to be exported from the cdylib? The standard library needs this to break some cyclic dependencies, but I don't see any reason why regular user code needs it.

hsivonen commented 2 years ago

By the way why are you using #[no_mangle] if you don't want the symbol to be exported from the cdylib?

  1. In the Firefox case, the Rust code is compiled into a static lib that gets linked into a larger dylib that also contains C++ code. The Rust code needs to be visible to the C++ code during the static linking step but shouldn't remain visible from the combined dylib.
  2. #[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.
bjorn3 commented 2 years ago

In the Firefox case, the Rust code is compiled into a static lib that gets linked into a larger dylib that also contains C++ code. The Rust code needs to be visible to the C++ code during the static linking step but shouldn't remain visible from the combined dylib.

For this rustc can't do anything as it doesn't control the linker invocation. If it links a cdylib itself it passes a list of all symbols to export to the linker. You did have to emulate this in firefox.

[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.

I did recommend putting the part of the library that exports the C api that wraps the rust api in a different crate from the one exporting the rust api. If you don't want the C api to be exported you can then simply depend only on the crate with the rust api.

hsivonen commented 2 years ago

[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.

I did recommend putting the part of the library that exports the C api that wraps the rust api in a different crate from the one exporting the rust api. If you don't want the C api to be exported you can then simply depend only on the crate with the rust api.

This doesn't solve the problem that this issue is about: hiding symbols in the resulting shared library formed by linking a static lib generated by Rust compilation with some other object code.

E.g. encoding_rs and encoding_c are a Rust crate and its C API crate. Since Firefox needs the C API, it builds encoding_c as part of its omnibus Rust artifact. Once that's linked with C++ object code to form libxul, having the C API be visible from outside libxul is unwanted.

bjorn3 commented 2 years ago

Once that's linked with C++ object code to form libxul, having the C API be visible from outside libxul is unwanted.

The only way to hide those symbols when linking a rust staticlib into a dylib is to pass the linker a list of symbols to hide. This is not something rustc has any control over. If rustc were to do the linking it does this for you, but in the case of libxul rustc doesn't do the linking and as such the build system that invokes the linker is responsible for making sure the symbols are hidden.

luser commented 2 years ago

My previous comment here explained the Firefox solution for C/C++, where we encountered similar issues with third-party code.

bjorn3 commented 2 years ago

Right, hidden visibility doesn't work for rust as we don't yet know in advance if a crate ends up in a rust dylib or cdylib. In the former case hidden visibility is incorrect as a downstream crate may use a function marked with hidden visibility (for example because another function got inlined), while for a cdylib this is not the case. Having a global compiler flag that disables linking as rust dylib only solves part of the issue as the standard library will still not get hidden visibility and libstd is compiled as rust dylib too, so trying to use it with -Zbuild-std will error out.

sffc commented 2 years ago

The other issue, as raised in https://github.com/rust-lang/rust/issues/104130, is that we don't want to decide on which symbols to actually link in the cdylib until link time. The alternative is to remove unused APIs via cfgs or similar, but this doesn't scale when there are 500 exported symbols (like in ICU4X). A better workflow is to pass the list of desired symbols to export directly to the linker, but this only works if the symbols have non-default visibility.

bjorn3 commented 2 years ago

A better workflow is to pass the list of desired symbols to export directly to the linker, but this only works if the symbols have non-default visibility.

Rustc passes the list of symbol to link to the linker and this works just fine despite using default visibility. Rustc basically says make everything hidden except this specific list of symbols.

https://github.com/rust-lang/rust/blob/0aaad9e757207657064d70bf9e6c1e6eb327bf15/compiler/rustc_codegen_ssa/src/back/linker.rs#L694-L709

sffc commented 2 years ago

I can't speak for ld, but for wasm-ld I've found as in #104130 that even if I filter the list of exported symbols passed to the linker, all explicitly no_mangle symbols are still found and re-exported by the linker. The mechanics for this are explained in the wasm-ld docs page.

Manishearth commented 2 years ago

@bjorn3 This does not work fine, see #104130, where it's necessary to be able to configure that.

I feel like this issue has had sufficient people with different real-world use cases justifying the need for this, can we move past questioning the need? A lot of these use cases come from complex projects with complex properties, and one thing I've noticed from my years of working on large codebases is that almost always when a large codebase wants feature X, people will suggest a laundry list of things to try instead, and they either won't work or suffer from "I bet that almost works" syndrome

I think the stdlib point is a good one and I suspect https://github.com/rust-lang/rust/issues/104130 paired with a solution here is a good way to fix that.

bjorn3 commented 2 years ago

I can't speak for ld, but for wasm-ld I've found as in https://github.com/rust-lang/rust/issues/104130 that even if I filter the list of exported symbols passed to the linker, all explicitly no_mangle symbols are still found and re-exported by the linker. The mechanics for this are explained in the wasm-ld docs page.

That is a wasm specific bug where we fail to hide symbols that are not in the rustc generated list of symbols to be exported. On Windows, macOS and any ELF based system the rustc can pass a specific list and the linker respects this without linking anything else AFAIK.

@bjorn3 This does not work fine, see https://github.com/rust-lang/rust/issues/104130, where it's necessary to be able to configure that.

That is a different use case from the firefox one. In #104130 rustc is invoking the linker and proving an option when compiling the cdylib to only export specified symbols is realistic, albeit requiring a wasm-ld change to allow rustc to hide all symbols except those passed in through --export given that even wasm-strip+wasm-opt can't reclaim the extra size, I am now convinced that having this option for cdylib crates is fine.

The firefox use case is not something rustc can fix as it doesn't control the linker.

I feel like this issue has had sufficient people with different real-world use cases justifying the need for this, can we move past questioning the need?

I don't question the need, but I think that rustc can't provide the solution for the staticlib crate type, only for the cdylib crate type. For the staticlib crate type I think the only one that can provide a full solution is the thing that invokes the linker by specifying an explicit list of symbols to export. Sure rustc could provide a partial solution using hidden visibility for user crates, but if I understand the post you linked correctly, that is an "I Bet That Almost Works" solution.

Manishearth commented 2 years ago

That is a different use case from the firefox one

Yes, I understand, I'm highlighting that there are different use cases for the same feature from varied backgrounds.

Manishearth commented 2 years ago

, I am now convinced that having this option for cdylib crates is fine.

Thanks! Do you have ideas as to what is the best path forward here? @sffc can correct me if I'm wrong on this, but we can either have:

It seems like you're leaning towards option 3?

bjorn3 commented 2 years ago

I think wasm-ld should get a flag that makes it default to hiding all symbols and rustc should use this unconditionally even if none of the suggested flags about controlling #[no_mangle] exports are used. This makes rustc in control of the list of symbols to be exported by cdylib's just like it is on every other platform.

In addition we should get a way to allow the user to control the list of symbols exported from a cdylib that works on any platform. I'm not sure if it should be in the form of an allowlist, in the form of a denylist or if both should be supported. I think implementing it in the form of an allowlist would be best as it avoids accidentally exporting symbols you didn't mean to even as you update dependencies, but you probably know better what works than me.

Manishearth commented 2 years ago

I guess the wasm-ld stuff occurs upstream, right?

bjorn3 commented 2 years ago

Indeed. It is part of LLVM.

sffc commented 2 years ago

I found some new info:

https://github.com/rust-lang/rust/blob/e75aab045fc476f176a58c408f6b06f0e275c6e1/compiler/rustc_target/src/spec/wasm32_unknown_unknown.rs#L41

rustc is passing --export-dynamic to wasm-ld:

--export-dynamic: When building an executable, export any non-hidden symbols. By default only the entry point and any symbols marked as exports (either via the command line or via the export-name source attribute) are exported.

If I manually remove the --export-dynamic from the wasm-ld invocation, then I get my expected behavior where I only get the symbols listed in --export arguments.

Related: https://github.com/rust-lang/rust/commit/ed5614599a1758db1ae412d8cc4af5887de18fe1

Manishearth commented 2 years ago

Hmm, ideally that flag would be manually set via link-args but given that it's automatically set perhaps we need another flag to turn it off?

sffc commented 2 years ago

~So there is --no-export-dynamic, but it seems that the presence of --export-dynamic earlier in the call chain takes precedence.~ (EDIT: actually this works fine; --no-export-dynamic can turn off a previous --export-dynamic.) However, it doesn't really address the issue that in Rust we really should have a way to stop all the --export arguments from being added automatically.

If rustc changed such that I could turn off both the --export and the --export-dynamic flags, then I think this issue could be resolved

RReverser commented 2 years ago

then I think this issue could be resolved

You're probably referring to https://github.com/rust-lang/rust/issues/104130 not to this issue, as it wouldn't be resolved by Wasm fix alone.

sffc commented 2 years ago

then I think this issue could be resolved

You're probably referring to #104130 not to this issue, as it wouldn't be resolved by Wasm fix alone.

Indeed.

danakj commented 1 year ago

It would seem that any FFI bindings tool is going to end up putting all its FFI functions into the exported symbol list due to this conflated use of #[no_mangle] causing export.

We're using CXX in Chromium, and it does the following:

We don't want to export all FFI functions from our chrome.dll, or our Mac framework. We can technically use an export list to keep them from being exported on Mac, but Windows linker does not provide the same affordance, so things are much more complicated there.

So we really need FFI functions to not be part of the chrome.dll export list. Right now it looks like this, but with many many more.

        145   90 0AA21FD0 cxxbridge1$std$weak_ptr$f64$clone = cxxbridge1$std$weak_ptr$f64$clone
        146   91 0AA22010 cxxbridge1$std$weak_ptr$f64$downgrade = cxxbridge1$std$weak_ptr$f64$downgrade 
        147   92 0AA220C0 cxxbridge1$std$weak_ptr$f64$drop = cxxbridge1$std$weak_ptr$f64$drop
        148   93 0AA21FB0 cxxbridge1$std$weak_ptr$f64$null = cxxbridge1$std$weak_ptr$f64$null
        149   94 0AA22050 cxxbridge1$std$weak_ptr$f64$upgrade = cxxbridge1$std$weak_ptr$f64$upgrade     
        150   95 0AA21390 cxxbridge1$std$weak_ptr$i16$clone = cxxbridge1$std$weak_ptr$i16$clone
        151   96 0AA213D0 cxxbridge1$std$weak_ptr$i16$downgrade = cxxbridge1$std$weak_ptr$i16$downgrade 
        152   97 0AA21480 cxxbridge1$std$weak_ptr$i16$drop = cxxbridge1$std$weak_ptr$i16$drop
        153   98 0AA21370 cxxbridge1$std$weak_ptr$i16$null = cxxbridge1$std$weak_ptr$i16$null
        154   99 0AA21410 cxxbridge1$std$weak_ptr$i16$upgrade = cxxbridge1$std$weak_ptr$i16$upgrade     
        155   9A 0AA216A0 cxxbridge1$std$weak_ptr$i32$clone = cxxbridge1$std$weak_ptr$i32$clone
        156   9B 0AA216E0 cxxbridge1$std$weak_ptr$i32$downgrade = cxxbridge1$std$weak_ptr$i32$downgrade 
        157   9C 0AA21790 cxxbridge1$std$weak_ptr$i32$drop = cxxbridge1$std$weak_ptr$i32$drop
        158   9D 0AA21680 cxxbridge1$std$weak_ptr$i32$null = cxxbridge1$std$weak_ptr$i32$null
        159   9E 0AA21720 cxxbridge1$std$weak_ptr$i32$upgrade = cxxbridge1$std$weak_ptr$i32$upgrade     
        160   9F 0AA219B0 cxxbridge1$std$weak_ptr$i64$clone = cxxbridge1$std$weak_ptr$i64$clone
        161   A0 0AA219F0 cxxbridge1$std$weak_ptr$i64$downgrade = cxxbridge1$std$weak_ptr$i64$downgrade 
        162   A1 0AA21AA0 cxxbridge1$std$weak_ptr$i64$drop = cxxbridge1$std$weak_ptr$i64$drop
        163   A2 0AA21990 cxxbridge1$std$weak_ptr$i64$null = cxxbridge1$std$weak_ptr$i64$null
        164   A3 0AA21A30 cxxbridge1$std$weak_ptr$i64$upgrade = cxxbridge1$std$weak_ptr$i64$upgrade
        165   A4 0AA21080 cxxbridge1$std$weak_ptr$i8$clone = cxxbridge1$std$weak_ptr$i8$clone
        166   A5 0AA210C0 cxxbridge1$std$weak_ptr$i8$downgrade = cxxbridge1$std$weak_ptr$i8$downgrade   
        167   A6 0AA21170 cxxbridge1$std$weak_ptr$i8$drop = cxxbridge1$std$weak_ptr$i8$drop
        168   A7 0AA21060 cxxbridge1$std$weak_ptr$i8$null = cxxbridge1$std$weak_ptr$i8$null
        169   A8 0AA21100 cxxbridge1$std$weak_ptr$i8$upgrade = cxxbridge1$std$weak_ptr$i8$upgrade       
        170   A9 0AA22640 cxxbridge1$std$weak_ptr$isize$clone = cxxbridge1$std$weak_ptr$isize$clone     
        171   AA 0AA22680 cxxbridge1$std$weak_ptr$isize$downgrade = cxxbridge1$std$weak_ptr$isize$downgrade
        172   AB 0AA226D0 cxxbridge1$std$weak_ptr$isize$drop = cxxbridge1$std$weak_ptr$isize$drop       
        173   AC 0AA22620 cxxbridge1$std$weak_ptr$isize$null = cxxbridge1$std$weak_ptr$isize$null

For clarity, there is a separate issue on Linux/Mac of all Rust symbols ending up in the DLL export list, but these are not #[no_mangle] so I think that should be tracked elsewhere (https://github.com/rust-lang/rust/issues/37530 ?).

Chromium bug for this, which blocks Windows: https://bugs.chromium.org/p/chromium/issues/detail?id=1462356

amyspark commented 8 months ago

Hi all, I found that this is a big issue when attempting to build the pixbufloader_svg component of librsvg: https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/925

For the time being, I'm relying on manually link_whole ing the library into a shared version with a list of exports. (Alternatively, for macOS, single-object prelinking can be done to the static library, or stripping for Linux.)

anforowicz commented 1 week ago

To speak to what we did in C/C++ for Firefox, we used the -fvisibility=hidden compiler flag to default all symbols to hidden visibility

Chromium does something similar for C++. And after https://crrev.com/c/5966273 Chromium gives Rust similar treatment by using -Z default-visibility=hidden (see https://github.com/rust-lang/rust/pull/118417 and https://github.com/rust-lang/rust/pull/131519).

So it seems that this issue will be fixed if we can agree on and implement #[no_mangle(private)] (proposed above/earlier when discussing this issue). If so, then I guess the next step would be to open an RFC with such proposal? (I couldn't find an existing RFC. I assume that an MCP wouldn't work here, because this seems to be an end-user-facing change that is not just a new compiler flag.)

bjorn3 commented 1 week ago

For the standard library we have #[rustc_std_internal_symbol] to prevent symbols from leaking out of cdylibs. This intentionally doesn't use hidden visibility however as we do want to export them from dylibs in case the caller and callee are in different rust dylibs. Instead we are using a version script with for cdylib omits all symbols marked as #[rustc_std_internal_symbol] and in case of dylib does include them as exported. It also doesn't work with staticlibs (https://github.com/rust-lang/rust/issues/104707).