bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Cross-Platform Control of Symbols Exported by a Shared Library #13285

Closed cpsauer closed 1 year ago

cpsauer commented 3 years ago

Description of feature request:

In going with Bazel's cross-platform goals and the general magic of semantically expressing the build you want and having it work across platforms:

It'd be awesome if cc_library rules had a cross-platform attribute for controlling which symbols are exported from a shared library, forming its binary interface.

I'm imagining the best interface would be an attribute on cc_library, where you specify a list of globs, and symbols are exported iff they match one of those globs. (more on why at the end)
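
To make the "exported iff it matches one of those globs" rule concrete, here is a minimal sketch of the matching semantics in Python. It is only a model: the real matching would be done by each platform's linker, whose glob dialect may differ in details from `fnmatch`, and the symbol names below are made up for illustration.

```python
from fnmatch import fnmatchcase


def exported(symbols, globs):
    """Return the symbols that match at least one glob, preserving order."""
    return [s for s in symbols if any(fnmatchcase(s, g) for g in globs)]


# Hypothetical symbol table of a library that statically links a dependency.
symbols = ["mylib_init", "mylib_shutdown", "helper_internal", "boost_detail_x"]

# Only the prefixed interface survives; everything else stays hidden.
public = exported(symbols, ["mylib_*"])
```

With an empty glob list nothing is exported, which matches the "hidden by default" behavior the request is after.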

Feature requests: what underlying problem are you trying to solve with this feature?

It's often important to control which symbols are exported from a shared library. In addition to defining a clean interface, it's important to let the linker know about the interface so it can strip out dead code. Also, it's easy to end up with symbol conflicts and nasty bugs if you leak a bunch of symbols and, e.g., accidentally (privately) statically depend on a different version of a library than another library in the same process does. Bad times.

I'm working on a cross-platform project (part of why Bazel is a good fit!) and wrote platform-specific code to implement this. From the issues, online discussion, and existing Windows DEF files support in cc rules, it looks like there's a solid amount of interest in being able to control the interface of shared libraries. This feature request is about making that functionality cross-platform instead of platform specific.

The visibility attribute (or DLL export type annotations) approach to controlling symbols typically fails, in my experience, because libraries you depend on (e.g. Boost) often set their own interfaces as default visible, which means you end up exporting their whole interface, with no easy fix. The idea of annotating code as "interface" isn't a great one anyway, since good code contains many nested interfaces. One man's interface is another's implementation. Better to succinctly describe your top-level interface.

Lists of symbols to export (or not export) seem to be the best way to succinctly describe that interface. Globs are important for handling mangled C++ symbols and namespacing (C++ or as C-style symbol prefixes). Apple platforms and Android/Linux have converged around different, but equivalent, ways of doing that (-exported_symbol vs --version-script ldscripts). Both are easily generated from a list of symbol globs. I don't know about Windows yet, but can imagine generating a DEF file from a glob over the symbols list? I feel pretty confident in the Apple/Android/Linux solution I wrote in Starlark and would be happy to share--less confident in my understanding of the Bazel native cc rules and Windows.
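
As a rough illustration of "both are easily generated from a list of symbol globs", the sketch below renders the same glob list two ways. It is not the Starlark from my project; the function names are hypothetical, and it assumes C-linkage symbols (hence the underscore Apple prepends). It sticks to plain string operations so the same logic could be ported to a Starlark macro.

```python
def version_script(globs):
    """Render a GNU ld / lld version script that exports only `globs`."""
    lines = ["{"]
    if globs:
        lines.append("  global: " + " ".join(g + ";" for g in globs))
    lines.append("  local: *;")  # everything not listed above is hidden
    lines.append("};")
    return "\n".join(lines) + "\n"


def apple_linkopts(globs):
    """Render Apple ld flags; C symbols carry a leading underscore there."""
    return ["-Wl,-exported_symbol,_" + g for g in globs]


script = version_script(["mylib_*", "other_entry"])
flags = apple_linkopts(["mylib_*", "other_entry"])
```

The version script would be passed with `-Wl,--version-script,<path>`, while the Apple flags go straight onto the link line.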

Thanks for your consideration, Chris

aiuto commented 3 years ago

I see more than one request here. The first is simply to have a cc_library shared library portably name symbols to export. That seems obviously useful. I think this could be enhanced by having direct dependents on such a library be able to further specify the symbols they require from the library - limited to a subset of the symbols exported. At link time, we would filter down to the union of symbols requested, so we could do a link time optimization of the .so based on what the application actually needs.
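
A small sketch of that filtering step, in Python for brevity. The function and the data shapes are hypothetical; it only models the set logic (union of requests, each request limited to what the library exports), not how Bazel would propagate the information.

```python
def symbols_to_keep(exported, requests):
    """Union of what dependents request, restricted to the exported set.

    `requests` maps a dependent's name to the symbols it says it needs.
    Raises ValueError if a dependent asks for something not exported.
    """
    exported = set(exported)
    keep = set()
    for dependent, wanted in requests.items():
        missing = set(wanted) - exported
        if missing:
            raise ValueError(
                "%s requests unexported symbols: %s" % (dependent, sorted(missing))
            )
        keep |= set(wanted)
    return sorted(keep)


kept = symbols_to_keep(
    ["db_open", "db_close", "db_exec"],
    {"//app:server": ["db_open", "db_close"], "//app:tool": ["db_open"]},
)
```

Here `db_exec` is exported by the library but requested by nobody, so a link-time pass could drop it.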

I've been thinking of a similar problem that I'm facing. We have some libraries (sqlite for example) that we build with different sets of defines for different application specific needs. The binary interface to the library is identical each time, but the capabilities of the SQL language provided vary. I can not use select() to change behavior by platform - it is only the direct users of the library which know what set of capabilities they need from behind a single ABI. It must be the union of all the required capabilities through every path to the library that determines, at link time, the specific implementation.

cpsauer commented 3 years ago

Very interesting! Thanks for thinking carefully about it, @aiuto.

I think you might be able to get the first enhancement you mention basically for free given symbol name control. Different libraries could declare different subsets of the interface to export and (since the command line flags for linux/mac are chain-able and propagated to the linker) you could depend on the subsets you need and have everything work as you'd like.

The define-based capabilities part seems thornier, as you say. And propagating the union needs up the dependency graph at its core

cpsauer commented 3 years ago

@nicmcd, I saw your post in my email, and then not, here, so I'm assuming you deleted it. Did you get things working okay?

[Googled your name to find your Github username. Cool stuff you're working on!]

nicmcd commented 3 years ago

@cpsauer, yes I got it figured out. I've slowly been making a library of BUILD files that take open source software and bazelize them (https://github.com/nicmcd/pkgbuild). numactl required some extra hacking :). An example of using it is here: https://github.com/nicmcd/numaex. I sent you a linkedin request. Let me know if you want in on any of the fun!

oquenchil commented 3 years ago

We have an experimental cc_shared_library implementation here.

It was a decision during the design process for neither the rule nor Bazel in any other way to get involved in control of symbols like you describe. It is very difficult for Bazel to get this right in a cross-platform way without the right tools and the tooling in fact does not exist. If in the future, linkers offer a way to do this, we'd be happy to hook it up to the cc_shared_library implementation.

For now, there is nothing to do here.

cpsauer commented 3 years ago

Hey @oquenchil! Thanks for reading and thinking about it.

But hmm, I'd love to learn a little more about what you guys ran into that made you conclude it couldn't be done. I feel pretty confident that linkers do standardly offer a way to do this, having just written a macro to do it for all the platforms we use--and helped a couple others get Bazel to do the same! Didn't take that long, and there seems to be a real need here, per the conversations with @aiuto, @nicmcd, and others on other threads. Plus, it looks like Bazel is already involved in doing it for Windows with the Windows-specific DEF file parameter mentioned above. Is there something I'm missing? Or something you guys ran into down the line that blocked it?

[Re cc_shared_library: I'm assuming from reading the source that it's both an output and an internal rule to produce shared native libraries for the platform. I'm having a little trouble grokking exactly what exports_filter does without docs or comments, but is that what you were pointing me towards? Could this be used inside of platform-specific rules, like an android_binary, ios_framework, etc, or would we need something on cc_library for that? Thanks for linking it though; I had no idea cc_shared_library existed!]

oquenchil commented 3 years ago

Maybe we are not referring to the same thing. What we concluded was out of scope for Bazel without the right tools is accepting unmangled symbol names via a rule's attribute, which then, depending on the platform, get used to write proper version scripts, def files and so on. As far as I know the mangled names are not standardized and I couldn't find tools to do this.

Another problem was the granularity: if you want to list targets instead of specific symbols, you might want to say "export all the symbols from this cc_library", but quite often that is wrong. You have to carefully look at what your ABI should be and not indiscriminately export all the symbols from every object file in the target. Automatically writing a version script that gets all of that right is not trivial as far as I understand.

But I'm definitely curious about what you have come up with. If you have a rule that is writing these files correctly depending on the platform, then you can try cc_shared_library and the linkopts + additional_linker_inputs attributes with selects() to pick the right file produced by your rule. If this looks promising, it can eventually be merged.

Regarding every mention of exports in cc_shared_library, keep in mind that Bazel is not affecting the symbols exported at all. The exports attribute is used so that each library owner announces which other targets it exports. This is then used later to give errors when, for example, two shared libraries are exporting the same target. It is left as the responsibility of the library owner to get the actual symbol exports in the linked artifact right by using whatever visibility mechanism their linker provides.

cc_shared_library can be depended on from a cc_binary. I can imagine it being used by platform specific rules as well once the experimental flag is removed.

cpsauer commented 3 years ago

Oh! Hmm, I think you might find that globs are an easier path than mangling symbols, but I still have in mental cache how you do this with version scripts, if that's the path you wanted. There's an extern "C++" block that handles it all for you--see this StackOverflow post for a nice example. And in case it's useful, it seems like things might be standardized around the Itanium C++ ABI in the GCC compatible world--if you needed something callable, maybe clang's mangler, also on Apple would do the trick?
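
For reference, a sketch of what a generated version script with an extern "C++" block could look like. Inside that block GNU ld matches patterns against demangled names, so no mangling is needed. The helper name and the example patterns are hypothetical; Python is used here only to show the text being produced.

```python
def cxx_version_script(c_globs, cxx_patterns):
    """Render a version script mixing C symbol globs and C++ name patterns."""
    lines = ["{"]
    if c_globs or cxx_patterns:
        lines.append("  global:")
    for g in c_globs:
        lines.append("    " + g + ";")
    if cxx_patterns:
        # Patterns in this block are matched against demangled C++ names.
        lines.append('    extern "C++" {')
        for p in cxx_patterns:
            lines.append("      " + p + ";")
        lines.append("    };")
    lines.append("  local: *;")
    lines.append("};")
    return "\n".join(lines) + "\n"


script = cxx_version_script(["mylib_*"], ["mylib::*"])
```

This only covers the version-script platforms; Apple's -exported_symbol works on the mangled symbol, so a glob around the readable part of the name would be needed there.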

As per the original post, I think globs are really the way to go, though. They dodge this issue (just wildcard around the human readable name, and you avoid the whole mangling thing for free!). They're also the "native" format accepted by the tools on the platforms I looked at (no accident), so they're easy-peasy to implement. And you can both concisely express a whole collection of symbols, and still fall back as needed to specifying the full symbol. They also don't couple things to C++ specifically. I totally agree with you that the target-based thing isn't likely to express what a user would like; presumably they'd like to export an interface, not a target that likely contains implementation. If we wanted to move up to the source code level--as opposed to the symbol level--maybe the right unit of symbols to export would be an interface, i.e., a header or collection of headers?

[I hadn't actually needed to mangle because I'd specified C-linkage for interoperability with other languages, because I wanted nice symbol/interface names, and because C++ interfaces suffer from the Fragile Binary Interface Problem (explanation I liked). I just remember the above from reading through the docs. Maybe I'm wrong, but I'd guess c-linkage and non-mangled symbol names would be the majority of shared-library symbol-export control. As for my implementation, I'm literally just selecting on platform information as you say, and passing the glob-based flags from my original post. But I'm using cc_library, since cc_shared_library isn't released yet.]

Is there any chance you'd be down to reopen the issue? It seems like there's user demand, unmodeled important behavior across OSs, and ways to do it, even if the original implementation path didn't pan out?

oquenchil commented 3 years ago

So cc_libraries do not produce a proper shared library with their dependencies linked in (only on Windows and I think @meteorcloudy told me once this is not needed). The shared libraries created by cc_libraries are NODEPS dynamic libraries which are an optimization for running tests faster.

There are things I'm still not seeing. You said that the list of symbols is given via an attribute, after that, how are they passed to the linker exactly on each platform?

Can you provide examples for the main platforms of what the build files would look like and what the actual rule implementation would look like?

I'm still skeptical that you can just take in a list of symbols and automatically create the right files to pass to the linker in a cross-platform way. Right now it's not that it's not possible, you can write a cross-platform rule, it's just that you'd have to write the version scripts, def files, etc yourself. This also gives you the finer control I believe most people would want for their version scripts.

Also I think your solution can already be used by those people that totally do not care about this fine control. You can use it with cc_shared_library today without further changes to cc_shared_library, just create a new rule in Starlark that writes these files differently depending on the platform and add those targets to additional_linker_inputs (with their corresponding user_link_flags).

I'm hesitant to sign up cc_shared_library to be the one responsible for automatically writing these files, because it is likely that most of the time it won't behave as the user expects.

cpsauer commented 3 years ago

Sure! Happy to make a concrete example of what the macro would spit out as a proof of concept. Easy peasy for the symbols list case you ask about--since that's directly what the linker platform tooling takes, with just a little wrapping.

Let's say you wanted to export symbols SYMBOL1 and SYMBOL2. The idea is to have a parameter where you'd just specify, e.g., exported_symbols = ["SYMBOL1", "SYMBOL2"] and be done. I rolled this as a macro for ease of a quick workaround. We'll say the rule's name is API.

We'll describe the cases for each OS (just select() on platform as soon as platforms land/work on their desired platforms, and proxies until then): For Android/Linux, you'd generate a version script. I baked a file using write_file from skylib. My macro would run:

load("@bazel_skylib//rules:write_file.bzl", "write_file")

write_file(
    name = "API_ldscript",
    out = "API.ldscript",  # Linker script suffix required to be ld, lds, or ldscript by cc_library
    content = [
        "{",
        "   global: SYMBOL1; SYMBOL2;",  # Just dump em by "; ".join'ing the list
        "   local: *;",  # All other symbols hidden by default.
        "};",
    ],
)

Then you depend on this from the cc_rule. So add "API_ldscript" to deps, and pump "-Wl,--version-script,$(location API.ldscript)" to linkopts.

For Apple platforms, it's even easier: Just plumb the symbols through to the linkopts line: + ["-Wl,-exported_symbol,_SYMBOL1", "-Wl,-exported_symbol,_SYMBOL2"] Note auto-adding the leading underscore that Apple adds to symbol names on their platforms.

^ One sweet thing is that both those tools support globs, so that all just works, in the common case that you're prefixing your symbols with something to get namespacing-like behavior in c--or working around name mangling.

Windows, I'm DEFinitely no expert, but I'll google it live :) Fairly confident it'd work with DEF files, since they're basically just lists of symbols, yeah? So you could take a parallel approach to Android/Linux by writing a DEF file, but probably easier to just specify as linker flags. ["/EXPORT:SYMBOL1", "/EXPORT:SYMBOL2"]
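
A hedged sketch of the Windows side, in Python. As far as I know neither DEF files nor /EXPORT accept globs, so this assumes the globs are first expanded against a list of the library's symbols that has been collected separately (how to collect it is left open here). The helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def expand_globs(globs, known_symbols):
    """Resolve globs to concrete symbol names, since Windows wants exact names."""
    return [s for s in known_symbols if any(fnmatchcase(s, g) for g in globs)]


def def_file(symbols):
    """Render a minimal module-definition file with an EXPORTS section."""
    return "EXPORTS\n" + "".join("    " + s + "\n" for s in symbols)


def export_flags(symbols):
    """Render the equivalent linker flags, one /EXPORT per symbol."""
    return ["/EXPORT:" + s for s in symbols]


concrete = expand_globs(["SYMBOL*"], ["SYMBOL1", "SYMBOL2", "internal_helper"])
def_text = def_file(concrete)
flags = export_flags(concrete)
```

Either output would do; the DEF file keeps long symbol lists off the command line.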


My take is that this wraps enough of a hassle for users to write that Bazel should model and wrap it, rather than leaving everyone to re-roll their own--like other platform differences--but it's definitely not super onerous to implement. This seems like it'd be on the simpler end of OS-flag differences that Bazel wraps! Certainly that's better than supporting only Windows DEF files, like the current cc_library, right? And in the event a user needs something crazy and platform-specific, they can always fall back to patching in platform specific flags, as they'd do right now--this wouldn't block that.

Offhand I still think we'd want this on cc_library, so whatever platform-specific rule bundles up the library for distribution would get the right interface. Maybe that's a cc_shared_library, but it could easily be an android_binary, a macos_bundle, an ios_framework, etc. But if I'm being thick somewhere here (or elsewhere!) feel free to say so :)

oquenchil commented 3 years ago

What I assumed that you were referring to is that we would have a rule that just takes the string list of demangled symbols Foo, Bar, Baz and writes the correct file (be it def file, version script, etc..) depending on the platform. I understood you would do this in such a way with globs so that the file written is correct regardless of how each platform mangles names (taking into account parameter types and everything else). I don't think you can do this correctly in a crossplatform way.

I actually had a go at this a long time ago in early versions of cc_shared_library using c++filt. The conclusion was that I would wait for proper tooling to do this if it ever comes. Again, this is coupled with the fact that most of the time people want much more complex version scripts. A solution that only works half the time is not something we desire to support since the maintenance cost of answering why the other half of the time it doesn't work would be prohibitive. Of course you are welcome to have this in your own repository, if I'm wrong and we can see long term that the community really wants this, then we'd consider merging it.

However, that example target you wrote just seems to have the whole contents of the script for Unix? At that point I'm not sure what useful work the rule is doing compared to just providing the raw file like in the example tests of cc_shared_library.

Transitive linking will never be properly supported in cc_library (the support on Windows is just a hack that should have gone away already). That's what cc_shared_library will be for. The reason for that is that a lot of heavy work is required in analysis and we don't want cc_libraries in the middle of the graph doing this.

cpsauer commented 3 years ago

Hey @oquenchil! Getting back to this now that the regular workweek is done.

You'd pointed out early on that we might be talking about different things. I think you were right; I wish I'd listened and figured that piece out sooner. It seems like we've diverged a bit here, both from each other and the original setup. I'm gonna try to pull us back together and to make sure we aren't talking past each other :) I really appreciate your engagement on this.

I think the key thing is merging our definitions of "symbol." I meant symbol not in the sense of a C++ function name, but in the sense of the assembly-level, language-independent, ABI of a shared-library.

cpsauer commented 3 years ago

It seems like we both care about controlling the interface of a shared library, you having poured what sounds like a fair amount of blood and tears into trying to control it via C++ names. (As do others, it seems: @aiuto, and the ~7 other people around the web that I've helped on this since filing.) Checks out; it's a core feature across OS's for a reason, and there's an opportunity for something cleaner here for sure.

I totally get why you were previously keen on filtering unmangled C++ names, and I totally get why that was painful. And while I was happy trying to brainstorm a way to make that work if you wanted--perhaps we should just sidestep the pain there, in favor of the original goal of direct symbol control?

Some benefits of filtering symbols instead of C++ names are the following: symbols are what're actually being controlled at the level of the linker and shared object binary; it's easy to do, obviously correct, and sidesteps the pain you had; a mangled C++ interface is less likely to be a good idea anyway because of instability and language interoperability; and globbing would tend to give you more bang for less buck than mangling anyway, in my experience.

[I still haven't totally given up on finding non-horrible ways to do the C++ names approach if that's what you wanted, but I think that's besides the point. Sounds like you're ready to move on from the C++ names approach, and that's not what I was asking about originally anyway.]

oquenchil commented 3 years ago

I think we were talking about the same thing but maybe I was using the wrong terms. I'm actually not sure what the difference is between C++ names inside an already linked shared library and symbol names. I did want to refer to the symbols that are the "assembly-level, language-independent, ABI of a shared-library".

My understanding is the following: different platforms will produce different symbols when compiling source code (this is the non-standard part I'm referring to); if you want to write a rule in a BUILD file which takes in a list of symbols that should be exported you would have to do it using the original source code that created that symbol as representation, you cannot just list the already mangled symbols in the attribute because it won't be compatible across platforms. Converting the representation listed in the rule attribute to the actual symbol for that platform is the hard part.

I'd be happy if you told me that there is indeed a standard and if you could point me to it. Then I'd agree that we should do this. If you told me that you need a separate list of symbols for all platforms, then I'd tell you that you can already do that with a select() and cc_shared_library.

cpsauer commented 3 years ago

Ah! Yeah, so to make sure we're on the same page: By the time we get to the ABI of a library, we're dealing with symbols. Names are the source-code level construct for the implementation language.

[Some languages' compilers, like C++, mangle names by default as a way of generating the symbols that correspond to those names. But C (also of cc_library), for example, doesn't mangle names...it just uses the clean name directly as the symbol with no mangling. We can't assume that cc_libraries necessarily produce mangled, platform-specific symbols!]

cpsauer commented 3 years ago

Indeed, luckily, the exported symbols in a (cross-platform) shared library usually aren't mangled. Instead, they're the same across platforms and the same as the function name--just as they are when implemented in C or assembly [1].

There are a few reasons it's more common to have the assembly-level, language-independent, ABI of a shared-library use unmangled names as symbols. To name a couple: The name mangling we're talking about is a C++-specific thing, so being able to call the library from other languages requires having clean, unmangled symbol names, as would be created from C or assembly. A second reason is that mangled C++ interfaces tend to suffer from the Fragile Binary Interface Problem (as above).

So usually, if you're shipping a cross-platform shared library, and you're implementing it in C++, you wrap the interface in an extern "C" {} block to turn off the mangling. Or if you're writing the interface in C, then your names are already not mangled when being converted into symbols. Hence this request being about controlling symbols, rather than C++ names! And the good news is that this is easy! And sidesteps the platform-specific pain you're describing.


Notes: [1] At least up to a leading underscore, omitted on Linux, but that's easy enough. [Agree C++ mangling management is gnarly; not asking for it. If you did want it--and there are sometimes good reasons to have mangled, C++ interfaces!--there is indeed a mangling standard (Windows separate from non-Windows) and also open source code from clang to back the standards up, as linked earlier in the discussion. I still think controlling symbols, not C++ names, is the way to go, though. And if adding hard functionality, rather than supporting mangling, I'd still probably recommend globbing or listing interface headers.]
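
The leading-underscore caveat in [1] can be handled with a tiny mapping like the sketch below (Python, hypothetical function name and OS strings). It covers only the Apple and Linux/Android cases discussed in this thread and deliberately refuses anything else rather than guessing.

```python
def linker_symbol(c_name, os):
    """Map a C-linkage name to the symbol name the platform's linker sees."""
    if os in ("macos", "ios"):
        return "_" + c_name  # Apple toolchains prepend an underscore
    if os in ("linux", "android"):
        return c_name  # ELF platforms use the name unchanged
    raise ValueError("unhandled OS: " + os)


apple = linker_symbol("mylib_init", "macos")
elf = linker_symbol("mylib_init", "linux")
```

So a single user-facing list of names can stay platform-neutral, with the decoration applied when the flags or scripts are generated.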

oquenchil commented 3 years ago

Alright, I'm convinced now; this sounds more generally useful than I thought at first. This will still need to be incorporated into cc_shared_library though, not cc_library.

cpsauer commented 3 years ago

Yay! Thanks for your patience. Happy also to help if you'd like.

[As for which rules: whichever works. My take is just that it's important the shared library be useable across platforms, including being able to be depended on by whichever bundling rules the platform requires. So like android_binary, ios_framework, etc.. And from what you said, it sounds like that's in the cards for cc_shared_library. Do we have a timeline for cc_shared_library? I assume you want this as an incremental change after it lands.]

oquenchil commented 3 years ago

Thank you for your input too.

So the story with cc_shared_library being experimental is that I wanted a major project to validate it before removing the experimental flag. The person who was migrating Tensorflow to using the rule left and no one took over that. Currently, Roboleaf (moving AOSP to Bazel) is using it so that should be enough to validate it. At the same time any other projects using it and reporting problems would be helpful.

To give you a timeline I'd say I will have enough of a signal in Q3 to feel confident about removing the flag. But I'd very much appreciate it if you tried it yourself and gave me feedback too.

github-actions[bot] commented 1 year ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

cpsauer commented 1 year ago

@bazelbuild/triage, this issue is still relevant and I think it would be very valuable, making it much easier to build distributable native libraries with Bazel!

(@bazelbuild/triage doesn't seem to create a tag (no link) so I'm going to also tag @sgowroji and @Pavank1992 manually. Please coach me if you'd have preferred otherwise--and maybe update the bot's instructions)

oquenchil commented 1 year ago

Hey Chris, in any case for the time being we don't want this functionality as part of the set of C++ rules (we don't have the bandwidth). I started a personal external repository with some tools related to cc_shared_library that could be useful for the community but that won't be part of Bazel and which the Bazel team won't be maintaining (see here). The hope would be that they are interesting enough for someone to offer being the maintainer and then they could be added to https://github.com/bazel-contrib.

I myself won't be maintaining those tools for long apart from initial bug fixes. The functionality described here in this issue could be added as a separate tool/rule that is run separate to the cc_shared_library rule logic.

https://github.com/oquenchil/bazel-contrib/blob/main/cc/tools/cc_shared_library/exports_finder.bzl could perhaps serve as a starting point for you to implement what you described here if it's useful to you.

cpsauer commented 1 year ago

Ah, bummer. OK, well, thanks for keeping that code out in the open. Sounds like AOSP didn't end up using cc_shared_library?

For us, as above, we wanted to slip this functionality underneath the per-platform stuff, so it can work with all the different packagings (like ios_framework, android, etc.). And I didn't see how to do that with cc_shared_library, though maybe I missed it? Our hacky workaround above has held up pretty well so far! Just, ya know, figured I'd report the use case and try to see if we could get something for everyone, since I kept hearing and seeing the need.

Cheers, Chris

oquenchil commented 1 year ago

> since I kept hearing and seeing the need.

As far as I can see you can already write the functionality for this but you want it bundled with Bazel. IIUC the problem is not with the functionality being decoupled but with distribution of such functionality. In other words, you need a custom rule and a macro and currently there is no good place to put those to reach more users except in Bazel itself. I actually think that this being decoupled (for now) is a good idea and the fact that what you wrote can't be easily distributed is a different problem.

I haven't seen the need elsewhere. What I have seen so far is people already have their carefully curated linker scripts file where they control symbol visibility. I'm not denying that there is a use case for this but before adding additional complexity to an already complex rule, I'd prefer if we worked in the other direction instead. First whoever is interested can use the decoupled functionality and if it looks like a substantial number of users are using it then we can always rectify and add it to cc_shared_library. Adding the functionality is cheap but coupling to cc_shared_library has a maintenance and support cost that mostly 1-2 people (official maintainers) will have to pay (plus contributions from the community).

So the problem that we have here is the distribution channel to avoid everyone having to roll their own. For that I'd push in the direction of bazel-contrib where the community can add functionality such as this (or functionality like what I added in https://github.com/oquenchil/bazel-contrib/tree/main/cc/tools/cc_shared_library). Functionality that is not strictly necessary (unlike cc_binary, cc_library and cc_shared_library) but convenient.