bazelbuild / rules_rust

Rust rules for Bazel
https://bazelbuild.github.io/rules_rust/
Apache License 2.0

Cargo project with multiple artifacts; How can I match output filenames? #110

Closed — photex closed this issue 2 years ago

photex commented 6 years ago

Howdy!

I have a cargo manifest like so:

[package]
name = "my-app"
version = "0.1.0"
authors = [<me>]

[dependencies]
libc = "0.2"
log = "0.4"
env_logger = "0.5"
bytes = "0.4"
failure = "0.1"
futures = "0.1"
tokio = "0.1"
tokio-signal = "0.2"
qp-trie = "0.7"
rmp = "0.8"
rmpv = "0.4"

[lib]
crate-type = ["cdylib", "rlib"]

In src I have lib.rs and main.rs. Using cargo I get 3 artifacts that are used by other parts of the existing build: libmy_app.so, libmy_app.rlib, and the program my-app.

I can get a working bazel build with something like this:

package(default_visibility = ["//visibility:public"])

load("@io_bazel_rules_rust//rust:rust.bzl", "rust_binary","rust_library")

COMMON_DEPS = [
    "//vendor/crates:libc",
    "//vendor/crates:log",
    "//vendor/crates:env_logger",
    "//vendor/crates:bytes",
    "//vendor/crates:failure",
    "//vendor/crates:futures",
    "//vendor/crates:tokio",
    "//vendor/crates:tokio_signal",
    "//vendor/crates:qp_trie",
    "//vendor/crates:rmp",
    "//vendor/crates:rmpv",
]

rust_library(
    name = "my_app_client",
    srcs = ["src/lib.rs"],
    deps = COMMON_DEPS,
    crate_type = "cdylib",
)

rust_binary(
    name = "my_app",
    srcs = ["src/main.rs"],
    deps = COMMON_DEPS,
)

But this gives me a binary named my_app instead of my-app, which is probably something I can live with; my shared library, though, now has a totally wrong name: libmy_app_client--<HASH>.so.

I'm generating the header for the library using cbindgen and this shared library is used in other parts of our build so the name isn't entirely a problem for that purpose, but it is also part of the package given to clients so this is ultimately not going to work.

I've tried to use -o libmy_app.so in the rustc flags but the build fails because it can't find libmy_app_client-<HASH>.so. :D Looks like bazel and I are at another impasse.

Am I missing an obvious solution or configuration here?

photex commented 6 years ago

I created a repo that illustrates the problem: https://github.com/photex/bazel_rust_tests

acmcarther commented 6 years ago

On intention

But this gives me a binary my_app instead of my-app which is probably something I can live with; but my shared library is just totally wrong now libmy_app_client--<HASH>.so.

Do you expect a library built with Bazel to produce the same artifact names as one built using Cargo alone? I don't think this is impossible, but it wouldn't be trivial, and it couldn't be guaranteed to stay consistent with Cargo. Your follow-up questions suggest, though, that you don't actually care about the output name -- you just want to be able to link against it. Bazel does this using target names and deps, not by directly specifying filenames (apologies if this is already clear to you).

I've tried to use -o libmy_app.so in the rustc flags but the build fails because it can't find libmy_app_client-<HASH>.so. :D Looks like bazel and I are at another impasse.

Are you trying to export a shared library for other consumers outside of Bazel? In this case, you should use a packaging rule like pkg_tar (with a dependency on the rust_library) to package the file up. Are you trying to link the shared library to the binary? If so, you should add a dependency on the shared library to your deps.
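For the export-outside-Bazel case, a hedged sketch of what such a packaging target could look like (the target and file names here are assumptions based on the BUILD file above, and the pkg_tar load path varies by Bazel version -- older Bazel shipped it under @bazel_tools, newer setups use rules_pkg):

```python
load("@rules_pkg//pkg:tar.bzl", "pkg_tar")

# Bundle the cdylib (and a cbindgen-generated header, if any) into a tarball
# for consumers outside of Bazel. Names are illustrative, not from the thread.
pkg_tar(
    name = "my_app_client_dev_pkg",
    srcs = [
        ":my_app_client",   # the rust_library with crate_type = "cdylib"
        "my_app_client.h",  # hypothetical cbindgen output
    ],
)
```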

It would be possible in a genrule to rename the output dylib if that was something you were interested in for the purpose of exporting the lib out of Bazel.

Conversely, if you're just trying to call Rust from C, check out the ffi example: https://github.com/bazelbuild/rules_rust/blob/master/examples/ffi/c_calling_rust/BUILD . It doesn't use a dylib, but I don't think there is a reason why it couldn't (@mfarrugi might know something I don't).


On filenames

rules_rust uses a different mechanism than Cargo to generate the output file name, for practical reasons:

bazel https://github.com/bazelbuild/rules_rust/blob/1944c8a9c104fb73e3058a86204373df63b79eca/rust/rust.bzl#L197-L198

cargo https://github.com/rust-lang/cargo/blob/ef719bc5e5966d99cce14a092a0525eba399ea1d/src/cargo/core/compiler/context/compilation_files.rs#L97-L100

Side note: The extra "-" in your new output file is due to bazel's hash generating both negative and positive numbers. I didn't find this to be an issue in reality, so I left it that way.


On expectations

I don't have a complete picture of what you're trying to achieve, but I want to caution you that Cargo and Bazel really aren't compatible at all. If you're trying to export a compiled library for other Rust users who use Cargo, you're going to be breaking some very new ground. I've built hacks for Bazel to get at Cargo crates (because, let's face it, the language has a minuscule stdlib and you need access to crates to be able to write anything in it), but we're at a very early stage in the cross-compatibility story.

I'm excited to see this discussion regardless, and happy to help sort through it.

photex commented 6 years ago

Sorry if I'm dropping a lot of comments and issues today, I decided to really sit down and dedicate the day to working through this stuff. Thanks for your patience and thoughtful comments!


In terms of the names of targets my only concern is with the shared object. That is both used internally (and it could be called anything for all it matters) and externally (whereby it's important that it is named libmy_app.so). The externally provided "dev package" with the client library and associated header would potentially be used by any number of build systems.

The difference in filenames for the executable is purely one of convention. Most of the executables are meant to be installed via a product deb package and managed by systemd. As long as clients can link with the client library internal tools and software built by clients should all be on the same page.

Let's ignore the rlib to simplify matters for the time being. I included that to provide a rust library for potentially external rust client software even though I honestly have no idea how those dots connect yet :grin:.

The majority of our product is C++, but this facet of Rust is a keystone so that's why I'm starting with it. We currently build for aarch64 and amd64 and our existing build drops things into a particularly arranged output directory (this allows for easy on-device testing over nfs for example). I figure a genrule or some package rules are totally in-order and acceptable but since I'm just getting my feet wet here it's hard to know when that's the correct solution versus having the primary rule just create the correctly named file. For packaging up the client library and services to clients I anticipated using the pkg_deb rule.

I admit I didn't think of using a genrule to rename the shared object... I just assumed it wouldn't be linkable with the name changed.

Part of the reason I'm stepping into these issues is because I'm trying to bootstrap a bazel build that lives beside our current cmake/cargo/bash extravaganza (soon to include docker). I'm open to any organization changes as long as the end results are largely compatible; and since it's clear that the support for a completely multi-arch build isn't quite ready yet I can't just convert everything all at once.

acmcarther commented 6 years ago

Ok, I think I have a better picture of what you've got in mind. Let's see:

In terms of the names of targets my only concern is with the shared object. That is both used internally (and it could be called anything for all it matters) and externally (whereby it's important that it is named libmy_app.so). The externally provided "dev package" with the client library and associated header would potentially be used by any number of build systems.

For usage internally (within Bazel as well), you can just add a dep to whatever wants the crate. If we're talking rust-depending-on-rust, that means your dependent should look like

rust_library(
  name = "some_crate",
  srcs = WHATEVER,
  deps = [
     "//path/to:my_app_client"
  ],
)

For an internal C(++) user in bazel, you'd go with

cc_library(
  name = "the_same_kind_of_thing",
  srcs = WHATEVER_ELSE,
  deps = [
    # The path to the *cdylib*. This isn't a magical name, just the name of the
    # rust_library target with cdylib output that you chose.
    "//path/to:my_app_client_cdylib"
  ]
)

For an ~internal~ external user of either variety, you'll probably want to export the .so with a name that isn't whack. You could do this with a genrule. Here's a sketch: it declares a macro wrapping a genrule that copies (and implicitly renames) the file.

def rename_so(name, src, out_name = None):
  actual_out_name = out_name
  if not actual_out_name:
    actual_out_name = name

  # Defined in a .bzl file, so genrule must be called via native.genrule.
  native.genrule(
    name = "rename_so_" + name,
    srcs = [src],
    outs = [actual_out_name + ".so"],
    cmd = "cp $< $@",
  )

# Yields "my_app_client.so"
rename_so(
  name = "my_app_client",
  src = "//path/to:my_app_client_cdylib",
)

I strongly believe that this renamed .so should be directly includable as a dep in either a rust_library or a cc_library, but I'm not 100% certain. It would certainly be usable in any pkg_* rule.
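A hedged sketch of consuming the renamed .so from C(++) inside Bazel (the genrule target name follows from the macro above; the header name is a hypothetical cbindgen output, not from the thread):

```python
cc_library(
    name = "my_app_client_shared",
    srcs = [":rename_so_my_app_client"],  # genrule output: my_app_client.so
    hdrs = ["my_app_client.h"],           # hypothetical cbindgen-generated header
)
```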

Part of the reason I'm stepping into these issues is because I'm trying to bootstrap a bazel build that lives beside our current cmake/cargo/bash extravaganza (soon to include docker). I'm open to any organization changes as long as the end results are largely compatible; and since it's clear that the support for a completely multi-arch build isn't quite ready yet I can't just convert everything all at once.

I don't have much experience bringing up Bazel alongside an existing build system, though I know it has been done. I suspect that a hybrid build system where cmake depends on Bazel output or vice versa might be perilous though compared to a "big bang" where you make Bazel work alongside cmake and eventually throw a lever and switch everything over to Bazel.

photex commented 6 years ago

These examples are great! Putting them to use right now as I march along to the next target. :D

I'll quickly verify that renamed libs work as expected and report back. I have very likely just internalized some relic of autotools behavior. Although on OSX I believe I'd also have to use otool to fix up the name for dyld... but it's been a while, so maybe even that knowledge is just cobwebs.

I don't have much experience bringing up Bazel alongside an existing build system, though I know it has been done. I suspect that a hybrid build system where cmake depends on Bazel output or vice versa might be perilous though compared to a "big bang" where you make Bazel work alongside cmake and eventually throw a lever and switch everything over to Bazel.

I'm actually aiming for the "big bang" with the caveat that I've chosen the unenviable task of working and supporting two builds for the time being. For Bazel, the idea is that I'm putting together all the targets for amd64 first and will work out the details of cross-compiling for aarch64 after that. Until we can cross-compile for aarch64 it wouldn't be possible to switch. I very much hope I can be of service in this effort.

acmcarther commented 6 years ago

I have a pretty good idea of what is left as far as finishing implementing cross compilation support, but I just need to find some time to actually do it. I suspect it might take me longer to mentor you to finish up the implementation than it would take for me to just sit down and do it. Having someone who actually has a need for the feature is already pretty helpful though as it gives me someone to bounce the API+Implementation off of that might actually use it.

I can commit to having a PR up for you to examine / try by the end of Friday (UTC-8).

photex commented 6 years ago

Exciting stuff! I verified that the renamed library works as intended. I now have the core application, client library, and two tools building and running.

I wasn't able to use the macro you specified. I just used the genrule portion of it.

If you manage to get something together by Friday then I'm totally happy to kick the tires but please don't go too far out of your way. Tomorrow I'll be attending to a family health issue, but after that I'll back and digging around.

Cheers!

acmcarther commented 6 years ago

TL;DR: Not done yet, unfortunately

I worked on this somewhat over the last couple of days, but hit a pretty annoying roadblock that I'd like to put down in text somewhere so that I have something to come back to after I sleep on it. Here seems good.

Currently rules_rust piggybacks off of the C++ toolchain for its linker (and ar, but this doesn't matter that much): https://github.com/bazelbuild/rules_rust/blob/1944c8a9c104fb73e3058a86204373df63b79eca/rust/toolchain.bzl#L50-L53

This means that we sort-of implicitly start to get mired in CROSSTOOL chaos. I spent some time trying to cut that corner by adding an optional linker_bin param to the rust_toolchain decl, but this caused so many unrelated headaches around paths in the Bazel sandbox that I think just using CROSSTOOL properly is advisable.

The consequence of this is that I need to learn how to write a proper CROSSTOOL so that I can put together a usage example. All told this will require only minor rules_rust changes to the tune of https://github.com/bazelbuild/rules_rust/compare/acm-cross-compilation plus making rust_toolchain.attr._crosstool public via the rust_repository_set attrs, but the actual usage and CROSSTOOL shenanigans will be non-trivial for users.


For a sense of what happens without the above fixes:

bazel run @examples//hello_world:hello_world --platforms="@examples//:arm-unknown_linux-gnueabi" --verbose_failures
INFO: Build options have changed, discarding analysis cache.
INFO: SHA256 (https://static.rust-lang.org/dist/rust-std-1.26.1-arm-unknown-linux-gnueabi.tar.gz) = ef56f63f35dd90a842388e4b44a0bc37a17ef422bcbc9827818cae561feda424
INFO: Analysed target @examples//hello_world:hello_world (9 packages loaded).
INFO: Found 1 target...
ERROR: <snip>
Use --sandbox_debug to see verbose messages from the sandbox
error: linking with `/usr/bin/gcc` failed: exit code: 1
  |
  = note: <snip>
  = note: /usr/bin/ld: bazel-out/k8-fastbuild/bin/external/examples/hello_world/hello_world.hello_world0.rcgu.o: Relocations in generic ELF (EM: 40)
          /usr/bin/ld: bazel-out/k8-fastbuild/bin/external/examples/hello_world/hello_world.hello_world0.rcgu.o: Relocations in generic ELF (EM: 40)
          /usr/bin/ld: bazel-out/k8-fastbuild/bin/external/examples/hello_world/hello_world.hello_world0.rcgu.o: Relocations in generic ELF (EM: 40)
          bazel-out/k8-fastbuild/bin/external/examples/hello_world/hello_world.hello_world0.rcgu.o: error adding symbols: File in wrong format
          collect2: error: ld returned 1 exit status

As a side note, we wouldn't have to deal with this at all and could drop our CROSSTOOL dependency if LLD were production-ready. Unfortunately the Rust integration is still nightly-only (it's behind a -Z flag to rustc), and it has enough issues that trying to use it anyway is inadvisable.

Long term LLD tracking issue: https://github.com/rust-lang/rust/issues/39915

Tertiary LLD PR: https://github.com/rust-lang/rust/pull/51936

photex commented 6 years ago

Howdy :D

I admit I do not yet know enough of the spirit of Bazel to immediately make suggestions here, but I really appreciate you writing it all out; it helps me understand what I'm seeing when I peek under the hood.

In the meantime I attempted to write a non-sandboxed genrule that just runs cargo... which didn't work. It feels like an acceptable short-term trade-off that would work the same way as the cmake integration:

genrule(
    name = "cargo_aarch64",
    tags = [
        "no-sandbox",
    ],
    cmd = "cargo build --release --target=aarch64-unknown-linux-gnu -p my-app",
    outs = [
        "target/aarch64-unknown-linux-gnu/release/libmy_app.so",
        "target/aarch64-unknown-linux-gnu/release/my-app",
    ],
)

This does in fact build everything correctly; but the genrule fails stating that it was not responsible for those files being created. :face_with_thermometer:
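One likely cause (a hedged guess, not confirmed in the thread): a genrule must create its declared outs under its own output directory, but cargo writes into the workspace's target/ tree, so Bazel never sees the declared files. Copying the artifacts into $(RULEDIR) after the cargo build would satisfy that check; the paths and names below mirror the genrule above and remain just as brittle:

```python
genrule(
    name = "cargo_aarch64",
    tags = ["no-sandbox"],
    outs = [
        "libmy_app.so",
        "my-app",
    ],
    cmd = " && ".join([
        # Build with cargo, then copy the artifacts into the genrule's
        # declared output directory so Bazel can find them.
        "cargo build --release --target=aarch64-unknown-linux-gnu -p my-app",
        "cp target/aarch64-unknown-linux-gnu/release/libmy_app.so $(RULEDIR)/",
        "cp target/aarch64-unknown-linux-gnu/release/my-app $(RULEDIR)/",
    ]),
)
```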

acmcarther commented 6 years ago

Running cargo as a genrule is seriously brittle, in that Bazel has little means of:

Furthermore, Cargo itself isn't really amenable to fixing any of those problems because it plays so many roles in the build process (dep resolution, dep fetching, build planning, compiler execution) and isn't modular at all.

I was very serious in https://github.com/bazelbuild/rules_rust/issues/110#issuecomment-409717719 when I said

... I want to caution you that Cargo and Bazel really aren't compatible at all...

mfarrugi commented 6 years ago

... I want to caution you that Cargo and Bazel really aren't compatible at all...

I feel like this is a bit overstated. At the edges (i.e. relatively low-level libraries) this is true, but many libraries (all of those that I've touched), and so most users, only need to enumerate the crates they depend on.

@photex you would need to declare 100% of the input files, including what 'cargo' is, the Cargo.toml, and the source files it's allowed to use; though I would also recommend against this in general.

photex commented 6 years ago

I'm not suggesting a permanent approach, just something to kick the tires that works at all for aarch64. I was trying to do this without the sandbox. Things build perfectly fine, but genrule doesn't seem to be able to see or find the resulting build.

This was only attempting to have bazel run a command that would be otherwise just required for someone to run themselves.

In either case I'm not the biggest fan of the Cargo monoculture in rust-land so I'm very much in favor of finding the correct bazel methodology.

photex commented 6 years ago

Some thoughts on specifying the linker for cross-compilation:

Currently we use .cargo/config to specify the linker we require. For our purposes this is practical (if a little bit of a pain), and we definitely need an easy way to control it. I'm now following the Bazel examples and setting up an aarch64 toolchain by downloading a prebuilt Linaro toolchain and sysroot, but we'll also need to be able to use a toolchain and sysroot provided with the development hardware. Ideally the Rust rules would just use whatever linker is currently selected via the active --crosstool_top option, but being able to specify a linker on the command line and bundle that behind a --config=foo option would also be totally fine.
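For reference, the .cargo/config mechanism mentioned above looks roughly like this (the linker binary name is a generic example for an aarch64 GNU toolchain, not taken from the thread):

```toml
# .cargo/config
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
```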

photex commented 6 years ago

Howdy @acmcarther

I'm going through the repo and trying to understand everything.

Currently I'm trying to understand why https://github.com/bazelbuild/rules_rust/blob/1944c8a9c104fb73e3058a86204373df63b79eca/rust/toolchain.bzl#L50-L53 doesn't give the linker for the active toolchain?

With my current CROSSTOOL and cc_toolchain:

package(default_visibility = ["//visibility:public"])

filegroup(
    name="empty",
    srcs=[],
)

cc_toolchain_suite(
    name = "toolchain",
    toolchains = {
        "k8|clang6": ":cc_clang6_amd64",
        "aarch64|gcc7": ":cc_gcc7_aarch64",
    },
)

cc_toolchain(
    name = "cc_clang6_amd64",
    all_files = "//tools/compilers/clang6_amd64:all_components",
    compiler_files = "//tools/compilers/clang6_amd64:compiler_components",
    cpu = "k8",
    dwp_files = ":empty",
    dynamic_runtime_libs = [":empty"],
    linker_files = "//tools/compilers/clang6_amd64:linker_components",
    objcopy_files = "//tools/compilers/clang6_amd64:objcopy",
    static_runtime_libs = [":empty"],
    strip_files = ":empty",
    supports_param_files = True
)

cc_toolchain(
    name = "cc_gcc7_aarch64",
    all_files = "//tools/compilers/gcc7_aarch64:all_components",
    compiler_files = "//tools/compilers/gcc7_aarch64:all_components",
    cpu = "aarch64",
    dwp_files = ":empty",
    dynamic_runtime_libs = [":empty"],
    linker_files = "//tools/compilers/gcc7_aarch64:all_components",
    objcopy_files = "//tools/compilers/gcc7_aarch64:objcopy",
    static_runtime_libs = [":empty"],
    strip_files = ":empty",
    supports_param_files = True
)

I only need bazel build --cpu=aarch64 to have it select my aarch64 toolchain, and I'd expect the cpp fragment to then use information from there.

Obviously it's not: it always gets the linker/compiler from my clang6 toolchain instead. I'm trying to connect the dots and understand why.

photex commented 6 years ago

Ok, I fiddled around until I could cross-compile.

I don't understand how fragments and host_fragments are selected, so I just added linker_bin and ar_bin attrs to the rust_toolchain rule. Since I have relative paths built in to the repo this worked, but the next snag is with a proc-macro crate, failure_derive: it results in an *.so built for aarch64 instead of for the host platform.

Also, toolchain selection by cpu wasn't working at all; I had to use a platform definition instead.

photex commented 6 years ago

Welp, procedural macros are a can of worms beyond the bounds of my Bazel knowledge at the moment. I can always check that crate's type and build it for the host, but every one of its dependencies needs that treatment as well, and that's where I run into trouble.

I've run out of time on this so I think I'm going to have to either hack up a genrule, abandon bazel for now, or perhaps wrap a cargo/bazel combo build with a python script. :shrug: