facebook / buck2

Build system, successor to Buck
https://buck2.build/
Apache License 2.0
3.37k stars 199 forks source link

Experience trying to configure a wasm32 target platform #391

Closed cormacrelf closed 9 months ago

cormacrelf commented 10 months ago

Let's say I want to make a hello world Rust/WASM binary. As a really basic start, you can at least compile a library for wasm32, by overriding rustc_target_triple in the rust toolchain definition:

# toolchains/BUCK
 system_rust_toolchain(
     name = "rust",
     default_edition = "2021",
+    rustc_target_triple = "wasm32-unknown-unknown",
     visibility = ["PUBLIC"]
 )

But it isn't enough for a binary. Here are the things I had to fix to get a hello world rust wasm binary linked:

  1. Obviously there is no wasm32-unknown-unknown in the big target-triple selector for rust. That's why I added the custom triple in the toolchain to begin with. But you could plausibly add more branches to the select.
  2. To use with --target-platforms I tried adding platform(name="wasm32", constraint_values=["prelude//cpu/constraints:wasm32", "prelude//os/constraints:none"]) to platforms/BUCK. Unfortunately this meant that toolchains//:python_bootstrap could no longer be configured, because it was looking for an OS among linux/macos/windows. Adding an explicit interpreter = "python3" to system_python_bootstrap_toolchain worked for this.
  3. Rust linking a wasm32 binary uses the wrong linker. I tried this on macOS and fedora. The linker script just gives you a long list of unsupported linker options. error: unsupported option '--export', etc. I hacked together a solution, basically you need to:
    1. Use rust-lld as the cxx toolchain's linker. i.e. system_cxx_toolchain(..., linker = "rust-lld").
    2. add rustc_flags = ["-C", "linker-flavor=wasm-ld"] to the rust toolchain; and
    3. prevent the passage of -fuse-ld=lld to the linker (because rust-lld does not support that flag). You can do that by by editing prelude/toolchains/cxx.bzl to add an elif ctx.attrs.linker == "rust-lld": pass before the branch that adds -fuse-ld=lld.

Overall this was not that much work, but prelude needs some patches before this is usable beyond a proof of concept.

Here are some issues with all this:

  1. The big confusion for me was the constraints -- it seems that the prelude's OS and CPU constraints are used for both the platform buck2 is executing actions on, and the target platforms it's building for.
  2. The confusion doesn't seem to be limited to python_bootstrap. I tried adding a couple of reindeer-buckified crates from crates.io, that had build scripts. (in my case, cargo new --lib bob, add a trivial build.rs, add bob={path="./bob"} to the third-party Cargo.toml, reindeer buckify, add in the fixup file [[buildscript]] [buildscript.rustc_flags], add dep on //third-party:bob.) Building this way resulted in exec format error when running the build scripts, because they built wasm32 binaries, when I actually needed some linux aarch64 binaries to run on the host.
  3. I also didn't quite understand from the docs how the OS/CPU constraints would factor into remote execution. That's a side point.
  4. Setting the cxx toolchain's linker to rust-lld globally is an unsustainable hack. Rust only uses it for some targets. You can't expect every cxx build in a project to use rust-lld, just because rustc needs it in for a couple of them. I have no idea what's the best way forward on that.
Jake-Shadle commented 10 months ago

Setting rustc_target_triple gets even worse once you add proc macros into the mix, as it also tries to compile the proc macro crates that should be for the host, for wasm. And if you hack the prelude to not set the target triple so it defaults to the host for proc macro crates, you then get errors because the dependencies of the proc macro crates (eg syn) were not compiled correctly.

cormacrelf commented 9 months ago

I have read the docs on configurations a bit more deeply. I understand a few things better:

cormacrelf commented 9 months ago

I found what looks like an actual problem with the $(location ...) macro.

reindeer buckify with a [buildscript.rustc_flags] fixup generates BUCK files that look like this:

cargo.rust_library(
    name = "bob-0.0.0",
    rustc_flags = ["@$(location :bob-0.0.0-build-script-run[rustc_flags])"],
    ...
)

cargo.rust_binary(
    name = "bob-0.0.0-build-script-build",
    srcs = ["bob/build.rs"],
    ...
)

buildscript_run(
    name = "bob-0.0.0-build-script-run",
    package_name = "bob",
    buildscript_rule = ":bob-0.0.0-build-script-build",
    version = "0.0.0",
)

What you want is for the build script to be an exec dependency. But it's not, because the $(location ...) macro creates target dependencies, much like the $(exe_target //:binary) macro:

https://github.com/facebook/buck2/blob/1818b77fdc87ac8f1163f71801e349649521cfa6/app/buck2_node/src/attrs/attr_type/arg/mod.rs#L199-L209

I don't know what the use case for exe_target is, but I am pretty sure you don't want location to behave like that. If required there can be a corresponding location_target rule.

Upon reflection, you can't just go and change the location macro to always use exec dependencies, because often you will be using it trying to pull out a binary for the target platform. It appears we need:

This is not very nice, but it's necessary in some way or another. I just hacked together "make location behave like location_exec" in a local build of buck2 and now I can build a binary for wasm32 where a buckified crate has a build script.

cormacrelf commented 9 months ago

Oh ... it turns out that while build-script-run is a target dep of the main package, it takes an exec dependency on build-script-build. So the location thing wasn't a problem.

I just didn't have this in my buckconfig, I haven't seen it in any examples outside of the remote execution ones (in which case it points to a remote execution platform) but it seems to be required:

[build]
execution_platforms = prelude//platforms:default

If you don't have this, then you get an exec format error on the build script. What! Ok. I will try to publish a complete example of this working. I got a build script going, but now i'm butting heads with the wasm-bindgen-macros crate complaining about using #[proc_macro] in a crate that isn't a proc macro (it is, surely!!!).

cormacrelf commented 9 months ago

Yeah, beyond that the issue is with the rust_library/rust_binary rules and their inability to differentiate between normal and proc macro dependencies. Proc macro deps are still being built for wasm32-unknown-unknown, even with select()s governing the target triple and linkers etc, because they are not differentiated. I get --crate-type=proc-macro -C linker-flavor=wasm-ld --target wasm32-unknown-unknown in the args passed to rustc, and I think rustc just goes "that ain't my host's target triple, so i'm ignoring the proc-macro crate type".

I don't believe it would be sufficient to simply override the target triple in the rust rules when building a proc macro crate, because you need all the proc macro's own dependencies to be build for the host as well. The proc macros probably need to be actual exec dependencies. I think we may need a separate dependency list for proc-macro crates, such that you can write (and reindeer can also use):

rust_library(
    name = "my-rustlib",
    proc_macro_deps = [":serde_derive-1.0.188"],
    ...
)
rust_library(
    name = "serde_derive-1.0.188",
    proc_macro = True,
    ...
)

And then those deps are resolved with attr.list(attr.exec_dep(), ...).

Unless it is possible to make a configuration transition like "make me an exec dep always", which would be preferable ergonomics-wise.

cormacrelf commented 9 months ago

Alright, it appears there is an RFC describing that idea taken to a more complete conclusion: https://buck2.build/docs/rfcs/drafts/plugin-deps/

It looks like this RFC has been accepted and implementation has started. https://github.com/facebook/buck2/commits/main/app/buck2_core/src/plugins.rs

The remaining work seems to be:

  1. Add uses_plugins to prelude_rule and carry it through
  2. Add RustProcMacro and rust_proc_macro_propagation to prelude, and add the pulls/pushes directives to the other rules.
    • I think this is a little over-complicated from a user perspective, having to define the proc macro as a rust_library and also a rust_proc_macro_propagation. This is fine for reindeer, but not very easy to use directly.
    • This seems to have been devised to get around the fact that rust_library provides RustLinkInfo. It could just be another rule, rust_proc_macro, that doesn't provide RustLinkInfo in the first place, or better yet, use rust_library but provide a different type if it's proc_macro = True.
  3. Modify reindeer to generate that.
JakobDegen commented 9 months ago

Yeah, the implementation of that feature in buck2 core is done. I've also written most of the code to implement it, but haven't gotten around to working out some latent bugs and getting it through code review. I can do that this week and will update the issue when I have

cormacrelf commented 9 months ago

I tried it out and can confirm plugins work great. I implemented the prelude half exactly as described in the RFC and it just worked, so great job on that. I made a more fully fleshed-out example of how to build wasm code including macros from crates.io etc here: https://github.com/cormacrelf/buck2-example. I haven't pushed it yet, but you can also add #[wasm_bindgen] fn main() {}, make a wasm_bindgen rule to run wasm-bindgen on the output, use it to transform a built wasm binary, and then get it running on the web. It's all coming together!

The patches I made to prelude are here, which included some things necessary to get wasm32 going. https://github.com/cormacrelf/buck2-prelude/commits/feature/rust-proc-macro-plugins

I did hit this while debugging, if it helps:

https://github.com/facebook/buck2/blob/4cb68186e921794e7a5b6cc9345183922cabad74/app/buck2_node/src/nodes/configured.rs#L581-L582

I don't know what forwarding nodes are, but it was necessary at one point to change that second line to TargetNodeOrForward::Forward(_, node) => node.uses_plugins(), to get past an error that was something like "no method uses_plugins" or something. At the end of the day I stopped hitting that case, so I don't know if it is really a problem or not.

cjhopman commented 9 months ago

This seems to have been devised to get around the fact that rust_library provides RustLinkInfo. It could just be another rule, rust_proc_macro, that doesn't provide RustLinkInfo in the first place, or better yet, use rust_library but provide a different type if it's proc_macro = True.

The challenge is really about the fact that transitive dependents need to basically have proc macro deps far down in the graph from them appear to them as though they are exec deps of the node itself. They need to affect that node's execution platform resolution and they need to be configured to target that resolved execution platform. If you just tried to propagate up info through providers of dependents of the proc macro target, it would mostly work (especially when you only have the single host execution platform), but it would be fragile and if you encountered a case where it broke there may be no way to resolve it. plugin_deps supports propagating the information up the graph to make the plugins act as exec deps of the node that uses them.

cormacrelf commented 9 months ago

What's there works, so I guess that's borne out. It is very nice that the macros simply end up as exec dependencies like any other. From my point of view, none one of the details are that important as long as the interface is easy to use. This is the shim around rust_library that I used for reindeer; if there is going to be something similar in the prelude, then I'm happy.

The motivating case was lib > proc1 > proc2 where I believe lib will get both proc macros as plugins, even though proc1 could get away with pushing itself only (because proc macro crates can't re-export proc macros). With plugins as they are today, I think you would need a rust_proc_macro rule to solve that, so that the deps attribute could be declared as pulls_plugins only. And if you were doing that, you would kinda want to get rid of the propagation node. So I was just thinking about how to get it done without the propagation node. But the more I think about it, even if e.g. a rust_proc_macro target could push itself as a plugin, removing it from the deps during analysis would be weird and unexpected, so you would probably end up with some form of propagation node anyway. Maybe it would be a builtin of some kind that only appears along dep edges, so that the buck2 build :p1 referring to a propagation node problem would go away.


Edit: hmm, if it suppresses the regular dep and then gets pulled as a plugin, that means it just appears as an exec dep. That wouldn't be too bad!

JakobDegen commented 9 months ago

@cormacrelf the commits implementing all the proc macro stuff in the prelude landed on master about an hour ago. I'm a little bit unclear as to the status of this issue right now - are there any concrete things I can help with?

cormacrelf commented 9 months ago

@JakobDegen Oh, awesome, thanks for pinging me. The remaining thing for wasm32 in particular is something like this change here:

https://github.com/cormacrelf/buck2-prelude/commit/fa56cdfd60e76d2ab4e1ea8b091d02af911f69cb

rust-lld is a binary found in rust sysroots, in essentially a "libexec"-type directory that is in the path somehow during rust builds. It doesn't accept some of the extra flags like below in that if/else chain -- -fuse-ld=lld for example. Special-casing it in the system_cxx_toolchain is sufficient to be able to just select() rust-lld on a platform you define, as I do in that example repository. I'm sure there is a more comprehensive solution but that would be a big help.

JakobDegen commented 9 months ago

With a comment that seems vaguely reasonable, although I'm far from an expert on the linking stuff. If you send a PR I can get the right eyes on it internally though

Jake-Shadle commented 9 months ago

One other problematic part with targeting wasm32 is that the python toolchain is using the config:os to determine the name of the interpreter here, which @cormacrelf gets around by just explicitly setting https://github.com/cormacrelf/buck2-example/blob/8d7336097c16d076c67881b3ac40304231dad970/toolchains/BUCK#L50, but I'm guessing that doesn't work on Windows? I don't know enough about buck2 yet to be super confident, but this seems like a place where the host and target platform are getting confused with each other.

JakobDegen commented 9 months ago

@Jake-Shadle I'm not quite sure that's the case. config//os doesn't necessarily refer to either the host or target platform in the way that eg cargo thinks of those concepts (buck2 has no global concept of a host or target platform). Instead it just refers to the platform with which that rule is currently configured. python toolchains should always appear as an exec dep of some kind, which means they'll always end up being configured for something resembling the host platform, and this should work out.

What goes wrong if you don't set that?

cormacrelf commented 9 months ago

See my PR #407.

Jake-Shadle commented 9 months ago

This is closer, but the latest prelude changes to support wasm32 throw an error if there is a platform attribute in a rust crate, eg.

rust_library(
    name = "ahash-0.8.3",
    srcs = [":ahash-0.8.3.crate"],
    crate = "ahash",
    crate_root = "ahash-0.8.3.crate/src/lib.rs",
    edition = "2018",
    features = [
        "default",
        "getrandom",
        "runtime-rng",
        "std",
    ],
    platform = {
        "linux-arm64": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "linux-x86_64": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "macos-arm64": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "macos-x86_64": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "wasm32": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "windows-gnu": dict(
            deps = [":once_cell-1.18.0"],
        ),
        "windows-msvc": dict(
            deps = [":once_cell-1.18.0"],
        ),
    },
    rustc_flags = ["@$(location :ahash-0.8.3-build-script-run[rustc_flags])"],
    visibility = [],
    deps = [
        ":cfg-if-1.0.0",
        ":getrandom-0.2.10",
    ],
)

Fails with

buck2-example [squashed●] % buck2 build //src/wasmlib:wasmlib --target-platforms platforms//:wasm32
Error running analysis for `root//src/wasmlib:wasmlib (platforms//:wasm32#3295263a5955aa9b)`

Caused by:
    0: Error looking up configured node root//src/wasmlib:wasmlib (platforms//:wasm32#3295263a5955aa9b)
    1: Error looking up configured node root//third-party:serde_derive (platforms//:wasm32#3295263a5955aa9b)
    2: looking up unconfigured target node `root//third-party:serde_derive`
    3: Error loading targets in package `root//third-party` for target `root//third-party:serde_derive`
    4: Error evaluating build file: `root//third-party:BUCK`
    5: Traceback (most recent call last):
         * third-party/BUCK:25, in <module>
             rust_library(
         * prelude/native.bzl:398, in _rust_library_macro_stub
             rust_library(**kwargs)
         * prelude/rust/rust_library.bzl:744, in wrapper
             rust_library(**kwargs)
         * prelude/rust/rust_common.bzl:13, in rust_common_impl
             rust_rule(_workspaces = workspaces, **kwargs)
       error: Found `platform` extra named parameter(s) for call to rust_library
         --> prelude/rust/rust_common.bzl:13:9
          |
       13 |         rust_rule(_workspaces = workspaces, **kwargs)
          |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          |
cormacrelf commented 9 months ago

The buck2-example repo is using a temporary shim.rust_library, which is a wrapper for cargo.rust_library. If you're updating it for mainline buck as of today, you'll want to make sure it uses cargo.rust_library again. Your stack trace there indicates you've used the native/prelude rust_library rule, which does not handle reindeer's output. This can be configured using the reindeer.toml file.

When I get some time I will update buck2-example for mainline buck and give you a ping here. I want to explore configuring a custom cxx toolchain to avoid having to special-case rust-lld in system_cxx_toolchain. Currently finding it's a lot of work to get a cxx toolchain configured. I might just repurpose the zig one (or just use it), which seems a better option anyway -- you should be able to compile a rust wasm binary that depends on C or C++ libraries, not only pure-rust projects. zig cc is a winner at that game.

But for now I think I will close the issue. There's nothing else actionable at this time.