Closed: davidzchen closed this issue 3 years ago.
I would love to help out with this feature.
I am very interested in using Bazel to build projects with Rust code and Cargo support is a must-have. Is the linked document about repository rules the best place to start, or are there other good documents to read?
Awesome! The document linked above is the original design doc for Skylark repository rules. We now have documentation for repository rules as well.
I took a brief look at the `cargo` tool and was not able to find a way to download a single, specific crate; the `cargo` commands are rather high-level and aimed at end-to-end use cases. We would have to put together our own tool for fetching crates, which should be fairly straightforward using the crates.io API (which does not appear to be publicly documented):
❯❯❯ http get https://crates.io/api/v1/crates/acc_reader
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/json; charset=utf-8
Date: Mon, 25 Apr 2016 22:34:51 GMT
Server: nginx
Set-Cookie: cargo_session=--PBVtejYpqgihoNU4gy+Z6jCDXMg=; HttpOnly; Secure; Path=/
Strict-Transport-Security: max-age=31536000
Transfer-Encoding: chunked
Via: 1.1 vegur
{
"crate": {
"created_at": "2016-01-30T22:53:11Z",
"description": "A wrapper for std::io::Read providing std::io::Seek",
"documentation": "http://netvl.github.com/acc_reader/",
"downloads": 122,
"homepage": null,
"id": "acc_reader",
"keywords": [
"seek",
"reader",
"input"
],
"license": "MIT/Apache-2.0",
"links": {
"owners": "/api/v1/crates/acc_reader/owners",
"reverse_dependencies": "/api/v1/crates/acc_reader/reverse_dependencies",
"version_downloads": "/api/v1/crates/acc_reader/downloads",
"versions": null
},
"max_version": "2.0.0",
"name": "acc_reader",
"repository": "https://github.com/netvl/acc_reader",
"updated_at": "2016-02-01T21:10:08Z",
"versions": [
21719,
21622
]
},
"keywords": [
{
"crates_cnt": 1,
"created_at": "2016-01-30T22:53:11Z",
"id": "seek",
"keyword": "seek"
},
{
"crates_cnt": 8,
"created_at": "2015-01-10T22:21:22Z",
"id": "reader",
"keyword": "reader"
},
{
"crates_cnt": 19,
"created_at": "2014-12-12T13:53:10Z",
"id": "input",
"keyword": "input"
}
],
"versions": [
{
"crate": "acc_reader",
"created_at": "2016-02-01T21:10:08Z",
"dl_path": "/api/v1/crates/acc_reader/2.0.0/download",
"downloads": 111,
"features": {},
"id": 21719,
"links": {
"authors": "/api/v1/crates/acc_reader/2.0.0/authors",
"dependencies": "/api/v1/crates/acc_reader/2.0.0/dependencies",
"version_downloads": "/api/v1/crates/acc_reader/2.0.0/downloads"
},
"num": "2.0.0",
"updated_at": "2016-02-01T21:10:08Z",
"yanked": false
},
{
"crate": "acc_reader",
"created_at": "2016-01-30T22:53:11Z",
"dl_path": "/api/v1/crates/acc_reader/1.0.0/download",
"downloads": 11,
"features": {},
"id": 21622,
"links": {
"authors": "/api/v1/crates/acc_reader/1.0.0/authors",
"dependencies": "/api/v1/crates/acc_reader/1.0.0/dependencies",
"version_downloads": "/api/v1/crates/acc_reader/1.0.0/downloads"
},
"num": "1.0.0",
"updated_at": "2016-01-30T22:53:11Z",
"yanked": false
}
]
}
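For illustration, a minimal fetcher built on that (undocumented) API might look like the sketch below. This is hypothetical helper code, not part of any proposed rule; it just picks the `dl_path` for a requested version out of the JSON shown above and downloads the tarball.

```python
import json
import urllib.request

CRATES_API = "https://crates.io/api/v1/crates"

def download_path(crate_metadata: dict, version: str) -> str:
    """Return the dl_path for the requested, non-yanked version,
    e.g. "/api/v1/crates/acc_reader/2.0.0/download"."""
    for v in crate_metadata["versions"]:
        if v["num"] == version and not v["yanked"]:
            return v["dl_path"]
    raise KeyError(f"version {version} not found or yanked")

def fetch_crate(name: str, version: str) -> bytes:
    """Fetch crate metadata from crates.io, then download the .crate tarball."""
    with urllib.request.urlopen(f"{CRATES_API}/{name}") as resp:
        meta = json.load(resp)
    path = download_path(meta, version)
    with urllib.request.urlopen(f"https://crates.io{path}") as resp:
        return resp.read()
```

A repository rule could then unpack the returned tarball and overlay a generated BUILD file on it.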
Ideally, it would be nice to reuse some of the code in Cargo itself, but I haven't looked into the implementation in enough detail to know how feasible it is.
Thanks for the information! I've been looking at Cargo's libraries to see what code we can reuse. It'd be really nice if I could write a cargo command to do what we need.
There is a third-party Cargo subcommand, `cargo clone`, that can fetch a crate's source from crates.io (it also sounds like it may be included in future Cargo releases: link). I figure we can install the subcommand via `cargo install` and use it to fetch a `cargo_crate`'s source code.
Bikeshed: I might prefer this be called `cargo_source` or `cargo_crate_source`, to indicate that these are not valid targets for an arbitrary `rust_library` to depend on.
Good point. The word "crate" in Rust parlance seems to refer more accurately to compiled binaries than to the source code. Let's go with `cargo_source`.
Would you like to take this one since you already have code for fetching crates in your cargo2bazel tool? Having rules call a Python command line tool is completely fine and is the approach taken by the `pkg_*` and Skydoc build rules.
Yeah, I can take this issue.
Thanks!
Update:
TL;DR: [cargo_source + WORKSPACE codegen + BUILD codegen] isn't likely to pan out as well as a hack around cargo, at least in the short term
I don't have a PR up yet, but I'm currently spiking something more like a `cargo_toml` repository rule. Currently, I'm not confident that a `cargo_source` is a good seam between cargo and bazel, because it forces us to do a bunch of the work that cargo already does (some of which I did in cargo2bazel). I think we'll always be playing catch-up unless we let cargo handle dependency resolution end to end, which is very hard to do at the source level of granularity.
This line of reasoning was initially spurred when I was exploring repository rules in the abstract, and I realized that these `cargo_source`s would have to be namespaced independently. For example, a "libc" import would have to live somewhere in "@libc//...:libc", while a "clap" dependency would have to live in "@clap//...:clap". This could quickly get confusing if you're automatically generating these and end up with more than one version of a dependency (very common!). And, at the end of the day, the workspace probably doesn't care about most of these! We don't want visibility into transitive dependencies, or to handle the complexity of feature flags for transitive dependencies.
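The multiple-versions ambiguity can be sidestepped by encoding the version into the external repository name. The helper below is purely illustrative (a hypothetical `repo_name`, not any rule's actual scheme), but it shows how "@libc" could become an unambiguous "@libc__0_2_76":

```python
def repo_name(crate: str, version: str) -> str:
    """Mangle a crate name + version into a valid Bazel repository name.
    Bazel repo names allow letters, digits, and underscores, so dots,
    dashes, and plus signs in the version are replaced."""
    safe_version = (
        version.replace(".", "_").replace("-", "_").replace("+", "_")
    )
    return f"{crate.replace('-', '_')}__{safe_version}"
```

With a scheme like this, two versions of the same crate can coexist in the WORKSPACE without colliding, at the cost of uglier labels.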
I recall that kythe is using something quick and dirty that calls cargo directly. I'd bet we could support a 100% integration with cargo by wrapping and cleaning that up, since cargo already builds targets that our own `rust_library` rules can consume.
All of that said, we could always go the whole way, and replicate as much of cargo's functionality as possible, generate cargo sources, multiple build rules for each feature flag subset, and handle target platforms, but I'm not confident that's the fastest (or most reasonable) approach.
`cargo_source` is just one piece of the solution. It gives us a workspace rule that can be used to download the source for a single crate for building. The larger piece would be the proposed `cargo raze` tool, which will use Cargo's code to generate BUILD and WORKSPACE files based on the dependency graph constructed by Cargo.
For calling cargo directly, do you mean doing so for the most troublesome crates, such as `-sys` crates? For that case, it may be a good stopgap measure. However, as discussed in rust-lang/rust-roadmap#12, the problem with simply having the rules call `cargo` in general is that builds would not be hermetic and we would not be able to take advantage of the features offered by Bazel.
I think the `cargo_source` + `cargo-raze` approach is feasible and the better solution for integrating Rust with Bazel in the long term. Perhaps for now, calling cargo directly for the most troublesome leaves of the dependency graph that we cannot otherwise cover would be a good stopgap measure. If there is a way to tell cargo not to download dependencies but rather to accept paths to them, that might be better. The multiple-dependencies issue should be fixed once @kchodorow's recursive WORKSPACE file parsing lands. Implementing this approach for Bazel, as well as similar efforts by others for integrating with other build systems, will also require changes to Cargo, such as surfacing more information about the dependency graph, build plan, etc., which is the purpose of the ongoing discussion on rust-lang/rust-roadmap#12. The feature requests that have been opened as a result of that discussion, such as rust-lang/cargo#3815 and rust-lang/cargo#3816, will help with implementing this approach as well.
Apologies for the delayed reply. I will post a more detailed proposal for the approach that I have in mind shortly.
My own apologies -- I haven't updated this with my latest stuff.
Agreed -- "cargo compile" probably isn't where we want to work from -- we should prefer "rust_library" generally. However, I'm still skeptical of "cargo_source" as a useful unit. It papers over the relationships between sources that are intrinsic to a "source", and is really synonymous with "new_http_archive". As mentioned before, I think having a unique external repository for each source adds more complexity, and leads to more ambiguity than necessary, especially when multiple versions of a crate are involved. My impression is that we'll be fighting cargo a bit more than is useful.
I'll provide more feedback on a future doc, but I'm thinking we should lean on a cargo plugin to generate a single workspace with all dependencies. cargo-vendor is a good starting point there. After that, I think we're on the same page -- use another (or the same) cargo plugin to generate build rules and link the dependencies using cargo's own build planning. I don't think we need to wait for a special cargo change to do this, provided we have an appetite for cargo spelunking. I've already started on that myself.
In short, a cargo_toml(...) workspace rule accepts an existing toml, does some cargo-vendor-like behavior into a genfiles directory, and some custom code runs cargo's compile machinery to generate the build targets. We can map those targets to rust_library rules without much fuss.
Looking forward to seeing your proposal!
Here's a quick hack-up of the cargo part of the workflow: https://github.com/acmcarther/cargo-raze-vendor
It generates WORKSPACE files like this: https://github.com/acmcarther/cargo-raze-vendor/blob/prototype/cargo/WORKSPACE
I haven't built any of the bazel-side glue (cargo_toml, new_crate_repository, crate_repository), but I think they'll be easier to work through. The gist of it is that cargo_toml shells out to cargo-raze-vendor, and the *crate_repository rules work just like git_repository rules except that they know how to pull from crates.io.
I should have a PR up specifically for this issue very soon (though it might be called new_crate_repository -- we can bikeshed on that when the PR is up).
EDIT:
Let's sync up some time soon. There's some thinking behind this super roundabout workflow (bazel_rule -> cargo -> bazel_rules). It's my way of addressing target-specific dependencies -- defer the dependency planning.
EDIT2:
I'm not sure we have anything to gain from a cargo_source or new_crate_repository rule or anything like that. There's nothing additional we can infer about the crate (because it depends on the crates around it). I've opted to use `new_http_archive` to pull crates, and `build_file_content` to specify the build steps needed.
The fundamental issue, and the dissonance between what folks expect and what they'll get, is that a "crate" doesn't make sense in a vacuum. You need a set of desired crates and a resolver to nail down the precise dependencies, versions, and feature flags.
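That point about needing a resolver can be made concrete with a toy example. This is not Cargo's actual algorithm, just an illustration of why a crate can't be resolved in a vacuum: the version chosen for a shared dependency depends on the requirements of every crate in the set.

```python
def pick_version(available, requirements):
    """Pick the newest available version satisfying every caller's
    half-open [lo, hi) range. Versions are tuples like (1, 4).
    Toy resolver for illustration only -- real resolution (Cargo's)
    also handles features, backtracking, yanked versions, etc."""
    candidates = [
        v for v in available
        if all(lo <= v < hi for lo, hi in requirements)
    ]
    if not candidates:
        raise ValueError("no version satisfies all requirements")
    return max(candidates)

# Two crates both depend on "serde"-like dep X: one wants >=1.0,<2.0,
# the other wants >=1.2,<2.0. The answer changes if either crate changes.
chosen = pick_version(
    available=[(1, 0), (1, 4), (2, 0)],
    requirements=[((1, 0), (2, 0)), ((1, 2), (2, 0))],
)
```

Here `chosen` is `(1, 4)`: version 2.0 is excluded by both ranges and 1.0 by the second, so adding or removing a single requirement shifts the whole answer.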
I'm interested in helping push this issue forward. What else needs to be done?
@softprops I'm still totally working on it and haven't abandoned it. I can give an update and see if there's something you think you might be able to explore as well:
TL;DR: cargo-raze is the cutting edge, with no great story for multiplat or build.rs files yet. rules_cargo is a totally different philosophy (and has awful docs). Both are pretty unpolished.
I've been exploring the solution space for the last couple of months. I've investigated:
1) Cargo-free workflows. Use cargo to squeeze out whatever dependency information it knows about your crates, and synthesize BUILD files from that (maybe deferred). cargo-raze fits in here.
2) Cargo-tolerant workflows. Use cargo to build a crate or two, but write your code using rust_library. rules_cargo does this, but there are some hairy questions about propagating linked dependencies. I can explain more if you'd like.
3) Full cargo workflows. Ditch rust_library completely. I don't think parallel build systems are in the "spirit" of bazel, so I've not dug too deep here. This also raises weird questions like "how do I use only one version of some C library", since Cargo will just bring in whatever it wants.
I'm making the most forward progress on (1), but I haven't really sat down to write up a design doc. (2) is blocked by weird linker issues (as mentioned above).
The largest unmet milestone before really publishing and integrating (1) here is a formal support story for replacing the role of build.rs. I have a notion of an "override", which I think should do that job via dependency replacement, additional compiler flags, and environment variables, but I haven't got an example yet.
EDIT: I'm going to really clean up docs on both repos so you can get a better picture for the state of the world.
Awesome. cargo-raze sounds like the perfect solution. https://github.com/johnynek/bazel-deps/blob/master/README.md follows a similar approach for the JVM community. I wouldn't get held up on 2 and 3; 1 solves the majority of cases, I think -- at least in my usage of cargo. I only used build.rs as a stopgap for using serde on stable Rust, but now that macro derives have arrived I can use serde on stable without build.rs files.
Really appreciate the work you're putting into this. I've had a great experience with Bazel's remote build cache, and I'm excited about what it can do for Rust productivity. Never having to build the same Rust code twice on any machine with the same arch will be awesome.
Has this gone anywhere lately? I see that the traffic on the Rust RFC has died down, but I don't know how that influences the outcome of this issue.
I'm not really in a position to help push this along, though I would like to be able to use Rust and Bazel together with the Crates.io ecosystem.
I may not be the best person to push this forward. I've chased my own tail all the way to reinventing stackage for cargo[1] as an imagined prerequisite.
My thought process isn't super well grounded, but I found that the naive vendoring approach breaks down in practice quickly, as the cargo ecosystem is way too keen to pull down many versions of dependencies. It's also impossible to verify that your vendored dependencies actually work, as cargo doesn't lock dev_dependencies -- so you can't actually `cargo test` (or equivalent) anything in the /vendor folder.
I've been working on this in a vacuum though, and if other people will find utility in a version of this that naively brings in dependencies, I can polish that up.
This gets more into thoughts on the ecosystem, but part of the reason why this is both a difficult problem, and particularly acute for Bazel integration is that a) The standard library is anemic (missing lots of stuff that is basically essential) b) Crates are, on average, way too small.
(a) means that we really need a solution to this problem before we can use bazel with any efficacy, and (b) means that we have to resolve a lot of crates for the average environment if we want to get anything done.
@acmcarther can you expand on the problems caused by naively vendoring multiple versions of dependencies?
I imagine at a minimum the current rules don't handle multiple versions of a crate in transitive dependencies.
Issue https://github.com/bazelbuild/rules_rust/pull/47 fixes 'multiple versions of the same crate'. With that and cargo-raze, it is possible to perform resolution for a single Cargo.toml without much issue.
I don't have a cohesive reasoning for going down the "alternative resolution strategy" track except that:
1) I'd like to avoid multiple versions of dependencies to reduce "found X.Struct, expected a different X.Struct" errors, a problem I expect to be compounded in an environment where your whole org uses what is effectively a single Cargo.toml.
2) I'd like to be able to test the vendored dependencies (to make sure they function at all!). This is something I can't do with Cargo's default dependency resolution, as it doesn't include dev_dependencies.
3) Some crates I develop or work on will not work at all if they're brought in more than once. For example, I have a global-flag library that does linker-level hacks that won't work right if it's brought in twice (or more).
Not a great set of reasons. If you feel that your use case would directly benefit by polishing what already exists, I can prioritize that. I have a repo that demonstrates auto vendoring and BUILD generation here: cargo-raze-examples.
EDIT: s/hobby project/repo: I was going to link next_space_coop, but ended up linking the more up to date cargo-raze-examples.
I'll have to play with raze-one-toml before I ask you to do anything.
Is [2] addressed by any Rust issues? It seems pretty important to be able to test the vendored crates.
TL;DR: No, there are no issues to my knowledge regarding locking dev_dependencies.
I believe that locking dev_dependencies would be considered a "breaking change", as it would change the existing output of resolution passes, and make some arrangements of crates that currently resolve become unresolvable.
@acmcarther raze solution works great so far. I wasn't using the forked rules (since I'm currently using my own fork), but haven't needed multiple versions of crates just yet.
I was a little surprised that `rand` brought in `fuchsia`, but that makes some amount of sense and isn't too bothersome.
In general I think raze is a pretty reasonable way to do the cargo integration.
I think with https://github.com/bazelbuild/rules_rust/pull/505 seemingly around the corner, it'd be worth restarting conversations here. It'd be fantastic to have these rules fully integrated with rust-analyzer and to have a way to use crate registries, so users can begin to take better advantage of the Rust ecosystem.
I'm not sure what the best approach would be for a `cargo_crate` repository rule, but I've started wondering if it'd be beneficial to move cargo-raze into these rules and begin to streamline its use, hopefully getting to a point where it no longer requires explicit human action. Maybe there's a decent path forward there?
Or perhaps when https://github.com/bazelbuild/bazel-gazelle/issues/938 is complete, it'd be possible to do something elegant with `gazelle`?
Very interested in hearing more thoughts 😄
At Apple, we put together an implementation we've called rules_rust_external (after `rules_jvm_external`). We'd love to contribute it as a potential base for this work.
We put it together quite quickly, so it has a few rough edges implementation-wise, but we think it has a decent UX, and we've been using it in anger for multiple non-trivial projects, including with dependencies on things like openssl, proc_macros, build scripts, and protobuf generation, on both OSX and Linux (we've not tried Windows, but have no reason to believe it wouldn't work).
We're not super attached to any aspects of it - names/structure/code/etc can change as needed, but we'd love to work together!
(I've just pushed the code to a branch - instructions for how to try it out are at the bottom of this comment)
In your WORKSPACE file you add a `crate_universe` call. This can point at zero or more Cargo.toml files and contain zero or more explicit crates to pull in, as well as allowing some customisation and overriding.
Simple examples:
Using one `Cargo.toml`:
crate_universe(
name = "rust_deps",
cargo_toml_files = ["//:Cargo.toml"],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
Using one `Cargo.toml` and an additional crate, and which uses a lockfile to skip re-resolving if nothing has changed:
crate_universe(
name = "basic_rust_deps",
cargo_toml_files = ["//:Cargo.toml"],
lockfile = "//:lockfile.lock",
packages = [
crate.spec(
name = "libc",
semver = "=0.2.76",
),
],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
Just one standalone crate:
crate_universe(
name = "basic_rust_deps",
packages = [
crate.spec(
name = "lazy_static",
semver = "1.4",
),
],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
This creates a repository named after the `crate_universe`, e.g. `rust_deps` or `basic_rust_deps` in those examples. Then in a BUILD file you can write something like:
Pull in individual crates as deps:
load("@basic_rust_deps//:defs.bzl", "crate")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = [
crate("lazy_static"),
],
)
Pull in all crates as deps:
load("@basic_rust_deps//:defs.bzl", "all_deps", "all_proc_macro_deps")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = all_deps(),
proc_macro_deps = all_proc_macro_deps(),
)
Pull in all crates from a specific `Cargo.toml`:
load("@basic_rust_deps//:defs.bzl", "crates_from")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = crates_from(":Cargo.toml"),
)
We have a repository rule which:
1. Parses the attrs from the `crate_universe` rule.
2. Calls `cargo` (using the `cargo` from your `rules_rust`-configured toolchain) via `cargo-raze` (as a library) to fetch assorted metadata.
3. Generates BUILD files similar to those `raze` would generate - we opt in a bunch of defaults to make most crates Just Work out of the box, e.g. enabling buildrs by default, adding more files as default data deps, etc.

We've got a few pieces we'd like to improve:
1. If you specify a lockfile for a `crate_universe`, and a referenced `Cargo.toml` or config in the WORKSPACE is changed, we hard error and tell you to re-generate the lockfile. Right now, this lockfile is keyed on all of the input, and its contents are the entire generated repository. Ideally, its contents would be only the result of the resolve done by `cargo`, as that is the slow and non-deterministic thing, rather than all of the generated output. Also, ideally this would interoperate nicely with `Cargo.lock` files, but that needs a bit more thought.
2. The resolver is in `rules_rust_external`, and we bootstrap in consumer repos by downloading a pre-built version of the binary which we publish for each of the platforms we care about. All of this is a little untidy...
3. Most of the templates use `tera`, but a few of them are hard-coded string formats.
4. Some crates need manual overrides (see `rules_rust_external/examples/has_aliased_deps/openssl.BUILD` for an example). There is no easy way to share these in Bazel today, although the Bazel External Dependencies design may help here.

To try it out: check out this branch, `cd` into `rules_rust_external/resolver` and run `cargo build`, then export an env var: `RULES_RUST_RESOLVER_URL_OVERRIDE=file:///path/to/resolver`.
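The lockfile behaviour described above (keyed on all of the inputs, hard-erroring when any of them change) can be sketched roughly as follows. The helper names here are hypothetical, not the actual implementation:

```python
import hashlib
import json

def inputs_digest(cargo_tomls: dict, config: dict) -> str:
    """Digest of everything that feeds the resolve: referenced Cargo.toml
    contents plus the crate_universe config. Serialized with sorted keys
    so the digest is deterministic."""
    blob = json.dumps(
        {"tomls": cargo_tomls, "config": config}, sort_keys=True
    )
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()

def lockfile_is_current(lockfile: dict, cargo_tomls: dict, config: dict) -> bool:
    """True if the stored digest matches the current inputs; otherwise the
    rule would hard-error and ask the user to re-generate the lockfile."""
    return lockfile.get("digest") == inputs_digest(cargo_tomls, config)
```

The improvement discussed above would shrink what the digest covers: key it only on the resolve inputs/outputs rather than the entire generated repository.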
There's a set of examples in `rules_rust_external/examples` - you should be able to `bazel build` and `bazel test` in those directories (each of which is its own WORKSPACE). Each one shows something different: `basic` is a hello world, `uses_sys_crate` shows native compilation working in build scripts and uses a lockfile, `uses_proc_macro` uses proc macros directly, and `has_aliased_deps` has a complex web of transitive native deps including `openssl` built with `rules_foreign_cc`, `openssh`, and others, some of which are renamed.
I've put this together as a PR into `rules_rust`, but it would probably be better placed as its own repository in `bazelbuild`, like `rules_jvm_external` is :)
This work builds on the amazing work already put into `cargo-raze`, and of course `rules_rust`, by a fantastic community - thanks for making such amazing software and being such a friendly, welcoming community!
/cc @dfreese @acmcarther @mfarrugi @smklein @damienmg
@illicitonion You're an absolute legend!! 🙏 This looks awesome!!
Can you elaborate on why you think this should be a separate repo? Not strongly in favor one way or another. Just wanna hear your thoughts
I don't have any strong reasons, but my general thinking is:
1. The crates stuff is a fairly clean layer atop `rules_rust`, and could reasonably have multiple implementations with different opinions (for instance, `cargo-raze` today by default takes an "Avoid executing arbitrary code as part of your build" approach whereas `rules_rust_external` today by default takes an "Avoid manual intervention" approach), and so keeping it separate feels like it makes way for others with other opinions to do their own thing in their own space.
2. I'd like `rules_rust` to not be in the business of producing binaries that need downloading. In particular, the release cadence is likely to be different - `rules_rust_external` would want actual releases, and they wouldn't necessarily align with `rules_rust`. (That said, maybe they should - for sure breaking changes in `rules_rust` are likely to need a corresponding change in `rules_rust_external` - maybe really what I'm saying is: We should work out `rules_rust`'s release strategy before deciding...).
3. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.

By way of precedent, `rules_jvm_external` is separate from `rules_java` (though `rules_java` is a bit weird, being a wrapper around native bazel rules), and `rules_python_external` ended up being merged into `rules_python`.
1. The crates stuff is a fairly clean layer atop `rules_rust`, and could reasonably have multiple implementations with different opinions (for instance, `cargo-raze` today by default takes an "Avoid executing arbitrary code as part of your build" approach whereas `rules_rust_external` today by default takes an "Avoid manual intervention" approach), and so keeping it separate feels like it makes way for others with other opinions to do their own thing in their own space.
I don't think `cargo-raze` necessarily takes an "Avoid executing arbitrary code as part of your build" approach. The only component of raze that comes to mind is `default_gen_buildrs`, which, per the conclusion in https://github.com/bazelbuild/rules_rust/issues/490, I believe should change. I have a branch to start this transition here (https://github.com/google/cargo-raze/pull/303). I otherwise feel that raze can easily be used in contexts where the focus is "avoid manual intervention".
2\. In particular, the release cadence is likely to be different - `rules_rust_external` would want actual releases, and they wouldn't necessarily align with `rules_rust`. (That said, maybe they should - for sure breaking changes in `rules_rust` are likely to need a corresponding change in `rules_rust_external` - maybe really what I'm saying is: We should work out `rules_rust`'s release strategy before deciding...).
100% agree 😄 https://github.com/bazelbuild/rules_rust/issues/415
3\. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.
Good point. I wonder if it'd be possible to add Bazel support (https://github.com/google/cargo-raze/issues/61) to `cargo-raze` and move that repo to `rules_rust_external`, where your changes can be built on top of it (since at a glance it looks like it could be, especially since you're using `cargo_raze` as a library).
Not sure why the quotes got all funny in https://github.com/bazelbuild/rules_rust/issues/2#issuecomment-738933976
Anywho, super excited about `rules_rust_external`, and it's definitely shifted my perspective on some changes I'd planned for `cargo-raze`. Again, big thanks for all your work here @illicitonion 🙏
@illicitonion Does `rules_rust_external` support registries other than crates.io?
3\. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.
Good point. I wonder if it'd be possible to add Bazel support (google/cargo-raze#61) to `cargo-raze` and move that repo to `rules_rust_external`, where your changes can be built on top of it (since at a glance it looks like it could be, especially since you're using `cargo_raze` as a library).
Though the BUILD files aren't checked in in this code-dump, we do have `rules_rust_external` building with bazel on both Linux and Mac, including statically linking openssl, so we have sufficient generated BUILD files to build `cargo-raze`. So... This is plausible, but it brings us back to the bootstrapping question :)
@illicitonion Does rules_rust_external support registries other than crates.io?
We've tested that you can use mirrors for crate downloads, but we haven't tested alternative registries. My assumption is that the resolution code will work (because it delegates to `cargo-metadata`), but we do assume that all crates are downloaded from one place. So if you have a single alternate registry, things will likely work; but if you use multiple registries, we'd probably need to turn our `repository_template` param into a `registry_fetch_template` dict, and track which registry each crate comes from. I expect this to be relatively trivial, but I haven't worked with alternate registries enough to be confident in that claim.
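The suggested `registry_fetch_template` dict might look roughly like this. This is a hypothetical sketch of the idea, not the actual rules' API; the function name and template format are illustrative only:

```python
# Fallback when a crate's registry has no template configured; this URL
# pattern matches the crates.io download endpoint seen earlier in the
# thread (/api/v1/crates/{crate}/{version}/download).
DEFAULT_TEMPLATE = "https://crates.io/api/v1/crates/{crate}/{version}/download"

def fetch_url(registry_fetch_template: dict, crate: str, version: str,
              registry: str = "crates-io") -> str:
    """Resolve the download URL for a crate, keyed by which registry the
    resolver recorded it as coming from."""
    template = registry_fetch_template.get(registry, DEFAULT_TEMPLATE)
    return template.format(crate=crate, version=version)
```

Tracking the source registry per crate is the other half: the resolver would record it alongside name and version so the repository rule knows which template to apply.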
This may also help solve some bootstrap problems in #421, so looking forward to the progress!
@illicitonion would you be able to update the bindgen, proto, and wasm-bindgen rules on your branch to use `crate_universe`, to better demonstrate how this future would look?
To bootstrap cargo-raze into a bazel build, we need to commit generated BUILD files once, and then subsequent raze invocations can be done from the bazel binary. Am I missing anything? I don't recall why this wasn't done, so maybe it's not so simple.
I don't think I understand what you're saying here, can you elaborate? 😅
What BUILD files are you generating and committing?
A cargo build of cargo-raze should be able to generate the BUILD files for future builds of cargo-raze.
Ah yeah, that's correct. I was planning on setting that up this weekend in that project, but am waiting on some other PRs to go through (some issues with TravisCI mean something needs to be done about CI, and moving to building with Bazel seems to make sense).
@illicitonion would you be able to update the bindgen, proto, and wasm-bindgen rules on your branch to use `crate_universe`? To better demonstrate how this future would look?
I realistically don't have enough knowledge of any of those pieces to put together something meaningful here... Can you point me at an example today where you have to do something you don't like, which you think this could help with?
I don't think you really need any knowledge of those sections of the rules. Each demonstrates some unique behavior or interaction that is currently solved by cargo-raze. If you have an understanding of cargo-raze, then I think you have all you need to try to replace the use of `cargo-raze` with `crate_universe` there.
The one I'm most interested in is `rust_wasm_bindgen`.
I think as of https://github.com/google/cargo-raze/commit/0d7637108fd128984ab5e06dcab886922a9ffe9b (release 0.9.0), the functionality missing from the library is now there, unblocking some of the exploration needed for `crate_universe`.
I just pushed an additional commit to the branch which pulls in the latest cargo-raze, both fixing some issues and meaning we're no longer using a fork :)
Disclaimer: I have very little experience with raze and in general with the repository rules that fetch things from package managers, like rules_jvm_external. I also see that a lot of thought went into this, so I'm happy to approve the PR if you tell me it solves all the problems.
Some questions from an uninformed reader:
1. How are build.rs scripts handled? Is there a way to override their output/override the Rust targets generated?
2. Are build.rs scripts compatible with remote execution (for example, my remote execution worker has different system packages installed) and cross-compilation (for example, I want to build a Linux binary from my Mac)?
3. Is the approach compatible with rust-analyzer? Can we hope to have a way to automatically generate rust-analyzer config for the entire workspace?
4. Does anybody know what Buck does? I believe they have a similar solution; it might be useful to join forces/get inspiration from them.
5. Cargo is generally fine with multiple versions of a single crate in the build. Is that something we want to follow, or do we want to be more opinionated? What's the error message in the latter case?
- How are build.rs scripts handled? Is there a way to override their output/override the Rust targets generated?

There is already a cargo_build_script target in rules_rust, with a custom environment runner which re-implements much of Cargo's build script handling. This ruleset generates these targets, and so supports whatever is supported by that rule.
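As an illustration (a minimal sketch, not the exact output of crate_universe or cargo-raze; target names and load paths are assumptions), a generated package for a crate with a build.rs might look like:

```starlark
# Hypothetical generated BUILD file for a crate that ships a build.rs.
# Names and attributes are illustrative only.
load("@rules_rust//cargo:defs.bzl", "cargo_build_script")
load("@rules_rust//rust:defs.bzl", "rust_library")

# Compile and run the crate's build.rs under Bazel's sandbox;
# its generated sources/flags are surfaced to dependents.
cargo_build_script(
    name = "build_script",
    srcs = ["build.rs"],
    edition = "2018",
)

rust_library(
    name = "some_crate",
    srcs = glob(["src/**/*.rs"]),
    edition = "2018",
    deps = [":build_script"],
)
```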
- Are build.rs scripts compatible with remote execution (for example my remote execution worker has different system packages installed)
This is in the control of the rule author. Our private RBE deployment has a very minimal remote execution image, and we build system-like packages either using support for compiling them in -sys crates, or using things like rules_foreign_cc. We then put the outputs in places that build.rs will find them (generally by setting an env var) - you can see an example in https://github.com/bazelbuild/rules_rust/blob/main/examples/complex_sys/raze/remote/BUILD.openssl-sys-0.9.60.bazel where we set OPENSSL_DIR to point at where openssl was built with rules_foreign_cc.
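Modeled loosely on that linked BUILD file, the pattern is roughly the following (a sketch; the @openssl label and attribute values are assumptions, not the file's exact contents):

```starlark
# Sketch: steer openssl-sys's build.rs toward an openssl built by
# rules_foreign_cc instead of a system-installed copy.
cargo_build_script(
    name = "openssl_sys_build_script",
    srcs = ["build/main.rs"],
    build_script_env = {
        # build.rs checks OPENSSL_DIR before probing the system.
        "OPENSSL_DIR": "$(execpath @openssl//:openssl)",
    },
    # Make the built openssl available to the script at run time.
    data = ["@openssl//:openssl"],
)
```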
If folks want to have some run-time environment installed on their remote execution systems and use that, it's up to them to make sure it's compatible with wherever they're going to be deploying.
In practice, we've used the minimal environment approach to build things like brotli, bzip2, fsevent, inotify, libgit2, openssh, openssl, ring, ...
and cross compilation (for example I want to build a linux binary from my mac)?
We haven't experimented with this, but because everything uses platforms and toolchains, things should work properly, as long as the toolchains are properly configured. But I can't vouch for it personally through experience :)
- Is the approach compatible with rust analyzer? Can we hope to have a way to automatically generate rust analyzer config for the entire workspace?
We haven't tried, but it should be - there's no magic here, it's just generating rust_library/cargo_build_script invocations. We may discover some edge-cases around rust-analyzer support assuming files will be found inside the main workspace and needing to handle/add ../external in some paths somewhere, but there shouldn't be anything fundamentally hard here :)
- Does anybody know what Buck does? I believe they have a similar solution, but it might be useful to join forces/get inspiration from them?
I don't know - https://buck.build/ only mentions rust_{binary,library,test} rules, and a way of pulling in a .rlib. I know I've seen talk on internals.rust-lang.org of consuming the output of cargo metadata or similar (as cargo-raze does), but I can't find any documentation on the website or references in Buck's codebase.
It's worth noting, though, that this branch really just adds a wrapper around cargo-raze to execute as a repository rule - any sharing should probably be done at the cargo-raze library layer :)
- Cargo will in general be fine with multiple versions of a single crate in the build, is that something we want to follow or we want to be more opinionated? What's the error message in the latter case?
Right now, within each universe, the implementation allows for one version per cargo-style configuration, so you can have independent versions in normal/build/dev deps, but within each configuration your version requirements need to be compatible and exactly one will be chosen. You can theoretically create multiple universes each of which contains a different version (though right now it's possible this will lead to naming clashes for the repositories we generate, but that's easily fixed) by doing something like:
crate_universe(
    name = "rust_deps_old_libc",
    packages = [
        crate.spec(
            name = "libc",
            semver = "=0.2.76",
        ),
    ],
)

crate_universe(
    name = "rust_deps_new_libc",
    packages = [
        crate.spec(
            name = "libc",
            semver = "=0.2.86",
        ),
    ],
)
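Assuming each universe exposes its crates through its own generated repository (the exact label layout here is an assumption, not confirmed behavior), a consumer could then pick a libc version per target:

```starlark
# Hypothetical consumption of the two universes above; the
# "@rust_deps_old_libc//:libc" label form is an assumption.
rust_library(
    name = "needs_old_libc",
    srcs = ["src/lib.rs"],
    deps = ["@rust_deps_old_libc//:libc"],
)

rust_library(
    name = "needs_new_libc",
    srcs = ["src/lib.rs"],
    deps = ["@rust_deps_new_libc//:libc"],
)
```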
I'm happy to approve the PR if you tell me it solves all the problems.
🎉 Thanks!
I think the big issue we need to resolve before PRing is how to bootstrap. We need a compiled Rust binary, which has many dependencies, in order to invoke our repository rule. Internally, we've been building one on CI as a merge hook, and having a function in the workspace create an http_file for each operating system we build for. Something like this could work in rules_rust, but would require us to have somewhere to publish to, and need some machinery to do the build and publish.
Alternatively, as we've been talking about releases elsewhere, we could add "build binaries and put them in a tar" to the release process, but including per-platform compiled binaries would make it a sizeable tar...
Thoughts very welcome here!
Once we've worked that out, I'll put together a PR so we can do a proper review. It's a bit of a code-dump and, as noted above, there are a few areas with hacks in place, but it'd be great if we could tidy those up after the fact in small targeted reviews, rather than trying to get everything perfect up-front, if that's ok with folks!
Something like this could work in rules_rust, but would require us to have somewhere to publish to, and need some machinery to do the build and publish.
I believe other rules like rules_docker (maybe rules_go and bazel-gazelle too?) build and publish binaries to https://storage.googleapis.com; hopefully we can set up something similar for the rust_external resolver:
if "go_puller_darwin" not in excludes:
    http_file(
        name = "go_puller_darwin",
        executable = True,
        sha256 = "4855c4f5927f8fb0f885510ab3e2a166d5fa7cde765fbe9aec97dc6b2761bb22",
        urls = [("https://storage.googleapis.com/rules_docker/" + RULES_DOCKER_GO_BINARY_RELEASE + "/puller-darwin-amd64")],
    )
I think the CI config is:
# Step: push the puller release binary for Linux AMD64
- name: "gcr.io/cloud-builders/gsutil"
  args:
    - "cp"
    - "-a"
    - "public-read"
    - "bazel-bin/container/go/cmd/puller/puller_/puller"
    - "gs://rules_docker/$COMMIT_SHA/puller-linux-amd64"
  id: "push-puller-linux-amd64"
  waitFor: ["build-linux-amd64"]
I'd love to see some of the built-in rules get converted to use crate_universe in a branch before merging. I think folks are generally familiar with at least one of the unique rules in rules_rust that rely on external crates (rust_wasm_bindgen, rust_proto_library, etc...) and it'd be much easier to understand what's going on by looking at a 1-1 mapping of how this is used. I'm personally most interested in rust_wasm_bindgen and the complex_sys example since I think they exercise enough common functionality.
Edit: Actually, the has_aliased_deps example covers what I wanted to see from complex_sys.
Though that said, I'd love to be able to utilize the existing Cargo.lock file and not have to generate some other .bzl file. @illicitonion and I discussed this a bit and I need to do some more digging myself, but I have a feeling that might be possible and a realistic thing to do. The concern with using the lockfile is about regeneration, but I think if users add the Cargo.lock file as a dependency to the repository rule, and the rule returns a dict of its attributes, then it'll only get re-run when the lockfile changes. This might not be the case though, and maybe a sha256 is also required? Can repository rules detect changes to their label attributes? Maybe @hlopko might know?
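For reference, the mechanism in question can be sketched as follows (a hypothetical rule, not the branch's actual implementation): reading a label attribute through repository_ctx registers a dependency on that file, so Bazel re-runs the rule when the file's contents change.

```starlark
# Hypothetical sketch of a repository rule keyed off Cargo.lock.
# repository_ctx.read(label) creates a dependency on the file, so the
# rule is re-executed whenever the lockfile changes.
def _crates_impl(repository_ctx):
    lockfile_contents = repository_ctx.read(repository_ctx.attr.lockfile)
    # ... resolve crates from lockfile_contents and generate BUILD files ...
    repository_ctx.file("BUILD.bazel", "")

crates_repository = repository_rule(
    implementation = _crates_impl,
    attrs = {
        "lockfile": attr.label(allow_single_file = True),
    },
)
```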
Though that said, I'd love to be able to utilize the existing Cargo.lock file and not have to generate some other .bzl file.
Not generating a .bzl file is one of the improvements we agreed on wanting as well, so 👍 to that.
From our point of view it would be nice to land this if it's useful, experimentally maybe, so we can iterate here instead of maintaining this as a branch and also internally.
I think that's fine then. My only asks would be to have detailed documentation on the Starlark and Rust code that went into this, and to move the rule into cargo/universe or something.
I'd love to see some of the built-in rules get converted to use crate_universe in a branch before merging. I think folks are generally familiar with at least one of the unique rules in rules_rust that rely on external crates (rust_wasm_bindgen, rust_proto_library, etc...) and it'd be much easier to understand what's going on by looking at a 1-1 mapping of how this is used. I'm personally most interested in rust_wasm_bindgen and the complex_sys example since I think they exercise enough common functionality.
I just updated examples/complex_sys to the HEAD of https://github.com/illicitonion/rules_rust/tree/rules_rust_external :)
I think it's time to open a PR for this then 😄 So we can appropriately comment on the changes.
Add a cargo_crate repository rule for fetching Rust crates from crates.io.