Closed: davidzchen closed this issue 3 years ago.
I would love to help out with this feature.
I am very interested in using Bazel to build projects with Rust code and Cargo support is a must-have. Is the linked document about repository rules the best place to start, or are there other good documents to read?
Awesome! The document linked above is the original design doc for Skylark repository rules. We now have documentation for repository rules as well.
I took a brief look at the `cargo` tool and was not able to find a way to download a single, specific crate; the `cargo` commands are rather high-level and aimed at end-to-end use cases. We would have to put together our own tool for fetching crates, which should be fairly straightforward using the crates.io API (which does not appear to be publicly documented):
❯❯❯ http get https://crates.io/api/v1/crates/acc_reader
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/json; charset=utf-8
Date: Mon, 25 Apr 2016 22:34:51 GMT
Server: nginx
Set-Cookie: cargo_session=--PBVtejYpqgihoNU4gy+Z6jCDXMg=; HttpOnly; Secure; Path=/
Strict-Transport-Security: max-age=31536000
Transfer-Encoding: chunked
Via: 1.1 vegur
{
"crate": {
"created_at": "2016-01-30T22:53:11Z",
"description": "A wrapper for std::io::Read providing std::io::Seek",
"documentation": "http://netvl.github.com/acc_reader/",
"downloads": 122,
"homepage": null,
"id": "acc_reader",
"keywords": [
"seek",
"reader",
"input"
],
"license": "MIT/Apache-2.0",
"links": {
"owners": "/api/v1/crates/acc_reader/owners",
"reverse_dependencies": "/api/v1/crates/acc_reader/reverse_dependencies",
"version_downloads": "/api/v1/crates/acc_reader/downloads",
"versions": null
},
"max_version": "2.0.0",
"name": "acc_reader",
"repository": "https://github.com/netvl/acc_reader",
"updated_at": "2016-02-01T21:10:08Z",
"versions": [
21719,
21622
]
},
"keywords": [
{
"crates_cnt": 1,
"created_at": "2016-01-30T22:53:11Z",
"id": "seek",
"keyword": "seek"
},
{
"crates_cnt": 8,
"created_at": "2015-01-10T22:21:22Z",
"id": "reader",
"keyword": "reader"
},
{
"crates_cnt": 19,
"created_at": "2014-12-12T13:53:10Z",
"id": "input",
"keyword": "input"
}
],
"versions": [
{
"crate": "acc_reader",
"created_at": "2016-02-01T21:10:08Z",
"dl_path": "/api/v1/crates/acc_reader/2.0.0/download",
"downloads": 111,
"features": {},
"id": 21719,
"links": {
"authors": "/api/v1/crates/acc_reader/2.0.0/authors",
"dependencies": "/api/v1/crates/acc_reader/2.0.0/dependencies",
"version_downloads": "/api/v1/crates/acc_reader/2.0.0/downloads"
},
"num": "2.0.0",
"updated_at": "2016-02-01T21:10:08Z",
"yanked": false
},
{
"crate": "acc_reader",
"created_at": "2016-01-30T22:53:11Z",
"dl_path": "/api/v1/crates/acc_reader/1.0.0/download",
"downloads": 11,
"features": {},
"id": 21622,
"links": {
"authors": "/api/v1/crates/acc_reader/1.0.0/authors",
"dependencies": "/api/v1/crates/acc_reader/1.0.0/dependencies",
"version_downloads": "/api/v1/crates/acc_reader/1.0.0/downloads"
},
"num": "1.0.0",
"updated_at": "2016-01-30T22:53:11Z",
"yanked": false
}
]
}
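For illustration, a minimal fetcher built on that (undocumented) API might look like the sketch below. This is hypothetical helper code, not part of any proposed rule; it just picks the `dl_path` for a requested version out of the JSON shown above and downloads the tarball.

```python
import json
import urllib.request

CRATES_API = "https://crates.io/api/v1/crates"

def download_path(crate_metadata: dict, version: str) -> str:
    """Return the dl_path for the requested, non-yanked version,
    e.g. "/api/v1/crates/acc_reader/2.0.0/download"."""
    for v in crate_metadata["versions"]:
        if v["num"] == version and not v["yanked"]:
            return v["dl_path"]
    raise KeyError(f"version {version} not found or yanked")

def fetch_crate(name: str, version: str) -> bytes:
    """Fetch crate metadata from crates.io, then download the .crate tarball."""
    with urllib.request.urlopen(f"{CRATES_API}/{name}") as resp:
        meta = json.load(resp)
    path = download_path(meta, version)
    with urllib.request.urlopen(f"https://crates.io{path}") as resp:
        return resp.read()
```

A repository rule could then unpack the returned tarball and overlay a generated BUILD file on it.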
Ideally, it would be nice to reuse some of the code in Cargo itself, but I haven't looked into the implementation in enough detail to know how feasible it is.
Thanks for the information! I've been looking at Cargo's libraries to see what code we can reuse. It'd be really nice if I could write a cargo command to do what we need.
There is a third-party Cargo subcommand, `cargo clone`, that can fetch a crate's source from crates.io (it also sounds like it may be included in future Cargo releases: link). I figure we can install the subcommand via `cargo install` and use it to fetch a `cargo_crate`'s source code.
Bikeshed: I might prefer this be called `cargo_source` or `cargo_crate_source`, to indicate that these are not valid targets for an arbitrary `rust_library` to depend on.
Good point. The word "crate" in Rust parlance seems to refer more accurately to compiled binaries than to the source code. Let's go with `cargo_source`.
Would you like to take this one since you already have code for fetching crates in your cargo2bazel tool? Having rules call a Python command line tool is completely fine and is the approach taken by the `pkg_*` and Skydoc build rules.
Yeah, I can take this issue.
Thanks!
Update:
TL;DR: [cargo_source + WORKSPACE codegen + BUILD codegen] isn't likely to pan out as well as a hack around cargo, at least in the short term
I don't have a PR up yet, but I'm currently spiking something more like a `cargo_toml` repository rule. Currently, I'm not confident that a `cargo_source` is a good seam between cargo and bazel, because it forces us to do a bunch of the work that cargo already does (some of which I did in cargo2bazel). I think we'll always be playing catch-up unless we let cargo handle dependency resolution end to end, which is very hard to do at the source level of granularity.
This line of reasoning was initially spurred when I was exploring repository rules in the abstract, and I realized that these `cargo_source`s would have to be namespaced independently. For example, a "libc" import would have to live somewhere in "@libc//...:libc", while a "clap" dependency would have to live in "@clap//...:clap". This could quickly get confusing if you're automatically generating these and end up with more than one version of a dependency (very common!). And, at the end of the day, the workspace probably doesn't care about most of these! We don't want visibility into transitive dependencies, or to handle the complexity of feature flags for transitive dependencies.
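The multiple-versions ambiguity can be sidestepped by encoding the version into the external repository name. The helper below is purely illustrative (a hypothetical `repo_name`, not any rule's actual scheme), but it shows how "@libc" could become an unambiguous "@libc__0_2_76":

```python
def repo_name(crate: str, version: str) -> str:
    """Mangle a crate name + version into a valid Bazel repository name.
    Bazel repo names allow letters, digits, and underscores, so dots,
    dashes, and plus signs in the version are replaced."""
    safe_version = (
        version.replace(".", "_").replace("-", "_").replace("+", "_")
    )
    return f"{crate.replace('-', '_')}__{safe_version}"
```

With a scheme like this, two versions of the same crate can coexist in the WORKSPACE without colliding, at the cost of uglier labels.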
I recall that kythe is using something quick and dirty that calls cargo directly. I'd bet we could support a 100% integration with cargo by wrapping and cleaning that up, since cargo already builds targets that our own `rust_library` rules can consume.
All of that said, we could always go the whole way, and replicate as much of cargo's functionality as possible, generate cargo sources, multiple build rules for each feature flag subset, and handle target platforms, but I'm not confident that's the fastest (or most reasonable) approach.
`cargo_source` is just one piece of the solution. It gives us a workspace rule that can be used to download the source for a single crate for building. The larger piece would be the proposed `cargo raze` tool, which will use Cargo's code to generate BUILD and WORKSPACE files based on the dependency graph constructed by Cargo.
For calling cargo directly, do you mean doing so for the most troublesome crates, such as `-sys` crates? For that case, it may be a good stopgap measure. However, as discussed in rust-lang/rust-roadmap#12, the problem with simply having the rules call `cargo` in general is that builds would not be hermetic and we would not be able to take advantage of the features offered by Bazel.
I think the `cargo_source` + `cargo-raze` approach is feasible and the better solution for integrating Rust with Bazel in the long term. Perhaps for now, calling cargo directly for the most troublesome leaves of the dependency graph that we cannot otherwise cover would be a good stopgap measure. If there is a way to tell cargo not to download dependencies but rather to accept paths to them, that might be better. The multiple-dependencies issue should be fixed once @kchodorow's recursive WORKSPACE file parsing lands. Implementing this approach for Bazel, as well as similar efforts by others for integrating with other build systems, will also require changes to Cargo, such as surfacing more information about the dependency graph, build plan, etc., which is the purpose of the ongoing discussion on rust-lang/rust-roadmap#12. The feature requests that have been opened as a result of that discussion, such as rust-lang/cargo#3815 and rust-lang/cargo#3816, will help with implementing this approach as well.
Apologies for the delayed reply. I will post a more detailed proposal for the approach that I have in mind shortly.
My own apologies -- I haven't updated this with my latest stuff.
Agreed -- "cargo compile" probably isn't where we want to work from -- we should prefer "rust_library" generally. However, I'm still skeptical of "cargo_source" as a useful unit. It papers over the relationships between sources that are intrinsic to a "source", and is really synonymous with "new_http_archive". As mentioned before, I think having a unique external repository for each source adds more complexity, and leads to more ambiguity than necessary, especially when multiple versions of a crate are involved. My impression is that we'll be fighting cargo a bit more than is useful.
I'll provide more feedback on a future doc, but I'm thinking we should lean on a cargo plugin to generate a single workspace with all dependencies. cargo-vendor is a good starting point there. After that, I think we're on the same page -- use another (or the same) cargo plugin to generate build rules and link the dependencies using cargo's own build planning. I don't think we need to wait for a special cargo change to do this, provided we have an appetite for cargo spelunking. I've already started on that myself.
In short, a cargo_toml(...) workspace rule accepts an existing toml, does some cargo-vendor-like behavior into a genfiles directory, and some custom code runs cargo's compile machinery to generate the build targets. We can map those targets to rust_library rules without much fuss.
Looking forward to seeing your proposal!
Here's a quick hack-up of the cargo part of the workflow: https://github.com/acmcarther/cargo-raze-vendor
It generates WORKSPACE files like this: https://github.com/acmcarther/cargo-raze-vendor/blob/prototype/cargo/WORKSPACE
I haven't built any of the bazel-side glue (cargo_toml, new_crate_repository, crate_repository), but I think they'll be easier to work through. The gist of it is that cargo_toml shells out to cargo-raze-vendor, and the *crate_repository rules work just like git_repository rules except that they know how to pull from crates.io.
I should have a PR up specifically for this issue very soon (though it might be called new_crate_repository -- we can bikeshed on that when the PR is up).
EDIT:
Let's sync up some time soon. There's some thinking behind this super roundabout workflow (bazel_rule -> cargo -> bazel_rules). It's my way of addressing target-specific dependencies -- defer the dependency planning.
EDIT2:
I'm not sure we have anything to gain from a cargo_source or new_crate_repository rule or anything like that. There's nothing additional we can infer about the crate (because it depends on the crates around it). I've opted to use `new_http_archive` to pull crates, and `build_file_content` to specify the build steps needed.
The fundamental issue, and the dissonance between what folks expect and what they'll get, is that a "crate" doesn't make sense in a vacuum. You need a set of desired crates and a resolver to nail down the precise dependencies, versions, and feature flags.
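That point about needing a resolver can be made concrete with a toy example. This is not Cargo's actual algorithm, just an illustration of why a crate can't be resolved in a vacuum: the version chosen for a shared dependency depends on the requirements of every crate in the set.

```python
def pick_version(available, requirements):
    """Pick the newest available version satisfying every caller's
    half-open [lo, hi) range. Versions are tuples like (1, 4).
    Toy resolver for illustration only -- real resolution (Cargo's)
    also handles features, backtracking, yanked versions, etc."""
    candidates = [
        v for v in available
        if all(lo <= v < hi for lo, hi in requirements)
    ]
    if not candidates:
        raise ValueError("no version satisfies all requirements")
    return max(candidates)

# Two crates both depend on "serde"-like dep X: one wants >=1.0,<2.0,
# the other wants >=1.2,<2.0. The answer changes if either crate changes.
chosen = pick_version(
    available=[(1, 0), (1, 4), (2, 0)],
    requirements=[((1, 0), (2, 0)), ((1, 2), (2, 0))],
)
```

Here `chosen` is `(1, 4)`: version 2.0 is excluded by both ranges and 1.0 by the second, so adding or removing a single requirement shifts the whole answer.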
I'm interested in helping push this issue forward. What else needs to be done?
@softprops I'm still totally working on it and haven't abandoned it. I can give an update and see if there's something you think you might be able to explore as well:
TL;DR: cargo-raze is the cutting edge, with no great story for multiplat or build.rs files yet. rules_cargo is a totally different philosophy (and has awful docs). Both are pretty unpolished.
I've been exploring the solution space for the last couple of months. I've investigated:
1) Cargo-free workflows. Use cargo to squeeze out whatever dependency information it knows about your crates, and synthesize BUILD files from that (maybe deferred). cargo-raze fits in here.
2) Cargo-tolerant workflows. Use cargo to build a crate or two, but write your code using rust_library. rules_cargo does this, but there are some hairy questions about propagating linked dependencies. I can explain more if you'd like.
3) Full cargo workflows. Ditch rust_library completely. I don't think parallel build systems are in the "spirit" of bazel, so I've not dug too deep here. This also raises weird questions like "how do I use only one version of some C library", since Cargo will just bring in whatever it wants.
I'm making the most forward progress on (1), but I haven't really sat down to write up a design doc. (2) is blocked by weird linker issues (as mentioned above).
The largest unmet milestone before really publishing and integrating (1) here is a formal support story for replacing the role of build.rs. I have a notion of an "override", which I think should do that job via dependency replacement, additional compiler flags, and environment variables, but I haven't got an example yet.
EDIT: I'm going to really clean up docs on both repos so you can get a better picture for the state of the world.
Awesome. cargo-raze sounds like the perfect solution. https://github.com/johnynek/bazel-deps/blob/master/README.md follows a similar approach for the JVM community. I wouldn't get held up on 2 and 3; 1 solves the majority of cases, I think -- at least in my usage of cargo. I only used build.rs as a stopgap for using serde on stable Rust, but now that macro derives have arrived I can use serde on stable without build.rs files.
Really appreciate the work you're putting into this. I've had a great experience with Bazel's remote build cache, and I'm excited about what it can do for Rust productivity. Never having to build the same Rust code twice on any machine with the same arch will be awesome.
Has this gone anywhere lately? I see that the traffic on the Rust RFC has died down, but I don't know how that influences the outcome of this issue.
I'm not really in a position to help push this along, though I would like to be able to use Rust and Bazel together with the Crates.io ecosystem.
I may not be the best person to push this forward. I've chased my own tail all the way to reinventing stackage for cargo[1] as an imagined prerequisite.
My thought process isn't super well grounded, but I found that the naive vendoring approach breaks down in practice quickly, as the cargo ecosystem is way too keen to pull down many versions of dependencies. It's also impossible to verify that your vendored dependencies actually work, as cargo doesn't lock dev_dependencies -- so you can't actually `cargo test` (or equivalent) anything in the /vendor folder.
I've been working on this in a vacuum though, and if other people will find utility in a version of this that naively brings in dependencies, I can polish that up.
This gets more into thoughts on the ecosystem, but part of the reason why this is both a difficult problem, and particularly acute for Bazel integration is that a) The standard library is anemic (missing lots of stuff that is basically essential) b) Crates are, on average, way too small.
(a) means that we really need a solution to this problem before we can use bazel with any efficacy, and (b) means that we have to resolve a lot of crates for the average environment if we want to get anything done.
@acmcarther can you expand on the problems caused by naively vendoring multiple versions of dependencies?
I imagine at a minimum the current rules don't handle multiple versions of a crate in transitive dependencies.
Issue https://github.com/bazelbuild/rules_rust/pull/47 fixes 'multiple versions of the same crate'. With that and cargo-raze, it is possible to perform resolution for a single Cargo.toml without much issue.
I don't have a cohesive reasoning for going down the "alternative resolution strategy" track except that:
1) I'd like to avoid multiple versions of dependencies to reduce "found X.Struct, expected a different X.Struct" errors, a problem I expect to be compounded in an environment where your whole org uses what is effectively a single Cargo.toml.
2) I'd like to be able to test the vendored dependencies (to make sure they function at all!). This is something I can't do with Cargo's default dependency resolution, as it doesn't include dev_dependencies.
3) Some crates I develop or work on will not work at all if they're brought in more than once. For example, I have a global-flag library that does linker-level hacks that won't work right if it's brought in twice (or more).
Not a great set of reasons. If you feel that your use case would directly benefit by polishing what already exists, I can prioritize that. I have a repo that demonstrates auto vendoring and BUILD generation here: cargo-raze-examples.
EDIT: s/hobby project/repo: I was going to link next_space_coop, but ended up linking the more up to date cargo-raze-examples.
I'll have to play with raze-one-toml before I ask you to do anything.
Is [2] addressed by any Rust issues? It seems pretty important to be able to test the vendored crates.
TL;DR: No, there are no issues to my knowledge regarding locking dev_dependencies.
I believe that locking dev_dependencies would be considered a "breaking change", as it would change the existing output of resolution passes, and make some arrangements of crates that currently resolve become unresolvable.
@acmcarther raze solution works great so far. I wasn't using the forked rules (since I'm currently using my own fork), but haven't needed multiple versions of crates just yet.
I was a little surprised that `rand` brought in `fuchsia`, but that makes some amount of sense and isn't too bothersome.
In general I think raze is a pretty reasonable way to do the cargo integration.
I think with https://github.com/bazelbuild/rules_rust/pull/505 seemingly around the corner, it'd be worth restarting conversations here. It'd be fantastic to have these rules fully integrated with rust-analyzer and to have a way to use crate registries, so users can begin to take better advantage of the Rust ecosystem.
I'm not sure what the best approach would be for a `cargo_crate` repository rule, but I've started wondering if it'd be beneficial to move cargo-raze into these rules and begin to streamline its use, hopefully getting to a point where it no longer requires explicit human action. Maybe there's a decent path forward there?
Or perhaps when https://github.com/bazelbuild/bazel-gazelle/issues/938 is complete, it'd be possible to do something elegant with `gazelle`?
Very interested in hearing more thoughts 😄
At Apple, we put together an implementation we've called rules_rust_external (after `rules_jvm_external`). We'd love to contribute it as a potential base for this work.
We put it together quite quickly, so it has a few rough edges implementation-wise, but we think it has a decent UX, and we've been using it in anger for multiple non-trivial projects, including with dependencies on things like openssl, proc_macros, build scripts, and protobuf generation, on both OSX and Linux (we've not tried Windows, but have no reason to believe it wouldn't work).
We're not super attached to any aspects of it - names/structure/code/etc can change as needed, but we'd love to work together!
(I've just pushed the code to a branch - instructions for how to try it out are at the bottom of this comment)
In your WORKSPACE file you add a `crate_universe` call. This can point at zero or more Cargo.toml files and contain zero or more explicit crates to pull in, as well as allowing some customisation and overriding.
Simple examples:
Using one `Cargo.toml`:
crate_universe(
name = "rust_deps",
cargo_toml_files = ["//:Cargo.toml"],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
Using one `Cargo.toml` and an additional crate, and which uses a lockfile to skip re-resolving if nothing has changed:
crate_universe(
name = "basic_rust_deps",
cargo_toml_files = ["//:Cargo.toml"],
lockfile = "//:lockfile.lock",
packages = [
crate.spec(
name = "libc",
semver = "=0.2.76",
),
],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
Just one standalone crate:
crate_universe(
name = "basic_rust_deps",
packages = [
crate.spec(
name = "lazy_static",
semver = "1.4",
),
],
supported_targets = [
"x86_64-apple-darwin",
"x86_64-unknown-linux-gnu",
],
)
This creates a repository named after the `crate_universe`, e.g. `rust_deps` or `basic_rust_deps` in those examples. Then in a BUILD file you can write something like:
Pull in individual crates as deps:
load("@basic_rust_deps//:defs.bzl", "crate")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = [
crate("lazy_static"),
],
)
Pull in all crates as deps:
load("@basic_rust_deps//:defs.bzl", "all_deps", "all_proc_macro_deps")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = all_deps(),
proc_macro_deps = all_proc_macro_deps(),
)
Pull in all crates from a specific `Cargo.toml`:
load("@basic_rust_deps//:defs.bzl", "crates_from")
rust_binary(
name = "basic",
srcs = ["src/main.rs"],
edition = "2018",
deps = crates_from(":Cargo.toml"),
)
We have a repository rule which:
1. Parses the attrs from the `crate_universe` rule.
2. Calls `cargo` (using the `cargo` from your `rules_rust`-configured toolchain) via `cargo-raze` (as a library) to fetch assorted metadata.
3. Generates BUILD files similar to those `raze` would generate - we opt in a bunch of defaults to make most crates Just Work out of the box, e.g. enabling buildrs by default, adding more files as default data deps, etc.

We've got a few pieces we'd like to improve:
1. If you specify a lockfile for a `crate_universe`, and a referenced `Cargo.toml` or config in the WORKSPACE is changed, we hard error and tell you to re-generate the lockfile. Right now, this lockfile is keyed on all of the input, and its contents are the entire generated repository. Ideally, its contents would be only the result of the resolve done by `cargo`, as that is the slow and non-deterministic thing, rather than all of the generated output. Also, ideally this would interoperate nicely with `Cargo.lock` files, but that needs a bit more thought.
2. The resolver is in `rules_rust_external`, and we bootstrap in consumer repos by downloading a pre-built version of the binary which we publish for each of the platforms we care about. All of this is a little untidy...
3. Most of the templates use `tera`, but a few of them are hard-coded string formats.
4. Some crates need manual overrides (see `rules_rust_external/examples/has_aliased_deps/openssl.BUILD` for an example). There is no easy way to share these in Bazel today, although the Bazel External Dependencies design may help here.

To try it out: check out this branch, `cd` into `rules_rust_external/resolver` and run `cargo build`, then export an env var: `RULES_RUST_RESOLVER_URL_OVERRIDE=file:///path/to/resolver`.
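The lockfile behaviour described above (keyed on all of the inputs, hard-erroring when any of them change) can be sketched roughly as follows. The helper names here are hypothetical, not the actual implementation:

```python
import hashlib
import json

def inputs_digest(cargo_tomls: dict, config: dict) -> str:
    """Digest of everything that feeds the resolve: referenced Cargo.toml
    contents plus the crate_universe config. Serialized with sorted keys
    so the digest is deterministic."""
    blob = json.dumps(
        {"tomls": cargo_tomls, "config": config}, sort_keys=True
    )
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()

def lockfile_is_current(lockfile: dict, cargo_tomls: dict, config: dict) -> bool:
    """True if the stored digest matches the current inputs; otherwise the
    rule would hard-error and ask the user to re-generate the lockfile."""
    return lockfile.get("digest") == inputs_digest(cargo_tomls, config)
```

The improvement discussed above would shrink what the digest covers: key it only on the resolve inputs/outputs rather than the entire generated repository.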
There's a set of examples in `rules_rust_external/examples` - you should be able to `bazel build` and `bazel test` in those directories (each of which is its own WORKSPACE). Each one shows something different: `basic` is a hello world, `uses_sys_crate` shows native compilation working in build scripts and uses a lockfile, `uses_proc_macro` uses proc macros directly, and `has_aliased_deps` has a complex web of transitive native deps including `openssl` built with `rules_foreign_cc`, `openssh`, and others, some of which are renamed.
I've put this together as a PR into `rules_rust`, but it would probably be better placed as its own repository in `bazelbuild`, like `rules_jvm_external` is :)
This work builds on the amazing work already put into `cargo-raze`, and of course `rules_rust`, by a fantastic community - thanks for making such amazing software and being such a friendly, welcoming community!
/cc @dfreese @acmcarther @mfarrugi @smklein @damienmg
@illicitonion You're an absolute legend!! 🙏 This looks awesome!!
Can you elaborate on why you think this should be a separate repo? Not strongly in favor one way or another. Just wanna hear your thoughts
I don't have any strong reasons, but my general thinking is:
1. The crates stuff is a fairly clean layer atop `rules_rust`, and could reasonably have multiple implementations with different opinions (for instance, `cargo-raze` today by default takes an "Avoid executing arbitrary code as part of your build" approach whereas `rules_rust_external` today by default takes an "Avoid manual intervention" approach), and so keeping it separate feels like it makes way for others with other opinions to do their own thing in their own space.
2. I'd like `rules_rust` to not be in the business of producing binaries that need downloading. In particular, the release cadence is likely to be different - `rules_rust_external` would want actual releases, and they wouldn't necessarily align with `rules_rust`. (That said, maybe they should - for sure breaking changes in `rules_rust` are likely to need a corresponding change in `rules_rust_external` - maybe really what I'm saying is: We should work out `rules_rust`'s release strategy before deciding...).
3. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.

By way of precedent, `rules_jvm_external` is separate from `rules_java` (though `rules_java` is a bit weird, being a wrapper around native bazel rules), and `rules_python_external` ended up being merged into `rules_python`.
1. The crates stuff is a fairly clean layer atop `rules_rust`, and could reasonably have multiple implementations with different opinions (for instance, `cargo-raze` today by default takes an "Avoid executing arbitrary code as part of your build" approach whereas `rules_rust_external` today by default takes an "Avoid manual intervention" approach), and so keeping it separate feels like it makes way for others with other opinions to do their own thing in their own space.
I don't think `cargo-raze` necessarily takes an "Avoid executing arbitrary code as part of your build" approach. The only component of raze that comes to mind is `default_gen_buildrs`, which, per the conclusion in https://github.com/bazelbuild/rules_rust/issues/490, I believe should change. I have a branch to start this transition here (https://github.com/google/cargo-raze/pull/303). I otherwise feel that raze can easily be used in contexts where the focus is "avoid manual intervention".
2\. In particular, the release cadence is likely to be different - `rules_rust_external` would want actual releases, and they wouldn't necessarily align with `rules_rust`. (That said, maybe they should - for sure breaking changes in `rules_rust` are likely to need a corresponding change in `rules_rust_external` - maybe really what I'm saying is: We should work out `rules_rust`'s release strategy before deciding...).
100% agree 😄 https://github.com/bazelbuild/rules_rust/issues/415
3\. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.
Good point. I wonder if it'd be possible to add Bazel support (https://github.com/google/cargo-raze/issues/61) to `cargo-raze` and move that repo to `rules_rust_external`, where your changes can be built on top of it (since at a glance it looks like it could be, especially since you're using `cargo_raze` as a library).
Not sure why the quotes got all funny in https://github.com/bazelbuild/rules_rust/issues/2#issuecomment-738933976
Anywho, super excited about `rules_rust_external`, and it's definitely shifted my perspective on some changes I'd planned for `cargo-raze`. Again, big thanks for all your work here @illicitonion 🙏
@illicitonion Does `rules_rust_external` support registries other than crates.io?
3\. Ownership - I can believe that there may be separate groups of people with separate focuses and experience who may want to be committers to each project, and it's not obvious we should couple those rights and responsibilities.
Good point. I wonder if it'd be possible to add Bazel support (google/cargo-raze#61) to `cargo-raze` and move that repo to `rules_rust_external`, where your changes can be built on top of it (since at a glance it looks like it could be, especially since you're using `cargo_raze` as a library).
Though the BUILD files aren't checked in in this code-dump, we do have `rules_rust_external` building with bazel on both Linux and Mac, including statically linking openssl, so we have sufficient generated BUILD files to build `cargo-raze`. So... This is plausible, but it brings us back to the bootstrapping question :)
@illicitonion Does rules_rust_external support registries other than crates.io?
We've tested that you can use mirrors for crate downloads, but we haven't tested alternative registries. My assumption is that the resolution code will work (because it delegates to `cargo-metadata`), but we do assume that all crates are downloaded from one place. So if you have a single alternate registry, things will likely work; but if you use multiple registries, we'd probably need to turn our `repository_template` param into a `registry_fetch_template` dict, and track which registry each crate comes from. I expect this to be relatively trivial, but I haven't worked with alternate registries enough to be confident in that claim.
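The suggested `registry_fetch_template` dict might look roughly like this. This is a hypothetical sketch of the idea, not the actual rules' API; the function name and template format are illustrative only:

```python
# Fallback when a crate's registry has no template configured; this URL
# pattern matches the crates.io download endpoint seen earlier in the
# thread (/api/v1/crates/{crate}/{version}/download).
DEFAULT_TEMPLATE = "https://crates.io/api/v1/crates/{crate}/{version}/download"

def fetch_url(registry_fetch_template: dict, crate: str, version: str,
              registry: str = "crates-io") -> str:
    """Resolve the download URL for a crate, keyed by which registry the
    resolver recorded it as coming from."""
    template = registry_fetch_template.get(registry, DEFAULT_TEMPLATE)
    return template.format(crate=crate, version=version)
```

Tracking the source registry per crate is the other half: the resolver would record it alongside name and version so the repository rule knows which template to apply.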
This may also help solve some bootstrap problems in #421, so looking forward to the progress!
@illicitonion would you be able to update the bindgen, proto, and wasm-bindgen rules on your branch to use `crate_universe`, to better demonstrate how this future would look?
To bootstrap cargo-raze into a bazel build, we need to commit generated BUILD files once, and then subsequent raze invocations can be done from the bazel binary. Am I missing anything? I don't recall why this wasn't done, so maybe it's not so simple.
I don't think I understand what you're saying here, can you elaborate? 😅
What BUILD files are you generating and committing?
A cargo build of cargo-raze should be able to generate the BUILD files for future builds of cargo-raze.
Ah yeah, that's correct. I was planning on setting that up this weekend in that project, but am waiting on some other PRs to go through (some issues with TravisCI mean something needs to be done about CI, and moving to building with Bazel seems to make sense).
@illicitonion would you be able to update the bindgen, proto, and wasm-bindgen rules on your branch to use `crate_universe`? To better demonstrate how this future would look?
I realistically don't have enough knowledge of any of those pieces to put together something meaningful here... Can you point me at an example today where you have to do something you don't like, which you think this could help with?
I don't think you really need any knowledge of those sections of the rules. Each demonstrates some unique behavior or interaction that is currently solved by cargo-raze. If you have an understanding of cargo-raze, then I think you have all you need to try to replace the use of `cargo-raze` with `crate_universe` there.
The one I'm most interested in is `rust_wasm_bindgen`.
I think as of https://github.com/google/cargo-raze/commit/0d7637108fd128984ab5e06dcab886922a9ffe9b (release 0.9.0), the functionality missing from the library is now there, unblocking some of the exploration needed for `crate_universe`.
I just pushed an additional commit to the branch which pulls in the latest cargo-raze, both fixing some issues and meaning we're no longer using a fork :)
Disclaimer: I have very little experience with raze and in general with the repository rules that fetch things from package managers, like rules_jvm_external. I also see that a lot of thought went into this, so I'm happy to approve the PR if you tell me it solves all the problems.
Some questions from an uninformed reader:
1. How are build.rs scripts handled? Is there a way to override their output/override the Rust targets generated?
2. Are build.rs scripts compatible with remote execution (for example, my remote execution worker has different system packages installed) and cross-compilation (for example, I want to build a Linux binary from my Mac)?
3. Is the approach compatible with rust-analyzer? Can we hope to have a way to automatically generate rust-analyzer config for the entire workspace?
4. Does anybody know what Buck does? I believe they have a similar solution; it might be useful to join forces/get inspiration from them.
5. Cargo is generally fine with multiple versions of a single crate in the build. Is that something we want to follow, or do we want to be more opinionated? What's the error message in the latter case?
- How are build.rs scripts handled? Is there a way to override their output/override the Rust targets generated?

There is already a cargo_build_script target in rules_rust, with a custom environment runner which re-implements much of Cargo's build script handling. This ruleset generates these targets, and so supports whatever is supported by that rule.
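As an illustration (a minimal sketch, not the exact output of crate_universe or cargo-raze; target names and load paths are assumptions), a generated package for a crate with a build.rs might look like:

```starlark
# Hypothetical generated BUILD file for a crate that ships a build.rs.
# Names and attributes are illustrative only.
load("@rules_rust//cargo:defs.bzl", "cargo_build_script")
load("@rules_rust//rust:defs.bzl", "rust_library")

# Compile and run the crate's build.rs under Bazel's sandbox;
# its generated sources/flags are surfaced to dependents.
cargo_build_script(
    name = "build_script",
    srcs = ["build.rs"],
    edition = "2018",
)

rust_library(
    name = "some_crate",
    srcs = glob(["src/**/*.rs"]),
    edition = "2018",
    deps = [":build_script"],
)
```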
- Are build.rs scripts compatible with remote execution (for example my remote execution worker has different system packages installed)
This is in the control of the rule author. Our private RBE deployment has a very minimal remote execution image, and we build system-like packages either using support for compiling them in -sys crates, or using things like rules_foreign_cc. We then put the outputs in places that build.rs will find them (generally by setting an env var) - you can see an example in https://github.com/bazelbuild/rules_rust/blob/main/examples/complex_sys/raze/remote/BUILD.openssl-sys-0.9.60.bazel where we set OPENSSL_DIR to point at where openssl was built with rules_foreign_cc.
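Modeled loosely on that linked BUILD file, the pattern is roughly the following (a sketch; the @openssl label and attribute values are assumptions, not the file's exact contents):

```starlark
# Sketch: steer openssl-sys's build.rs toward an openssl built by
# rules_foreign_cc instead of a system-installed copy.
cargo_build_script(
    name = "openssl_sys_build_script",
    srcs = ["build/main.rs"],
    build_script_env = {
        # build.rs checks OPENSSL_DIR before probing the system.
        "OPENSSL_DIR": "$(execpath @openssl//:openssl)",
    },
    # Make the built openssl available to the script at run time.
    data = ["@openssl//:openssl"],
)
```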
If folks want to have some run-time environment installed on their remote execution systems and use that, it's up to them to make sure it's compatible with wherever they're going to be deploying.
In practice, we've used the minimal environment approach to build things like brotli, bzip2, fsevent, inotify, libgit2, openssh, openssl, ring, ...
and cross compilation (for example I want to build a linux binary from my mac)?
We haven't experimented with this, but because everything uses platforms and toolchains, things should work properly, as long as the toolchains are properly configured. But I can't vouch for it personally through experience :)
- Is the approach compatible with rust analyzer? Can we hope to have a way to automatically generate rust analyzer config for the entire workspace?
We haven't tried, but it should be - there's no magic here, it's just generating rust_library/cargo_build_script invocations. We may discover some edge-cases around rust-analyzer support assuming files will be found inside the main workspace and needing to handle/add ../external in some paths somewhere, but there shouldn't be anything fundamentally hard here :)
- Does anybody know what Buck does? I believe they have a similar solution, but it might be useful to join forces/get inspiration from them?
I don't know - https://buck.build/ only mentions rust_{binary,library,test} rules, and a way of pulling in a .rlib. I know I've seen talk on internals.rust-lang.org of consuming the output of cargo metadata or similar (as cargo-raze does), but I can't find any documentation on the website or references in Buck's codebase.
It's worth noting, though, that this branch really just adds a wrapper around cargo-raze to execute as a repository rule - any sharing should probably be done at the cargo-raze library layer :)
- Cargo will in general be fine with multiple versions of a single crate in the build, is that something we want to follow or we want to be more opinionated? What's the error message in the latter case?
Right now, within each universe, the implementation allows for one version per cargo-style configuration, so you can have independent versions in normal/build/dev deps, but within each configuration your version requirements need to be compatible and exactly one will be chosen. You can theoretically create multiple universes each of which contains a different version (though right now it's possible this will lead to naming clashes for the repositories we generate, but that's easily fixed) by doing something like:
crate_universe(
    name = "rust_deps_old_libc",
    packages = [
        crate.spec(
            name = "libc",
            semver = "=0.2.76",
        ),
    ],
)

crate_universe(
    name = "rust_deps_new_libc",
    packages = [
        crate.spec(
            name = "libc",
            semver = "=0.2.86",
        ),
    ],
)
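Assuming each universe exposes its crates through its own generated repository (the exact label layout here is an assumption, not confirmed behavior), a consumer could then pick a libc version per target:

```starlark
# Hypothetical consumption of the two universes above; the
# "@rust_deps_old_libc//:libc" label form is an assumption.
rust_library(
    name = "needs_old_libc",
    srcs = ["src/lib.rs"],
    deps = ["@rust_deps_old_libc//:libc"],
)

rust_library(
    name = "needs_new_libc",
    srcs = ["src/lib.rs"],
    deps = ["@rust_deps_new_libc//:libc"],
)
```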
I'm happy to approve the PR if you tell me it solves all the problems.
🎉 Thanks!
I think the big issue we need to resolve before PRing is how to bootstrap. We need a compiled Rust binary, which has many dependencies, in order to invoke our repository rule. Internally, we've been building one on CI as a merge hook, and having a function in the workspace create an http_file for each operating system we build for. Something like this could work in rules_rust, but would require us to have somewhere to publish to, and need some machinery to do the build and publish.
Alternatively, as we've been talking about releases elsewhere, we could add "build binaries and put them in a tar" to the release process, but including per-platform compiled binaries would make it a sizeable tar...
Thoughts very welcome here!
Once we've worked that out, I'll put together a PR so we can do a proper review. It's a bit of a code-dump and, as noted above, there are a few areas with hacks in place, but it'd be great if we could tidy those up after the fact in small targeted reviews, rather than trying to get everything perfect up-front, if that's ok with folks!
Something like this could work in rules_rust, but would require us to have somewhere to publish to, and need some machinery to do the build and publish.
I believe other rules like rules_docker (maybe rules_go and bazel-gazelle too?) build and publish binaries to https://storage.googleapis.com; hopefully we can set up something similar for the rust_external resolver:
if "go_puller_darwin" not in excludes:
    http_file(
        name = "go_puller_darwin",
        executable = True,
        sha256 = "4855c4f5927f8fb0f885510ab3e2a166d5fa7cde765fbe9aec97dc6b2761bb22",
        urls = [("https://storage.googleapis.com/rules_docker/" + RULES_DOCKER_GO_BINARY_RELEASE + "/puller-darwin-amd64")],
    )
I think the CI config is:
# Step: push the puller release binary for Linux AMD64
- name: "gcr.io/cloud-builders/gsutil"
  args:
    - "cp"
    - "-a"
    - "public-read"
    - "bazel-bin/container/go/cmd/puller/puller_/puller"
    - "gs://rules_docker/$COMMIT_SHA/puller-linux-amd64"
  id: "push-puller-linux-amd64"
  waitFor: ["build-linux-amd64"]
I'd love to see some of the built-in rules get converted to use crate_universe in a branch before merging. I think folks are generally familiar with at least one of the unique rules in rules_rust that rely on external crates (rust_wasm_bindgen, rust_proto_library, etc...) and it'd be much easier to understand what's going on by looking at a 1-1 mapping of how this is used. I'm personally most interested in rust_wasm_bindgen and the complex_sys example since I think they exercise enough common functionality.
Edit: Actually, the has_aliased_deps example covers what I wanted to see from complex_sys.
Though that said, I'd love to be able to utilize the existing Cargo.lock file and not have to generate some other .bzl file. @illicitonion and I discussed this a bit and I need to do some more digging myself, but I have a feeling that might be possible and a realistic thing to do. The concern with using the lockfile is about regeneration, but I think if users add the Cargo.lock file as a dependency to the repository rule, and the rule returns a dict of its attributes, then it'll only get re-run when the lockfile changes. This might not be the case though, and maybe a sha256 is also required? Can repository rules detect changes to their label attributes? Maybe @hlopko might know?
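For reference, the mechanism in question can be sketched as follows (a hypothetical rule, not the branch's actual implementation): reading a label attribute through repository_ctx registers a dependency on that file, so Bazel re-runs the rule when the file's contents change.

```starlark
# Hypothetical sketch of a repository rule keyed off Cargo.lock.
# repository_ctx.read(label) creates a dependency on the file, so the
# rule is re-executed whenever the lockfile changes.
def _crates_impl(repository_ctx):
    lockfile_contents = repository_ctx.read(repository_ctx.attr.lockfile)
    # ... resolve crates from lockfile_contents and generate BUILD files ...
    repository_ctx.file("BUILD.bazel", "")

crates_repository = repository_rule(
    implementation = _crates_impl,
    attrs = {
        "lockfile": attr.label(allow_single_file = True),
    },
)
```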
Though that said, I'd love to be able to utilize the existing Cargo.lock file and not have to generate some other .bzl file.
Not generating a .bzl file is one of the improvements we agreed on wanting as well, so 👍 to that.
From our point of view it would be nice to land this if it's useful, experimentally maybe, so we can iterate here instead of maintaining this as a branch and also internally.
I think that's fine then. My only asks would be to have detailed documentation on the Starlark and Rust code that went into this, and to move the rule into cargo/universe or something.
I'd love to see some of the built-in rules get converted to use crate_universe in a branch before merging. I think folks are generally familiar with at least one of the unique rules in rules_rust that rely on external crates (rust_wasm_bindgen, rust_proto_library, etc...) and it'd be much easier to understand what's going on by looking at a 1-1 mapping of how this is used. I'm personally most interested in rust_wasm_bindgen and the complex_sys example since I think they exercise enough common functionality.
I just updated examples/complex_sys to the HEAD of https://github.com/illicitonion/rules_rust/tree/rules_rust_external :)
I think it's time to open a PR for this then 😄 So we can appropriately comment on the changes.
Add a cargo_crate repository rule for fetching Rust crates from crates.io.