facebookincubator / reindeer

Reindeer is a tool to transform Rust Cargo dependencies into generated Buck build rules
MIT License
184 stars 30 forks source link

some confusion with the latest reindeer #11

Closed steveklabnik closed 1 year ago

steveklabnik commented 1 year ago

I was trying out the latest reindeer, and I ran into something very confusing. I have a basic project, I'd like it to have a dependency on the semver crate. So I made third-party\Cargo.toml with this:

[workspace]

[package]
name = "rust-third-party"
version = "0.0.0"
publish = false
edition = "2021"

# Dummy target to keep Cargo happy
[[bin]]
name = "fake"
path = "/dev/null"

[dependencies]
semver = "1.0.17"

Since my intention was to not use vendoring, I created a reindeer.toml:

vendor = false

I didn't run reindeer vendor, but instead, ran reindeer buckify:

〉reindeer buckify
[ERROR reindeer] Failed to load C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party\Cargo.lock

    Caused by:
        The system cannot find the file specified. (os error 2)

I guess this makes sense, but sure, I can also generate a lockfile easily:

〉cargo generate-lockfile

This produces a very sensible Cargo.lock:

# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3

[[package]]
name = "rust-third-party"
version = "0.0.0"
dependencies = [
 "semver",
]

[[package]]
name = "semver"
version = "1.0.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bebd363326d05ec3e2f532ab7660680f3b02130d780c299bca73469d521bc0ed"

But reindeer isn't happy:

〉reindeer buckify
[ERROR reindeer] parsing metadata

    Caused by:
        0: running cargo
        1: cargo metadata --format-version 1 --frozen --locked --offline --manifest-path C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party\Cargo.toml failed:
           error: failed to get `semver` as a dependency of package `rust-third-party v0.0.0 (C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party)`

           Caused by:
             failed to load source for dependency `semver`

           Caused by:
             Unable to update registry `crates-io`

           Caused by:
             failed to update replaced source registry `crates-io`

           Caused by:
             failed to read root of directory source: C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party\vendor

           Caused by:
             The system cannot find the path specified. (os error 3)

Oh, it still wants a vendor directory. I'll make one:

〉mkdir vendor
〉reindeer buckify
[ERROR reindeer] parsing metadata

    Caused by:
        0: running cargo
        1: cargo metadata --format-version 1 --frozen --locked --offline --manifest-path C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party\Cargo.toml failed:
           error: no matching package named `semver` found
           location searched: registry `crates-io`
           required by package `rust-third-party v0.0.0 (C:\Users\steve\Documents\GitHub\buck-rust-hello\third-party)`
           As a reminder, you're using offline mode (--offline) which can sometimes cause surprising resolution failures, if this error is too confusing you may wish to retry without the offline flag.

So that doesn't make too much sense either. I'm trying not to vendor!

Now what I can do is run reindeer vendor, then reindeer buckify, which does do the right thing. I still have to delete the contents of vendor\semver though, only after a reindeer buckify, not before.

Anyway I guess my point here is, it might make sense for something to set up third-party\.cargo somewhere other than reindeer vendor, given that it's kinda weird to run a command named after a thing I'm trying not to do :) Or also, maybe just explicitly documenting the no-vendoring case, so that it's known that you should do it anyway.

jsgf commented 1 year ago
vendor = false

[vendor] is supposed to be a dict of vendoring options.

I don't think not vendoring is an option, since it needs the sources locally to reference from the BUCK file. I guess in principle it could reach into cargo's dirs/caches to get access, but that seems very fragile.

Since Buck is purely a build tool, it relies on Cargo to do the version/feature dependency resolution process. Vendoring + metadata.json is the product of that which Reindeer consumes along with fixups to generate the BUCK rules.

We have a wrapper script which does vendor+buckify together; the main value of standalone buckify is so you can iterate on the fixups.toml files without having to re-vendor (which is unaffected by fixups).

Why do you want to avoid vendoring?

steveklabnik commented 1 year ago

https://github.com/facebookincubator/reindeer/commit/372b43dc4eba4725d9b1fe1b440b0486a38c5959

I'm using this feature, which does work, to be clear, just the setup for it ends up being weird.

I don't think not vendoring is an option, since it needs the sources locally to reference from the BUCK file.

The above commit makes reindeer generate http_archive calls to fetch the .crate from crates.io, and cache things that way. The commit has a good example, but here's what mine looks like after the above:

alias(
    name = "semver",
    actual = ":semver-1.0.17",
    visibility = ["PUBLIC"],
)

http_archive(
    name = "semver-1.0.17.crate",
    sha256 = "bebd363326d05ec3e2f532ab7660680f3b02130d780c299bca73469d521bc0ed",
    strip_prefix = "semver-1.0.17",
    urls = ["https://crates.io/api/v1/crates/semver/1.0.17/download"],
    visibility = [],
)

rust_library(
    name = "semver-1.0.17",
    srcs = [":semver-1.0.17.crate"],
    crate = "semver",
    crate_root = "semver-1.0.17.crate/src/lib.rs",
    edition = "2018",
    features = [
        "default",
        "std",
    ],
    visibility = [],
)

rust_binary(
    name = "semver-1.0.17-build-script-build",
    srcs = [":semver-1.0.17.crate"],
    crate = "build_script_build",
    crate_root = "semver-1.0.17.crate/build.rs",
    edition = "2018",
    features = [
        "default",
        "std",
    ],
    visibility = [],
)

Since Buck is purely a build tool, it relies on Cargo to do the version/feature dependency resolution process.

Right, that's why you need a Cargo.lock that Cargo generates. Not saying this should change. Just saying that this work is done under a command that's named for some of the other work I don't care to have done :) It's kinda like cargo build in that way actually, there's cargo build but also cargo generate-lockfile, and if you just want to generate a lockfile and not build, you run the latter, not the former. In the same way, I don't want my sources vendored, but I do want Cargo to figure out which ones I need and why, so I don't want to run reindeer vendor, I want to run reindeer make-cargo-figure-stuff-out (name needs workshopping 😆).

Why do you want to avoid vendoring?

It's just not the norm for me. It's also not the default for Cargo, so I feel like doing things this way is less strange to someone coming from Cargo. Basically, it's not a hard requirement, it's just vibes. But vibes matter :)

jsgf commented 1 year ago

Oh right I should really keep up with what David's been doing rather than make authoritative statements based on old info...

But it is definitely not the way we use it internally so it doesn't surprise me that there's rough spots. buckify does lock down cargo since it assumes vendor has done all the network IO that needs doing. So I think there's a legit bug here that buckify should not do that for the non-vendor case.

cc @zertosh @dtolnay

dtolnay commented 1 year ago

After https://github.com/facebookincubator/reindeer/commit/a7ee0d5b592b86a7b76bf3156e50967c0f2c1cc5, it should now be possible to skip vendor entirely.

The original vendor = false option was written for my exact use case in cxx, which is I wanted to ship a ready-to-go third-party/BUCK file where someone could clone the cxx repo and immediately do buck builds in it. They would not need to install reindeer or run cargo vendor or run buckify. For that outcome I didn't care if the crates were vendored on my machine as the person generating the BUCK file. I could just put the vendor directory in .gitignore.

Now, as long as you have Cargo.toml, reindeer.toml, and the right fixups, you can run reindeer buckify without either of Cargo.lock or vendor directory existing. (But if Cargo.lock exists, it will be respected.)

See https://github.com/dtolnay/cxx/commit/b5823078a8363e87ff1924638185027811b22474 which removes the cargo vendor pre-reindeer buckify from cxx.

Depending on what fixups you use, by itself this commit isn't going to be sufficient because there are a bunch of places in reindeer that assume all the manifest directories of all the transitive dependencies are subdirectories of the directory containing reindeer.toml, i.e. in a vendor directory as opposed to ~/.cargo. For example:

https://github.com/facebookincubator/reindeer/blob/a7ee0d5b592b86a7b76bf3156e50967c0f2c1cc5/src/buckify.rs#L289-L292

https://github.com/facebookincubator/reindeer/blob/a7ee0d5b592b86a7b76bf3156e50967c0f2c1cc5/src/fixups.rs#L1005-L1010

so you might encounter wonky targets such as:

rust_library(
    ...
    licenses = ["../../../home/steveklabnik/.cargo/registry/src/index.crates.io-6f17d22bba15001f/owo-colors-3.5.0/LICENSE"],
)

I can try to audit all those but it would be helpful if you point out any cases where this happens to you.

steveklabnik commented 1 year ago

This works great! Thank you!

I can try to audit all those but it would be helpful if you point out any cases where this happens to you.

Absolutely happy to. We'll see what I run into, my toolchain weirdness is still... weird, so I'm mostly in "try random stuff out and see how it goes" mode.