facebook / buck2

Build system, successor to Buck
https://buck2.build/
Apache License 2.0
3.51k stars, 214 forks

Using only Cargo files for Rust/Cargo rule #290

Open danmx opened 1 year ago

danmx commented 1 year ago

Would it be possible to leverage dynamic dependencies to generate Buck dependencies so Reindeer is no longer needed?

ndmitchell commented 1 year ago

I'm not going to say it's impossible, but I did give it a shot and couldn't figure it out. Because dynamic dependencies are restricted to a single target, each Rust target basically has to become a subtarget of one generic rule, which isn't impossible. You also really want to reuse the existing Rust implementation, which means running it through anon_targets; again, not impossible. But when I started trying I failed, partly because anon_targets wasn't complete enough. Given there are improvements coming in that area, I was going to take another stab at it afterwards.

But I didn't see any fundamental reason not to do it this way, and I think it would be quite elegant. Feel free to give it a go!

cjhopman commented 1 year ago

@ndmitchell were you sort of looking at this for writing rust rules in general?

For the use case of Reindeer, where it's used to manage third-party dependencies, I think it should be fairly easy to write something like a dynamic_reindeer() rule that requires you to point it at a Cargo.toml file and explicitly list the names of all the crates you want to expose, and that then provides a subtarget for each of those crates. It could maybe be implemented via dynamic_output and running Reindeer at build time, but I haven't looked at Reindeer's implementation much. It could also feasibly be implemented by using Cargo to build, but I don't know.
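To make the "explicitly list the crates to expose" idea concrete, here is a minimal Python sketch of the selection step. It assumes the resolved graph comes from `cargo metadata --format-version 1`, whose JSON has a top-level `packages` list with `name` and `version` fields. The function name `select_exposed_crates` and the sample data are hypothetical and are not part of Reindeer or Buck2.

```python
import json


def select_exposed_crates(metadata: dict, exposed: list[str]) -> dict[str, list[str]]:
    """Map each requested crate name to the versions Cargo resolved for it."""
    wanted = set(exposed)
    versions: dict[str, list[str]] = {name: [] for name in exposed}
    for pkg in metadata["packages"]:
        if pkg["name"] in wanted:
            versions[pkg["name"]].append(pkg["version"])
    # Fail loudly if the user listed a crate that Cargo did not resolve.
    missing = sorted(name for name, found in versions.items() if not found)
    if missing:
        raise KeyError(f"crates not found in the resolved graph: {missing}")
    return versions


# Trimmed, made-up stand-in for the JSON printed by `cargo metadata`.
sample = json.loads("""
{"packages": [
  {"name": "alpha", "version": "1.2.0"},
  {"name": "beta", "version": "0.3.1"},
  {"name": "beta", "version": "0.4.0"}
]}
""")

exposed = select_exposed_crates(sample, ["alpha", "beta"])
```

A crate can appear at several versions in one graph, which is why each name maps to a list; a real rule would need some policy for naming those subtargets.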

djc commented 1 year ago

I'm also interested in using Cargo.toml/Cargo.lock files to generate BUCK files from. I have a feeling Reindeer supports some of this but also still requires having more of a custom environment where I move all of my crates.io dependencies into a separate third-party Cargo manifest. I think a nice workflow (that might be easier to achieve than something fully dynamic?) would be to have tooling that generates a complete Buck environment from a set of Cargo.toml and Cargo.lock files, for now requiring the user to regenerate this environment manually (bonus points for a check command that lets CI check that the BUCK config is up to date with the Cargo input files). Also a simplifying assumption might be that workspace dependencies are used for all non-repo-local dependencies.
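The check command for CI could be as simple as comparing a digest of the Cargo input files against one recorded at generation time. The following Python sketch shows that idea only; the stamp file name `cargo-inputs.sha256` and both function names are hypothetical, and nothing here is an existing Reindeer or Buck2 feature.

```python
import hashlib
from pathlib import Path

# Hypothetical file that the generator would write next to the BUCK output.
STAMP_NAME = "cargo-inputs.sha256"


def cargo_inputs_digest(root: Path) -> str:
    """Hash every Cargo.toml and Cargo.lock under root in a stable order."""
    hasher = hashlib.sha256()
    inputs = sorted(
        p for p in root.rglob("Cargo.*") if p.name in ("Cargo.toml", "Cargo.lock")
    )
    for path in inputs:
        # Include the relative path so that moving a manifest changes the digest.
        hasher.update(path.relative_to(root).as_posix().encode())
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()


def is_up_to_date(root: Path) -> bool:
    """Return True if the recorded digest matches the current Cargo inputs."""
    stamp = root / STAMP_NAME
    return stamp.is_file() and stamp.read_text().strip() == cargo_inputs_digest(root)
```

The generator would write `cargo_inputs_digest(root)` into the stamp file after producing the BUCK files, and CI would fail the build whenever `is_up_to_date(root)` returns False.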

I'm also wondering if it would be possible to rely on the default location of downloaded crate data -- given that we have hashes in Cargo.lock it feels like we could use the normal user dir for this stuff, instead of replicating a .cargo/registry directory like Reindeer is currently doing.

I'd be open to writing code to get this working, but would likely need some help to bootstrap the MVP of this. At work we're currently suffering from "long" build times (about an hour to build production Docker images, many of which really just contain a single Rust binary), with about 750 crates pulled from crates.io. (I'm sure this is small by Meta standards, but would still be great to improve on the current situation.)

Is there some chat venue that Buck2 people hang out in? Discord/Zulip/Matrix? @steveklabnik you mentioned a Discord in https://steveklabnik.com/writing/using-cratesio-with-buck but the invite there seems to have expired -- anything happening there?

Playing with reindeer buckify a bit more, it looks like it already does a bunch of this stuff, but maybe the main sticking point is build scripts? It looks like it tries to do fixups for scenarios that it already understands. Would it be possible to have some fallback where the build script is executed in a sandboxed environment to get the last mile done? And how does Meta deal with this for crates like serde_json, serde and thiserror, which seem highly likely to be used?

steveklabnik commented 1 year ago

> @steveklabnik you mentioned a Discord in https://steveklabnik.com/writing/using-cratesio-with-buck but the invite there seems to have expired

Whoops! Fixed in https://github.com/steveklabnik/steveklabnik.com/commit/ff43c3d0d9f26ab0b3167da68b1a3ceaf8236a9f

> anything happening there?

It's been a bit more quiet lately, but I imagine that invite expiring is at least part of it, ha! There's about 90 people, but it's fairly low traffic.

> It looks like it tries to do fixups for scenarios that it understands already, would it be possible to have some fallback thing where the build script is executed in a sandboxed environment to get the last mile done?

You can in fact have it run build scripts. Fixups are for skipping that for scenarios where the build script is only used to replicate some feature that's simpler without running the script. For an example, check out the cxx BUCK file's entry for proc-macro2: https://github.com/dtolnay/cxx/blob/master/third-party/BUCK#L213-L241

fhilgers commented 4 months ago

I tried implementing a simple PoC where I just download archives, with each URL and hash specified in a JSON file. This is possible with dynamic dependencies, but only with actions.download_file. I could not use an anonymous http_archive target (which I suppose is the only way to reuse rule implementations in other rules). Reusing just the inner rule implementation is also not possible, because one cannot modify the ctx with values received in the dynamic_output lambda.

Is there a special reason why anon_targets cannot be created in a dynamic_output lambda, and why an anon_target cannot call rules that have dynamic outputs? Is either of these already planned to be implemented? Is it even possible?