Open ssssam opened 2 years ago
This could certainly be interesting if either no big workers are available in a build farm or if the project could benefit from more fine-grained caching. The idea of using recc with nested remote execution has been around for a while but nobody has attempted to implement it yet, as far as I know.
I think this would mainly be a feature of the REAPI server/worker side. E.g. BuildGrid/BuildBox could support a platform property that allows network access to the CAS and Execution services and sets environment variables to communicate the URIs. It might make sense to extend buildbox-casd to work as an Execution service proxy, making it possible to restrict the sandbox hole to the worker's local buildbox-casd.
BuildStream would simply need to support setting that platform property (possibly in a generic way, see #1313). BuildStream projects could then include recc in their sandbox and configure it using the environment variables set by the worker.
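A rough sketch of how the two halves of this proposal could fit together, under stated assumptions. The platform property name and the worker-side variable names are hypothetical; nothing like them exists in BuildGrid, BuildBox or BuildStream today. `RECC_SERVER` and `RECC_CAS_SERVER` are assumed to be the settings recc reads for its execution and CAS endpoints, which should be checked against the recc documentation.

```python
# Hypothetical names: neither the platform property nor the worker-side
# variable names are defined by BuildGrid, BuildBox or BuildStream.
NESTED_REAPI_PROPERTY = "nested-remote-execution"
WORKER_EXEC_VAR = "NESTED_REAPI_EXECUTION_URI"
WORKER_CAS_VAR = "NESTED_REAPI_CAS_URI"


def worker_sandbox_env(platform_properties, proxy_uri):
    """Worker side: extra environment for the sandbox, if the property is set.

    proxy_uri stands for the worker's local buildbox-casd acting as a
    CAS/Execution proxy, as suggested in the comment above.
    """
    if platform_properties.get(NESTED_REAPI_PROPERTY) != "on":
        return {}
    return {WORKER_EXEC_VAR: proxy_uri, WORKER_CAS_VAR: proxy_uri}


def recc_env_from_worker(environ):
    """Sandbox side: translate the worker's variables into recc settings."""
    if WORKER_EXEC_VAR not in environ:
        return {}  # no hole in the sandbox, recc stays unconfigured
    exec_uri = environ[WORKER_EXEC_VAR]
    return {
        # Assumed to be the variables recc reads for its endpoints.
        "RECC_SERVER": exec_uri,
        "RECC_CAS_SERVER": environ.get(WORKER_CAS_VAR, exec_uri),
    }
```

In this sketch the element's build commands would apply the result of `recc_env_from_worker(os.environ)` before invoking recc, so the project never hard-codes an endpoint.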
If recc supported buildbox-casd's `CaptureFiles` method, it could skip hashing of input files when buildbox-fuse is used, as the hash would already be available in the `user.checksum.sha256` xattr, reducing the recc overhead.
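To illustrate why the xattr saves work, here is a minimal sketch that prefers the stored checksum and only hashes the file when the attribute is missing. It reads the xattr directly rather than going through `CaptureFiles`, so it is not the mechanism proposed above, and it assumes the attribute holds the digest as an ASCII hex string.

```python
import hashlib
import os

CHECKSUM_XATTR = "user.checksum.sha256"  # attribute name from the comment above


def file_sha256(path):
    """Return the SHA-256 hex digest of path, preferring the cached xattr."""
    getxattr = getattr(os, "getxattr", None)  # os.getxattr is Linux-only
    if getxattr is not None:
        try:
            # Assumption: the attribute stores the digest as ASCII hex.
            return getxattr(path, CHECKSUM_XATTR).decode("ascii").strip()
        except (OSError, UnicodeDecodeError):
            pass  # attribute missing or unusable: fall back to hashing
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large inputs are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

On a buildbox-fuse mount the first branch would return without reading the file contents; everywhere else the function behaves like a plain SHA-256 helper.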
Interested in this, or at least something like it that can do caching and distributed builds within an element.
For context, I am evaluating BuildStream as a meta build system to migrate an embedded Linux OS for a large company, and we currently have a very large codebase for our core application that runs on our devices. The core application is not componentized like the other items that build alongside it, such as Busybox, OpenSSL or ffmpeg, which only take a minute or so to build from scratch. While the correct solution is to split that large primary application up into separate elements, that wouldn't be possible for some portions of it, and for the others it would be at least challenging and time-consuming when you have a lot of developers within different departments/groups to coordinate with (bureaucracy, am I right?). When actively developing code within that codebase, BuildStream's current (otherwise very excellent) element-level caching and distribution doesn't help much, so a more fine-grained, inner-element caching and distributed build solution would be needed.
Interesting that you opened this issue and I didn't notice.
At ApacheCon, @sstriker and I had a bit of a hackfest and came up with #1772 in order to test this out by building a Bazel project with remote caching enabled. I will push that sample project up shortly; however, it hasn't worked so far, as I was not able to properly configure Bazel to use remote caching.
I've uploaded https://gitlab.com/tristanvb/buildstream-recc-demo which proves our concept of having recc access the cas server from within the sandbox (using the #1772 branch), at least when running buildstream locally.
Will do some cleanup, possibly some performance testing, and then blog on it.
And a summary of our experiment can be found on my blog: https://blogs.gnome.org/tvb/2022/10/14/buildstream-at-apachecon-2022-new-orleans/
@gtristan can you please submit a proposal for an actual integration that allows recc to run? freedesktop-sdk 24.08 and later will provide a recc element.
BuildStream is a tool to execute full "package" builds, including awkward parts like `configure` scripts, stripping, integration commands, in a repeatable way using a sandbox. It can distribute these builds to build farm workers via the REAPI, at a granularity of one element => one worker. This means that a large element like WebKit is only compiled on a single worker, which can be slow.
Recc is a tool to distribute individual C/C++ compile tasks across build farm workers. At present, it cannot be used inside a BuildStream sandbox because it has no way to access the REAPI endpoint, but ... could we allow a controlled opening in the sandbox to enable Recc to be used inside element builds?
The ideal scenario would be a large element building as follows:
1. `bst build` runs the element `configure-commands` on a single worker, with `recc` replacing `cc` etc.
2. `bst build` runs the element `build-commands`:
   a. For each C/C++ compile operation, `recc` hashes the inputs for the given compile operation to produce a cache key, and checks the remote cache via configured REAPI endpoint
   b. If needed, compile is invoked on a remote worker. Then, the object file is downloaded from the cache.
This is not guaranteed to be faster than building on a single node, a lot depends on the infrastructure being used and the element being built. I think it would be worth exploring as part of a larger project to try and reduce build times of large BuildStream projects like gnome-build-meta.