Support for GitHub Action Cache backend?

thoughtpolice commented 3 months ago

Hi,

I'm a user of Nix, which is a distant cousin of tools like Bazel and Buck. I use it for a lot of projects on GitHub, and one of the most important things I want is build caching when using GitHub Actions. Nix caching typically looks something like this:

nix client <---> http proxy <---> s3 storage/file system

The design of the HTTP system is in theory stateless because you only need to transform the input hash into some kind of query key. It's a really simple protocol. However, this requires some kind of out-of-band storage system; I've used things like Cloudflare R2 for this.
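To illustrate the "transform the input hash into a query key" step, here is a minimal Rust sketch. The function name `query_key`, the `prefix` parameter, and the `<prefix>/<hash>.narinfo` layout are assumptions made for illustration, not the behavior of any particular proxy.

```rust
/// Map an input hash to the object key a proxy would look up in storage.
/// The key layout here is a hypothetical example.
/// Returns `None` for anything that is not a plain lowercase
/// alphanumeric hash, so a request cannot escape the prefix.
fn query_key(prefix: &str, input_hash: &str) -> Option<String> {
    let valid = !input_hash.is_empty()
        && input_hash
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit());
    if valid {
        Some(format!("{}/{}.narinfo", prefix, input_hash))
    } else {
        None
    }
}
```

Because the key depends only on the hash, any proxy instance can answer a lookup without local state.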

But there's this really nifty project called magic-nix-cache and an associated action that is much more turnkey. When you build something with GHA, the runner is given access to an API to upload artifacts into an expiring cache. Every project gets 10GiB for free. It's basically just a key-value store. magic-nix-cache instead uses that as the backing store for cache artifacts, so it is always available and can be reused between runs. You can see an example of a GHA action cache here, which is using magic-nix-cache.

The magic-nix-cache-action package is basically a set of steps that wraps all this up: it installs and runs the daemon as part of the workflow, so all you have to do is something like this in your GitHub YAML and you instantly have a full working build with incremental caching:

jobs:
  check:
    runs-on: ubuntu-22.04
    permissions:
      id-token: "write"
      contents: "read"
    steps:
      - uses: actions/checkout@v4
      - uses: DeterminateSystems/nix-installer-action@main
      - uses: DeterminateSystems/magic-nix-cache-action@main
      - run: nix build .

Basically: would it be possible for nativelink to support this workflow? A backend for the CAS and ActionCache interfaces that uses GHA Cache storage? I think this basically requires an idempotent storage layer, because there can be many separate instances running at once, and a stateless query layer, because you want to get a hit immediately after starting the daemon from scratch on a fresh instance. I don't know how nativelink works currently or whether this fits into its storage model.
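A rough sketch of what "idempotent storage plus stateless query" could mean, using an in-memory map as a stand-in for the remote cache. `InMemoryCache` and `content_key` are hypothetical names, not nativelink or GHA cache APIs, and `DefaultHasher` is only a placeholder for a real content digest such as SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a remote key-value cache.
#[derive(Default)]
struct InMemoryCache {
    entries: HashMap<String, Vec<u8>>,
}

impl InMemoryCache {
    /// Store `blob` under a key derived from its content and return the key.
    /// Uploading the same bytes again leaves the store unchanged, so
    /// concurrent instances can write without coordinating.
    fn put(&mut self, blob: &[u8]) -> String {
        let key = content_key(blob);
        self.entries
            .entry(key.clone())
            .or_insert_with(|| blob.to_vec());
        key
    }

    /// Look up a blob by key.
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.entries.get(key).map(|v| v.as_slice())
    }
}

/// The key depends only on the bytes, so a freshly started instance
/// computes the same key without any local index.
/// Placeholder hash for illustration; not suitable for a real CAS.
fn content_key(blob: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    blob.hash(&mut hasher);
    format!("cas-{:016x}", hasher.finish())
}
```

With this shape, a second `put` of the same bytes returns the same key and adds no new entry, and `get` needs nothing beyond the key itself.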

Or is the recommended way to use cache dirs with GHA?

FWIW, the magic-nix-cache source code even includes a gha-cache Rust package for writing to the cache storage https://github.com/DeterminateSystems/magic-nix-cache/tree/main/gha-cache (though it's not published so you'd probably have to vendor it for your Cargo builds.) So you don't have to reinvent that; in particular I think it helps handle retries since the cache API can get rate limited...
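For a sense of what retry handling around a rate-limited API involves, here is a minimal exponential-backoff sketch. `retry_with_backoff` is a hypothetical helper written for this example; it is not the gha-cache crate's API, and I haven't checked how that crate implements retries.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `max_attempts` is reached, doubling the
/// delay after each failure. `op` is always tried at least once.
/// Hypothetical helper for illustration only.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```

A real client would probably also add jitter and respect any retry hints the server returns; this sketch leaves both out.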

Something like this could make it really easy for projects to instantly improve build times. I've been super happy with magic-nix-cache in practice because it's so easy to add caching, and I would love it if I could use this with Buck2 on every platform.

aaronmondal commented 3 months ago

@thoughtpolice I love that you brought this up!

We've had discussions about this for a while now and AFAIK the general consensus is that we'd like to implement this. A big missing piece was something like the gha-cache crate that you pointed out. I wasn't aware of it, and it makes implementing a GHA cache backend much easier.

It's absolutely possible to implement this in NativeLink as e.g. a GitHubActionsStore or GitHubStore, similar to let's say the S3Store or RedisStore. We'll of course need to build the actual GitHub action logic around it as well, but the effort here seems reasonable.

Notes:

- It's probably easier to use the published nativelink container image instead of a raw executable. The images are signed and essentially the same as the raw executable, at ~28 MB built statically against musl. This way we also don't need to require a Nix installation on default runners.
- For macOS this means that we'd actually need arm64-linux images, since Mac containers run in a Linux VM. We're already working on getting such images ready.
- We're heavily using the magic-nix-cache action in NativeLink's CI and it's clear that even with rate limiting, target-based caching is the way to go.

cc @allada @caass

aleksdmladenovic commented 1 month ago

I will work on this one.

cc: @allada , @MarcusSorealheis

TraceMachina / nativelink

Support for GitHub Action Cache backend? #1066