jacksmith15 / bazel-python-demo

A modern Python development set-up for Bazel.

Remote Caching Setup #3

Open jacksmith15 opened 2 years ago

jacksmith15 commented 2 years ago

Remote Caching would be useful, even if only enabled in CI.

Bazel Remote Caching involves simple GET and PUT requests to upload and download file blobs, with some standard path prefixes.
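As a rough sketch of what those requests look like: Bazel's HTTP cache protocol stores action results under `/ac/` and output blobs under `/cas/`, each keyed by a SHA-256 hex digest. The helper names and the `cache.example.com` host below are made up for illustration, and the requests are only constructed, not sent.

```python
import hashlib
import urllib.request


def cas_url(cache_base: str, content: bytes) -> str:
    """URL of a blob in the content-addressable store (keyed by its SHA-256)."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{cache_base.rstrip('/')}/cas/{digest}"


def ac_url(cache_base: str, action_digest: str) -> str:
    """URL of an action-cache entry (keyed by the digest of the action)."""
    return f"{cache_base.rstrip('/')}/ac/{action_digest}"


def put_request(url: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) the upload request for a blob."""
    return urllib.request.Request(url, data=body, method="PUT")


def get_request(url: str) -> urllib.request.Request:
    """Build (but do not send) the download request for a blob."""
    return urllib.request.Request(url, method="GET")
```

Passing either request to `urllib.request.urlopen` would perform the actual transfer against a real cache endpoint.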

Headers can be configured for Auth: https://bazel.build/reference/command-line-reference#flag--remote_header

The S3 REST API is likely a good fit for this. The CI worker can be an AWS user permitted to read from and write to the bucket, and can fetch a temporary security token at the start of the job:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials

This token can then be passed via the header command line argument as above.
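A minimal sketch of how a CI script might assemble those arguments. `--remote_cache` and `--remote_header` are real Bazel flags, and `x-amz-security-token` is the header named in the AWS page linked above; the function name and the bucket URL are hypothetical. Note that S3 normally also expects a signed `Authorization` header on each request, which this sketch does not produce.

```python
from typing import List


def bazel_cache_args(cache_url: str, session_token: str) -> List[str]:
    """Build the Bazel flags that point at the cache and attach the token header."""
    return [
        f"--remote_cache={cache_url}",
        # --remote_header takes a single name=value assignment.
        f"--remote_header=x-amz-security-token={session_token}",
    ]
```

In CI this could be used as `subprocess.run(["bazel", "build", "//...", *bazel_cache_args(url, token)], check=True)`, with `token` read from wherever the job stores its temporary credentials.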

Developers could also use this approach, but would need to reissue their tokens periodically as they expire.

What are the advantages over using the CircleCI cache?

The local cache accumulates the inputs and outputs of every action that has ever run, so saving and restoring it wholesale in CircleCI gets progressively slower. Bazel's integrated remote caching only fetches and stores the entries relevant to the current build, making it much faster.

Since the remote cache doesn't need to be namespaced on e.g. branch, builds on main should always hit the entries already uploaded by the branch build, avoiding the "build runs twice" inefficiency.
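To illustrate the difference, here is a simplified sketch contrasting a branch-scoped CI cache key with a content-derived one. Both functions are hypothetical stand-ins: the real Bazel action digest covers more than this (environment, platform and so on), and the CI key format is invented for the example.

```python
import hashlib
from typing import List


def branch_scoped_key(branch: str, lockfile: bytes) -> str:
    """CI-style cache key: changes whenever the branch name changes."""
    return f"{branch}-{hashlib.sha256(lockfile).hexdigest()[:16]}"


def content_key(command: str, input_digests: List[str]) -> str:
    """Content-derived key: depends only on the command and its inputs."""
    h = hashlib.sha256()
    h.update(command.encode())
    # Sort so the key does not depend on the order inputs are listed in.
    for digest in sorted(input_digests):
        h.update(digest.encode())
    return h.hexdigest()
```

The same action built on a feature branch and then on main produces the same `content_key`, so the second build can reuse the first build's output, whereas `branch_scoped_key` differs between the two.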

jacksmith15 commented 2 years ago

Attempts at using a CircleCI cache are shown here: https://github.com/jacksmith15/bazel-python-demo/pull/2

Unfortunately this results in obscene storage usage, meaning it's not really a viable option.

jacksmith15 commented 2 years ago

Another S3-based cache solution can be seen here: https://github.com/Asana/bazels3cache

This acts as a proxy server around S3, and implements a few additional features that sound useful:

jacksmith15 commented 1 year ago

See https://github.com/bazelbuild/proposals/blob/main/designs/2022-06-07-bazel-credential-helpers.md

This might be useful for implementing authentication for the remote cache. It is currently available as an experimental feature.
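As far as I can tell from the proposal, a credential helper is an executable that Bazel invokes with the argument `get`, passing a JSON object containing a `uri` on stdin and expecting a JSON object with a `headers` map (header name to list of values) on stdout. A minimal sketch under that assumption; the function names, the default header name and the token source are all hypothetical.

```python
import json
from typing import Dict, List, Tuple


def build_response(request: Dict[str, str], token: str,
                   header: str = "x-amz-security-token") -> Dict[str, Dict[str, List[str]]]:
    """Return the headers to attach to requests for request["uri"]."""
    # A real helper could choose different credentials per URI; this one does not.
    return {"headers": {header: [token]}}


def handle(argv: List[str], stdin_text: str, token: str) -> Tuple[int, str]:
    """Process one helper invocation; returns (exit code, stdout text)."""
    if len(argv) < 2 or argv[1] != "get":
        return 1, ""
    request = json.loads(stdin_text)
    return 0, json.dumps(build_response(request, token))
```

A wrapper script would call `handle(sys.argv, sys.stdin.read(), token)`, print the returned text and exit with the returned code. The benefit over `--remote_header` is that the helper can fetch a fresh token on each invocation rather than relying on one baked into the command line.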