pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0

Add an HTTP-only remote cache, similar to v1 #11149

Open stuhood opened 4 years ago

Eric-Arellano commented 3 years ago

Is this technically feasible, given the nature of v2 caching and the Remote Execution API?

stuhood commented 3 years ago

It is possible. But this cache would not be compatible with remote execution, which would make it very much a split focus.

I think that if someone wanted to implement this, we would be more than happy to accept the patch. But I don't think that any of the current maintainers are likely to implement it themselves, since we're fairly committed to the REAPI.

huonw commented 2 years ago

If this is referring to something like GET/PUT requests to a cloud blob store, we'd appreciate it: 'running' an S3 bucket is easy and has a limited blast radius, while a custom server brings extra risk and likely an ongoing tail of devops work. I understand that this isn't a priority, but I thought it might still be useful to provide an additional perspective, and if I had pointers about where to start, I'd potentially consider attempting an implementation.

In particular, as soon as we're having to run a server as a critical piece of our software supply chain/build infrastructure, it's something we have to administer, monitor and keep up to date (as well as potentially needing to juggle and secure additional sets of auth credentials) and, more fundamentally, trust. Whereas an S3 bucket (or equivalent in GCS or B2 or ...) doesn't require us to be running any potentially-exploitable custom code, and typically a team will already have to be managing AWS credentials (or GCP or ...) securely for access to other services.
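To make the "GET/PUT requests to a blob store" idea concrete, here is a minimal sketch of what such a client could look like. It is not Pants code: `HttpBlobCache`, `blob_key`, and the "one URL per SHA-256 digest" layout are hypothetical names and choices made for illustration, and it assumes an endpoint that accepts plain or presigned HTTP requests (a real S3 or GCS bucket would also need request signing, and may answer a missing key with something other than 404).

```python
import hashlib
import urllib.error
import urllib.request
from typing import Optional


def blob_key(content: bytes) -> str:
    """Content-address a blob by its SHA-256 hex digest (hypothetical key scheme)."""
    return hashlib.sha256(content).hexdigest()


class HttpBlobCache:
    """Hypothetical cache client: one GET or PUT per blob under a base URL."""

    def __init__(self, base_url: str, timeout: float = 10.0) -> None:
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout

    def _url(self, key: str) -> str:
        return f"{self.base_url}/{key}"

    def put(self, content: bytes) -> str:
        """Upload the blob with PUT and return the key it was stored under."""
        key = blob_key(content)
        request = urllib.request.Request(self._url(key), data=content, method="PUT")
        with urllib.request.urlopen(request, timeout=self.timeout):
            pass  # Any 2xx response counts as success; errors raise HTTPError.
        return key

    def get(self, key: str) -> Optional[bytes]:
        """Fetch the blob with GET; treat 404 as a cache miss, re-raise other errors."""
        try:
            with urllib.request.urlopen(self._url(key), timeout=self.timeout) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            if error.code == 404:
                return None
            raise
```

The point of the sketch is that the client needs nothing beyond two HTTP verbs and a deterministic key, which is why no custom server has to be run for it.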

I note that various other similar tools have 'simple' remote caching (with various subsets of particular cloud providers supported) in addition to varying degrees of remote execution:

| tool | docs | blob stores supported | remote execution? |
| --- | --- | --- | --- |
| Bazel | https://bazel.build/docs/remote-caching#cloud-storage | GCS | yes |
| Rush | https://rushjs.io/pages/maintainer/build_cache/#enabling-cloud-storage | Azure, S3 | via BuildXL |
| Lage | https://microsoft.github.io/lage/docs/Guide/remote-cache/ | Azure | via BuildXL |
| Gradle | https://docs.gradle.org/current/userguide/build_cache.html | plugins: S3, GCS, ... | partially? |
| Nx* | https://nx.app/docs/distributed-caching | Nx Cloud | yes |
| Turborepo* | https://turborepo.org/docs/core-concepts/remote-caching | Vercel | no? |

(Nx and Turborepo seem to only support connecting to a service associated with the tool, but this has a DX/devops/trust profile closer to a generic cloud blob storage than running a REAPI server manually, although not quite the same.)

ericlacher commented 1 year ago

@huonw I totally agree with your statement above, and I just realized that you have already started on preparatory changes. Just wanted to share my appreciation for the effort!

@michaloleszak FYI

Xuanwo commented 1 year ago

> I note that various other similar tools have 'simple' remote caching (with various subsets of particular cloud providers supported) in addition to varying degrees of remote execution:

Nice post! Here are some notes from the OpenDAL side that may be worth considering as input for you.

Apart from those tools' own native cache services, storage services such as S3, azblob, and GCS are all already supported by OpenDAL. Please let us know if you want more :laughing: