Open stuhood opened 4 years ago
It is possible. But this cache would not be compatible with remote execution, which would make it very much a split focus.
I think that if someone wanted to implement this, we would be more than happy to accept the patch. But I don't think that any of the current maintainers are likely to implement it themselves, since we're fairly committed to the REAPI.
If this is referring to something like `GET`/`PUT` requests to a cloud blob store, we'd appreciate it, as 'running' an S3 bucket is easy and has a limited blast radius, while a custom server brings extra risk and likely has an ongoing tail of devops work. I understand that this isn't a priority, but I thought it still might be useful to provide an additional perspective, and if I had pointers about where to start, I'd potentially consider attempting an implementation.
In particular, as soon as we have to run a server as a critical piece of our software supply chain/build infrastructure, it's something we have to administer, monitor and keep up to date (as well as potentially needing to juggle and secure additional sets of auth credentials) and, more fundamentally, trust. By contrast, an S3 bucket (or equivalent in GCS or B2 or ...) doesn't require us to run any potentially-exploitable custom code, and a team will typically already be managing AWS credentials (or GCP or ...) securely for access to other services.
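To make the `GET`/`PUT` idea concrete, here is a minimal sketch of what a blob-store-backed cache could look like. It is not how Pants or the REAPI actually works: `BlobStore`, `InMemoryBlobStore`, `RemoteCache`, and the `cas/` key prefix are all hypothetical names chosen for illustration, and the in-memory dict stands in for a real bucket that would be reached over HTTP.

```python
import hashlib
from typing import Dict, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical interface: the only two operations the cache relies on."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, data: bytes) -> None: ...


class InMemoryBlobStore:
    """Stand-in for a bucket; a real backend would issue HTTP GET/PUT requests."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def digest_key(data: bytes) -> str:
    """Derive the object key from the content itself (assumed key layout)."""
    return f"cas/{hashlib.sha256(data).hexdigest()}"


class RemoteCache:
    """Content-addressed cache layered on any object with get/put."""

    def __init__(self, store: BlobStore) -> None:
        self._store = store

    def store_blob(self, data: bytes) -> str:
        key = digest_key(data)
        # Skip the upload if an object with this digest already exists.
        if self._store.get(key) is None:
            self._store.put(key, data)
        return key

    def load_blob(self, key: str) -> Optional[bytes]:
        data = self._store.get(key)
        # Treat a missing or corrupted object as a cache miss.
        if data is None or digest_key(data) != key:
            return None
        return data
```

Because the key is derived from the content, the client can verify anything it downloads without trusting the store, which fits the point above about not wanting to trust custom server code.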
I note that various other similar tools have 'simple' remote caching (with various subsets of particular cloud providers supported) in addition to varying degrees of remote execution:
| tool | docs | blob stores supported | remote execution? |
|---|---|---|---|
| Bazel | https://bazel.build/docs/remote-caching#cloud-storage | GCS | yes |
| Rush | https://rushjs.io/pages/maintainer/build_cache/#enabling-cloud-storage | Azure, S3 | via BuildXL |
| Lage | https://microsoft.github.io/lage/docs/Guide/remote-cache/ | Azure | via BuildXL |
| Gradle | https://docs.gradle.org/current/userguide/build_cache.html | plugins: S3, GCS, ... | partially? |
| Nx* | https://nx.app/docs/distributed-caching | Nx Cloud | yes |
| Turborepo* | https://turborepo.org/docs/core-concepts/remote-caching | Vercel | no? |
(Nx and Turborepo seem to only support connecting to a service associated with the tool, but this has a DX/devops/trust profile closer to a generic cloud blob storage than running a REAPI server manually, although not quite the same.)
@huonw I totally agree with your statement above and just realized you've already started on preparatory changes. Just wanted to share my appreciation for the effort!
@michaloleszak FYI
> I note that various other similar tools have 'simple' remote caching (with various subsets of particular cloud providers supported) in addition to varying degrees of remote execution:
Nice post! Here are some notes from the OpenDAL side that may be worth considering as input for you.
Apart from those tools' own native cache services, storage services such as S3, azblob, and GCS are all already supported by OpenDAL. Please let us know if you want more :laughing:
Is this technically feasible, given the nature of v2 caching and the Remote Execution API?