moonrepo / moon

A build system and monorepo management tool for the web ecosystem, written in Rust.
https://moonrepo.dev/moon
MIT License

[feature] Support Remote Execution API for caching #1520

Open rhuanbarreto opened 3 months ago

rhuanbarreto commented 3 months ago

Is your feature request related to a problem? Please describe.

Although moonbase has a caching service, for regulatory reasons we cannot store cached artifacts outside our own domain.

Many other monorepo tools, like Bazel, Pants, and Rush, let you use your own storage backend for caching artifacts.

On the other hand, caching the .moon/cache folder in GitHub Actions doesn't help much either, since GitHub's cache size limits are too low.

Describe the solution you'd like

I would like to have a config option so I can self-host my own cached artifacts, in Azure Blob Storage for example. If this requires running a separate container for the service, like https://github.com/buchgr/bazel-remote, that's fine.

Describe alternatives you've considered

For now, using moonbase is actually hard, as it creates a dependency on a service outside our domain. So the only alternative is using GitHub / Azure DevOps pipeline caching.

milesj commented 3 months ago

I've been working on making moonbase self-hostable, but while doing so, I've had thoughts of just reworking it into a generic remote caching server. I keep going back and forth on which approach would be better. Either way, it's a lot for me to maintain at the moment.

rhuanbarreto commented 3 months ago

No rush at all! Very important to have but also not the top priority right now.

One suggestion to save some development work: you can leverage bazel-remote right away and avoid rebuilding an abstraction that has almost become an industry standard. This will also be a big plus for moonrepo on the monorepo.tools website.

The REAPI (Remote Execution API) is a gRPC/Protobuf API in which bazel-remote streams the cached blobs back to the client, which saves a lot of back and forth.
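To make the addressing scheme a bit more concrete, here is a minimal sketch of how REAPI identifies blobs: by a SHA-256 hash plus a size in bytes, combined into ByteStream resource names. The helper names (`digest_of`, `read_resource_name`, `write_resource_name`) and the instance name `main` are made up for illustration. This is not moon's or Pants' code, and it only covers uncompressed blobs.

```python
import hashlib
import uuid
from typing import NamedTuple, Optional


class Digest(NamedTuple):
    """Simplified mirror of the REAPI Digest message: hex SHA-256 plus size."""
    hash: str
    size_bytes: int


def digest_of(data: bytes) -> Digest:
    # Content-addressable storage keys a blob by the hash of its bytes.
    return Digest(hashlib.sha256(data).hexdigest(), len(data))


def _prefix(instance: str) -> str:
    # Assumption: an empty instance name means no leading path segment.
    return f"{instance}/" if instance else ""


def read_resource_name(instance: str, d: Digest) -> str:
    # Resource name a client would pass to ByteStream.Read.
    return f"{_prefix(instance)}blobs/{d.hash}/{d.size_bytes}"


def write_resource_name(instance: str, d: Digest,
                        upload_id: Optional[str] = None) -> str:
    # Resource name a client would pass to ByteStream.Write.
    # The upload id is a client-chosen UUID identifying this upload attempt.
    upload_id = upload_id or str(uuid.uuid4())
    return f"{_prefix(instance)}uploads/{upload_id}/blobs/{d.hash}/{d.size_bytes}"


if __name__ == "__main__":
    d = digest_of(b"hello moon")
    print(read_resource_name("main", d))
    print(write_resource_name("main", d))
```

Because the key is derived from the content, any client that computes the same digest can reuse the same blob without coordinating with whoever uploaded it.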

One Rust implementation is in Pants, in this file: https://github.com/pantsbuild/pants/blob/main/src/rust/engine/process_execution/src/cache.rs

Hope you can find a way! It would be very beneficial to all the community.

milesj commented 3 months ago

Yeah, agreed. I've also thought about piggybacking off of Bazel's APIs. Might as well.

dudicoco commented 1 month ago

Maybe the remote cache can be implemented on the client side instead of having to use a server? That way the client can directly read/write the cache from blob storage.

rhuanbarreto commented 1 month ago

By using bazel-remote we do this. But moon must support this as the source for finding the cache hits and hydrating the state.

milesj commented 1 month ago

I've briefly looked into this, and I will be moving to bazel's APIs, since they also offer action caching which I'll need in the future. Just need to find the time to integrate it. If anyone else wants to tackle it, let me know.
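For anyone following along, this is roughly how the two halves fit together: the action cache maps a key for "this command with these inputs" to a list of output digests, and the content-addressable store (CAS) holds the actual bytes. The sketch below is an in-memory toy model under that assumption. `InMemoryRemoteCache` and `action_key` are hypothetical names, and the JSON-based key is a simplification (the real API hashes a serialized Action message).

```python
import hashlib
import json
from typing import Dict, List, Optional, Tuple


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def action_key(command: List[str], input_hashes: Dict[str, str]) -> str:
    # Simplified stand-in for an action digest: hash the command and inputs.
    payload = json.dumps({"command": command, "inputs": input_hashes},
                         sort_keys=True)
    return sha256_hex(payload.encode("utf-8"))


class InMemoryRemoteCache:
    """Toy model of an action cache sitting in front of a CAS."""

    def __init__(self) -> None:
        self.cas: Dict[str, bytes] = {}                         # blob hash -> bytes
        self.action_cache: Dict[str, List[Tuple[str, str]]] = {}  # key -> outputs

    def store(self, key: str, outputs: Dict[str, bytes]) -> None:
        entries = []
        for path, content in sorted(outputs.items()):
            blob_hash = sha256_hex(content)
            self.cas[blob_hash] = content      # upload the blob first
            entries.append((path, blob_hash))
        self.action_cache[key] = entries       # then publish the result

    def hydrate(self, key: str) -> Optional[Dict[str, bytes]]:
        entries = self.action_cache.get(key)
        if entries is None:
            return None                        # cache miss
        restored = {}
        for path, blob_hash in entries:
            content = self.cas.get(blob_hash)
            if content is None:
                return None                    # evicted blob: treat as a miss
            restored[path] = content
        return restored
```

A client would compute the key before running a task, call `hydrate`, and only execute the task (followed by `store`) when the result is `None`.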

dudicoco commented 1 month ago

> By using bazel-remote we do this. But moon must support this as the source for finding the cache hits and hydrating the state.

Can you elaborate? Doesn't bazel-remote require a server?

rhuanbarreto commented 1 month ago

Yes. We run a bazel-remote container backed by Azure Blob Storage. We connect to bazel-remote over an mTLS connection. We use this today with Pants. If moon could support the same, we wouldn't need many different places for managing this cache.
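As an illustration of what the client side of such an mTLS setup can look like, here is a minimal sketch using only Python's standard library. The file paths, host, and port are placeholders, the function names are made up, and it talks to bazel-remote's HTTP endpoint (`GET /cas/<sha256>`) rather than gRPC to stay dependency-free. It is not how Pants or moon actually connect.

```python
import http.client
import ssl
from typing import Optional


def build_mtls_client_context(ca_file: str, cert_file: str,
                              key_file: str) -> ssl.SSLContext:
    # Trust only the given CA when verifying the server certificate.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Present our own certificate so the server can verify the client too.
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def fetch_cas_blob(host: str, port: int, ctx: ssl.SSLContext,
                   sha256_hex: str) -> Optional[bytes]:
    # Returns the blob bytes on a hit, or None on any non-200 response.
    conn = http.client.HTTPSConnection(host, port, context=ctx, timeout=10)
    try:
        conn.request("GET", f"/cas/{sha256_hex}")
        resp = conn.getresponse()
        body = resp.read()
        return body if resp.status == 200 else None
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder paths and host; replace with real values before running.
    context = build_mtls_client_context("ca.pem", "client.pem", "client.key")
    empty = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    print(fetch_cas_blob("cache.internal.example", 9090, context, empty))
```

The server side must also be configured to require and verify client certificates, otherwise this is plain one-way TLS.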

dudicoco commented 1 month ago

@rhuanbarreto I still don't understand your point.

My suggestion was to have the client make direct API calls to the blob storage (S3 etc.) instead of communicating with a server which has to be deployed and maintained. In addition a server would require another authentication and authorization mechanism for the clients, which you would get out of the box with IAM permissions for a client based solution.

So I still don't see the advantage of having a server based solution which adds extra complexity and overhead.
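To illustrate the client-side variant being suggested here, a minimal sketch: the client talks to a key/value blob store directly, with no cache server in between. A local directory stands in for the bucket so the example runs without cloud credentials. `DirectoryBlobStore`, `ClientSideCache`, and the `<task_hash>.tar.gz` key layout are all hypothetical, and a real S3 or Azure backend would replace `put`/`get` with SDK calls authorized through IAM.

```python
import tempfile
from pathlib import Path
from typing import Optional


class DirectoryBlobStore:
    """Stand-in for a bucket: one file per key under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        # Write to a temp file, then rename, so readers never see partial data.
        tmp = self.root / (key + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(self.root / key)

    def get(self, key: str) -> Optional[bytes]:
        path = self.root / key
        return path.read_bytes() if path.exists() else None


class ClientSideCache:
    """Reads and writes task archives directly against a blob store."""

    def __init__(self, store: DirectoryBlobStore) -> None:
        self.store = store

    def upload(self, task_hash: str, archive: bytes) -> str:
        key = f"{task_hash}.tar.gz"  # hypothetical key layout
        self.store.put(key, archive)
        return key

    def download(self, task_hash: str) -> Optional[bytes]:
        return self.store.get(f"{task_hash}.tar.gz")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        cache = ClientSideCache(DirectoryBlobStore(Path(tmp_dir)))
        cache.upload("abc123", b"fake archive bytes")
        print(cache.download("abc123") is not None)
        print(cache.download("missing") is None)
```

The trade-off versus a server is that every client needs storage credentials and each storage provider needs its own backend, while a server hides both behind a single protocol.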

milesj commented 2 weeks ago

Good news: a new Rust crate recently popped up that does a lot of the heavy lifting for the Bazel remote APIs. https://github.com/amkartashov/bazel-remote-apis-rust

Will give this a shot for the next release.

rhuanbarreto commented 2 weeks ago

OMG! Great news! If you need an alpha tester, you know where to find me.

One small request: make sure moon can support mTLS connections. htpasswd is too unsafe.