golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/go: remote build cache #42785

Open idcmp opened 3 years ago

idcmp commented 3 years ago

What version of Go are you using (go version)?

$ go version
go version go1.15.3 darwin/amd64

What are you suggesting?

Go's build cache (GOCACHE) provides a hash-driven mechanism for storing intermediate compilation results. This hash takes into consideration a number of different variables (including Go versions, architecture, flags, paths, etc). As such, it should be possible to share the cache across different concurrent build jobs (both for developers and CI jobs).
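As a rough illustration of why a hash-driven key makes sharing safe, here is a minimal sketch of a content-addressed cache key. The `actionKey` helper and the input names are hypothetical and are not how cmd/go actually computes its action IDs; the sketch only shows that identical inputs give identical keys and any differing input gives a different key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// actionKey is a hypothetical helper (not part of cmd/go). It hashes every
// input that can affect a build step, so two steps share a key only when
// all of their inputs match.
func actionKey(inputs map[string]string) string {
	names := make([]string, 0, len(inputs))
	for name := range inputs {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order, so the key does not depend on map iteration

	h := sha256.New()
	for _, name := range names {
		// Length prefixes keep ("ab", "c") distinct from ("a", "bc").
		fmt.Fprintf(h, "%d:%s=%d:%s\n", len(name), name, len(inputs[name]), inputs[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Illustrative inputs only; the real cache considers many more.
	a := map[string]string{"goversion": "go1.15.3", "goos": "darwin", "goarch": "amd64"}
	b := map[string]string{"goversion": "go1.15.3", "goos": "linux", "goarch": "amd64"}

	fmt.Println(actionKey(a) == actionKey(a)) // same inputs give the same key
	fmt.Println(actionKey(a) == actionKey(b)) // a different GOOS gives a different key
}
```

Because the key already encodes everything that affects the output, two machines that compute the same key can in principle reuse each other's results.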

There are a lot of different ways to share thousands of small files across machines, but for most people this will involve NFS (AWS EFS, GCP Filestore, etc). NFS is suboptimal for this use case (provisioning IO throughput across many machines is problematic, builds should not be punished for NFS outages, file interactions need to be atomic, metadata (stat) lookups are rarely cached, etc, etc).

Instead, this is a proposal to support URLs in GOCACHE. This would allow CI servers to have a simple, central HTTP server which would accept GETs for cache lookups and PUT/POST for cache writes. For distributed systems, local proxies off the central HTTP server would provide an additional speedup.

Why would it be valuable?

As anyone who has had to interact with NFS in their career would agree, this is a much simpler configuration than trying to mount a shared filesystem.

This would also lead to the possibility of using memcache/etc URLs in the future.

What are you looking for?

What is the process to officially propose this idea and have it accepted? If it were accepted, I would like some design constraints. I could take it from there and put together a prototype.

cagedmantis commented 3 years ago

/cc @bcmills @jayconrod @matloob

jayconrod commented 3 years ago

Some prior art: Bazel supports remote caching over both HTTP- and gRPC-based protocols.

Some of the folks that worked on Bazel RBE prototyped this in the go command last year. We found that this had the potential to speed up long-running tests, since test results can be cached remotely. However, it's detrimental for builds. Most actions performed by the go command are very fast, and the overhead of sending lots of small files over the network tends to dominate any gains we might see.

Implementing remote caching and execution adds a lot of complexity to the go command. Since it would benefit a fairly small number of users, we chose not to go forward with it at that time.

paulgmiller commented 3 years ago

I'd love to get the speedup from caching tests, since that's where we spend a lot of time. Is there a way to identify test cache results so we could slurp them up from GOCACHE and then lay them back down before the next clean build (maybe storing/retrieving them per git commit)?

Having a separate GO_TEST_CACHE might also make this pretty feasible.

or-shachar commented 2 years ago

@jayconrod - is there a public reference to the design and the attempt done by the RBE people?

I wonder if we can still utilize a remote cache for CI builds done on ephemeral machines. For organizations working in a monorepo with cloud-based CI this could be a real game-changer (given that the org does not want to adopt Bazel).

Is there a way to plug into the Go cache so we can replace its implementation with some custom code? That would allow us to offer a remote caching implementation while keeping the Go code slim and lean.

rbalint commented 1 year ago

I'd also be interested in plugging into Go's caching. https://github.com/firebuild/firebuild implements a generic process cache that works with many compilers and reproducible commands, but not Go, because Go's build insists on using its internal caching. firebuild is a local cache that can be fine-tuned to not cache quick operations or to filter by command.

ianlancetaylor commented 1 year ago

See the proposal at #59719, which pulls most of the work out of cmd/go.

AlekSi commented 3 weeks ago

See #64876 for the follow-up.