howardjohn opened this issue 3 years ago
FWIW I tried to leverage MultiWrite in #238 and evidence suggested it was slower, since it required builds to finish before publishing.
I would believe building all binaries at once is faster, but it means we sit idle on pushing until all builds complete, which might eat all our savings.
If you're interested in hacking on this, it would be useful to get an idea of whether, and by how much, it saves real time.
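For reference, `MultiWrite` here is go-containerregistry's `remote.MultiWrite`, which pushes a set of images to a registry in one call. Below is a minimal sketch of the shape of that call -- the registry, tags, and `random.Image` stand-ins are hypothetical, not ko's actual publish path:

```go
package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/random"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

func main() {
	// Stand-in images; in ko these would be the per-import-path build results.
	img1, err := random.Image(1024, 1)
	if err != nil {
		log.Fatal(err)
	}
	img2, err := random.Image(1024, 1)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical destination tags.
	tag1, err := name.NewTag("registry.example.com/repo/pilot-discovery:latest")
	if err != nil {
		log.Fatal(err)
	}
	tag2, err := name.NewTag("registry.example.com/repo/operator:latest")
	if err != nil {
		log.Fatal(err)
	}

	// One call pushes everything and can deduplicate shared blobs across
	// images -- but it can't start until every input image exists, which is
	// the "builds must finish before publishing" trade-off mentioned above.
	if err := remote.MultiWrite(map[name.Reference]remote.Taggable{
		tag1: img1,
		tag2: img2,
	}); err != nil {
		log.Fatal(err)
	}
}
```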
Yeah this seems like a great change to make. It's a little bit harder than I'd like because of multi-arch builds and how the code is currently structured, but we might be able to work around that a bit...
The way things are currently laid out, we call `Build("import-path")` as soon as we come across something that needs to be built. We have a "caching" builder, so that we don't build anything twice, and we have a "rate limiting" builder to limit the number of concurrent `go build`s (helpful if you don't have enough CPU), which corresponds to a `-j` flag.
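For illustration only -- these are made-up types, not ko's actual builder interface -- layering a caching wrapper and a concurrency-limiting wrapper around a shared interface looks roughly like this, with the channel capacity playing the role of `-j`:

```go
package build

import (
	"context"
	"sync"
)

// Builder is a stand-in for ko's build interface: one call per import path,
// returning some artifact (here just a path string for simplicity).
type Builder interface {
	Build(ctx context.Context, importpath string) (string, error)
}

// caching wraps another Builder so each import path is only built once.
// (This simplified version doesn't deduplicate concurrent in-flight calls.)
type caching struct {
	inner   Builder
	mu      sync.Mutex
	results map[string]string
}

func newCaching(inner Builder) *caching {
	return &caching{inner: inner, results: map[string]string{}}
}

func (c *caching) Build(ctx context.Context, ip string) (string, error) {
	c.mu.Lock()
	if r, ok := c.results[ip]; ok {
		c.mu.Unlock()
		return r, nil
	}
	c.mu.Unlock()

	r, err := c.inner.Build(ctx, ip)
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.results[ip] = r
	c.mu.Unlock()
	return r, nil
}

// limited wraps another Builder and bounds the number of concurrent builds,
// which is what a -j flag would control.
type limited struct {
	inner Builder
	slots chan struct{} // capacity == -j
}

func newLimited(inner Builder, j int) *limited {
	return &limited{inner: inner, slots: make(chan struct{}, j)}
}

func (l *limited) Build(ctx context.Context, ip string) (string, error) {
	select {
	case l.slots <- struct{}{}:
		defer func() { <-l.slots }()
	case <-ctx.Done():
		return "", ctx.Err()
	}
	return l.inner.Build(ctx, ip)
}
```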
Beyond that, a single `Build("import-path")` can actually map to multiple `go build` invocations -- one per platform from the base image (and/or the `--platform` flag). These are unavoidable -- we'll need at least one `go build` invocation per platform, even if we can batch them across build targets.
For `--platform=all`, we actually don't know ahead of time which platforms we need to build, so this is somewhat tricky. To know all the target platforms, we have to fetch the base image for a given build target from a registry, which depends on a config file. This can take ~seconds, so blocking all builds while we try to figure out all the necessary platforms might actually slow things down a bit.
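Here is a rough sketch of that registry round-trip, assuming go-containerregistry (which ko uses); the base image reference is just an example:

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// basePlatforms fetches the base image's index and returns the platforms it
// supports -- the set of platforms --platform=all would have to build for.
func basePlatforms(base string) ([]v1.Platform, error) {
	ref, err := name.ParseReference(base)
	if err != nil {
		return nil, err
	}
	idx, err := remote.Index(ref) // network call; this is the ~seconds of latency
	if err != nil {
		return nil, err
	}
	manifest, err := idx.IndexManifest()
	if err != nil {
		return nil, err
	}
	var platforms []v1.Platform
	for _, desc := range manifest.Manifests {
		if desc.Platform != nil {
			platforms = append(platforms, *desc.Platform)
		}
	}
	return platforms, nil
}

func main() {
	ps, err := basePlatforms("gcr.io/distroless/static:nonroot")
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range ps {
		fmt.Printf("%s/%s\n", p.OS, p.Architecture)
	}
}
```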
We'll want to fix https://github.com/google/ko/issues/192 as well...
I'm also a bit worried about streaming use cases -- it may be the case that we are reading YAML from stdin that never "ends" because of something like our own `--watch` mode, so we can't just naively batch everything together.
> warm builds take ~8s compared to ~23s for `ko`
My optimizations in #267 and #269 are mostly around no-op builds, which would help for this case, but what you're proposing would help a lot in general. This might be what @dprotaso was getting at in an earlier conversation, and I just didn't understand.
My initial thoughts on a solution for this would look something like:

1. Introduce a "debouncing" builder alongside the caching and rate-limiting builders (the latter corresponds to the `-j` flag). This debouncer can batch together all the `go build` invocations so that we don't shell out quite as much (a rough sketch of this batching follows below). We'll need to have multiple "batches" of builds per environment, since `GOOS` and `GOARCH` need to be different for each `go build` invocation.
2. Do something similar for publishing, via `remote.MultiWrite` (#236). For some publishers (like tarball), we have to bundle things together already, so they get flushed during `Close()`, but we can't reuse that strategy for registry pushing.
3. Cache layers to make the `Build` step faster. This will get us most of the way to fixing https://github.com/google/ko/issues/295 -- and we can make the layer caching stuff configurable, if we want to support offline modes.

One final thing I'm worried about is name collisions. For example, if we have to build `./knative/operator` and `./tekton/operator`, I'm not sure how `go build -o $TMP` handles that. From some basic testing, it seems to produce a single binary, so this would break if we naively batched all builds like this.
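To make the batching idea in point 1 concrete, here's a rough sketch (not ko's code) that groups pending import paths by `GOOS`/`GOARCH` and shells out to a single `go build` per batch. It assumes a Go toolchain where `-o` can name an output directory, uses placeholder import paths, and doesn't solve the name-collision caveat above:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

type platform struct{ goos, goarch string }

// batcher collects import paths per platform so they can be built together.
type batcher struct {
	pending map[platform][]string
}

func (b *batcher) enqueue(p platform, importpath string) {
	if b.pending == nil {
		b.pending = map[platform][]string{}
	}
	b.pending[p] = append(b.pending[p], importpath)
}

// flush runs one `go build` per platform batch instead of one per import path.
func (b *batcher) flush() error {
	for p, importpaths := range b.pending {
		outDir := filepath.Join(os.TempDir(), "ko-batch", p.goos+"_"+p.goarch)
		if err := os.MkdirAll(outDir, 0o755); err != nil {
			return err
		}
		args := append([]string{"build", "-o", outDir}, importpaths...)
		cmd := exec.Command("go", args...)
		cmd.Env = append(os.Environ(), "GOOS="+p.goos, "GOARCH="+p.goarch, "CGO_ENABLED=0")
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("batched go build for %v failed: %w", p, err)
		}
	}
	b.pending = nil
	return nil
}

func main() {
	var b batcher
	b.enqueue(platform{"linux", "amd64"}, "./pilot/cmd/pilot-discovery")
	b.enqueue(platform{"linux", "amd64"}, "./operator/cmd/operator")
	b.enqueue(platform{"linux", "arm64"}, "./istioctl/cmd/istioctl")
	if err := b.flush(); err != nil {
		log.Fatal(err)
	}
}
```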
Oh, one more thing: from inside `ko resolve` or `ko apply`, we are doing multiple concurrent builds, which doesn't map exactly to the example you've shown of three separate `go build` invocations.
There is definitely a lot of "waste" from having multiple `go build` invocations as opposed to having `go` manage these, but I think this is a closer comparison to what `ko` is doing:
```
# Concurrently
$ time sh -c '
for importpath in ./pilot/cmd/pilot-discovery/ ./operator/cmd/operator/ ./istioctl/cmd/istioctl; do
  go build -o /dev/null $importpath &
done;
wait $(jobs -p)
'

real    0m10.079s
user    0m37.146s
sys     0m9.592s
```

vs.

```
# Serially
$ time sh -c '
for importpath in ./pilot/cmd/pilot-discovery/ ./operator/cmd/operator/ ./istioctl/cmd/istioctl; do
  go build -o /dev/null $importpath
done;
'

real    0m26.051s
user    0m44.153s
sys     0m14.593s
```

vs.

```
# Within go build
$ time go build -o /dev/null ./pilot/cmd/pilot-discovery ./operator/cmd/operator ./istioctl/cmd/istioctl

real    0m8.587s
user    0m29.541s
sys     0m5.744s
```
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.
/lifecycle frozen
I think there are some gains to be made - though non-obvious how to approach it atm.
Ok - I have no power here lol
Compiling multiple binaries in a single `go` command is much faster than compiling them individually. For example, both as incremental builds:
Along with https://github.com/google/ko/issues/236, this would improve the speed of `ko` quite a bit, I think.

Currently, building these 3 binaries in parallel and then building and pushing in parallel with `docker buildx bake`, warm builds take ~8s compared to ~23s for `ko`.