howardjohn opened this issue 3 years ago
FWIW I tried to leverage MultiWrite in #238 and evidence suggested it was slower, since it required builds to finish before publishing.
I would believe building all binaries at once is faster, but it means we sit idle on pushing until all builds complete, which might eat all our savings.
If you're interested in hacking on this, it would be useful to get an idea of whether, and by how much, it saves real time.
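For reference, `MultiWrite` here is go-containerregistry's `remote.MultiWrite`, which pushes a set of images to a registry in one call. Below is a minimal sketch of the shape of that call -- the registry, tags, and `random.Image` stand-ins are hypothetical, not ko's actual publish path:

```go
package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/random"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

func main() {
	// Stand-in images; in ko these would be the per-import-path build results.
	img1, err := random.Image(1024, 1)
	if err != nil {
		log.Fatal(err)
	}
	img2, err := random.Image(1024, 1)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical destination tags.
	tag1, err := name.NewTag("registry.example.com/repo/pilot-discovery:latest")
	if err != nil {
		log.Fatal(err)
	}
	tag2, err := name.NewTag("registry.example.com/repo/operator:latest")
	if err != nil {
		log.Fatal(err)
	}

	// One call pushes everything and can deduplicate shared blobs across
	// images -- but it can't start until every input image exists, which is
	// the "builds must finish before publishing" trade-off mentioned above.
	if err := remote.MultiWrite(map[name.Reference]remote.Taggable{
		tag1: img1,
		tag2: img2,
	}); err != nil {
		log.Fatal(err)
	}
}
```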
Yeah this seems like a great change to make. It's a little bit harder than I'd like because of multi-arch builds and how the code is currently structured, but we might be able to work around that a bit...
The way things are currently laid out, we call `Build("import-path")` as soon as we come across something that needs to be built. We have a "caching" builder, so that we don't build anything twice, and we have a "rate limiting" builder to limit the number of concurrent `go build`s (helpful if you don't have enough CPU), which corresponds to a `-j` flag.
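For illustration only -- these are made-up types, not ko's actual builder interface -- layering a caching wrapper and a concurrency-limiting wrapper around a shared interface looks roughly like this, with the channel capacity playing the role of `-j`:

```go
package build

import (
	"context"
	"sync"
)

// Builder is a stand-in for ko's build interface: one call per import path,
// returning some artifact (here just a path string for simplicity).
type Builder interface {
	Build(ctx context.Context, importpath string) (string, error)
}

// caching wraps another Builder so each import path is only built once.
// (This simplified version doesn't deduplicate concurrent in-flight calls.)
type caching struct {
	inner   Builder
	mu      sync.Mutex
	results map[string]string
}

func newCaching(inner Builder) *caching {
	return &caching{inner: inner, results: map[string]string{}}
}

func (c *caching) Build(ctx context.Context, ip string) (string, error) {
	c.mu.Lock()
	if r, ok := c.results[ip]; ok {
		c.mu.Unlock()
		return r, nil
	}
	c.mu.Unlock()

	r, err := c.inner.Build(ctx, ip)
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.results[ip] = r
	c.mu.Unlock()
	return r, nil
}

// limited wraps another Builder and bounds the number of concurrent builds,
// which is what a -j flag would control.
type limited struct {
	inner Builder
	slots chan struct{} // capacity == -j
}

func newLimited(inner Builder, j int) *limited {
	return &limited{inner: inner, slots: make(chan struct{}, j)}
}

func (l *limited) Build(ctx context.Context, ip string) (string, error) {
	select {
	case l.slots <- struct{}{}:
		defer func() { <-l.slots }()
	case <-ctx.Done():
		return "", ctx.Err()
	}
	return l.inner.Build(ctx, ip)
}
```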
Beyond that, a single `Build("import-path")` can actually map to multiple `go build` invocations -- one per platform from the base image (and/or the `--platform` flag). These are unavoidable -- we'll need at least one `go build` invocation per platform, even if we can batch them across build targets.
For `--platform=all`, we actually don't know ahead of time which platforms we need to build, so this is somewhat tricky. To know all the target platforms, we have to fetch the base image for a given build target from a registry, which depends on a config file. This can take ~seconds, so blocking all builds while we try to figure out all the necessary platforms might actually slow things down a bit.
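Here is a rough sketch of that registry round-trip, assuming go-containerregistry (which ko uses); the base image reference is just an example:

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// basePlatforms fetches the base image's index and returns the platforms it
// supports -- the set of platforms --platform=all would have to build for.
func basePlatforms(base string) ([]v1.Platform, error) {
	ref, err := name.ParseReference(base)
	if err != nil {
		return nil, err
	}
	idx, err := remote.Index(ref) // network call; this is the ~seconds of latency
	if err != nil {
		return nil, err
	}
	manifest, err := idx.IndexManifest()
	if err != nil {
		return nil, err
	}
	var platforms []v1.Platform
	for _, desc := range manifest.Manifests {
		if desc.Platform != nil {
			platforms = append(platforms, *desc.Platform)
		}
	}
	return platforms, nil
}

func main() {
	ps, err := basePlatforms("gcr.io/distroless/static:nonroot")
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range ps {
		fmt.Printf("%s/%s\n", p.OS, p.Architecture)
	}
}
```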
We'll want to fix https://github.com/google/ko/issues/192 as well...
I'm also a bit worried about streaming use cases -- it may be the case that we are reading YAML from stdin that never "ends" because of something like our own `--watch` mode, so we can't just naively batch everything together.
> warm builds take ~8s compared to ~23s for `ko`
My optimizations in #267 and #269 are mostly around no-op builds, which would help for this case, but what you're proposing would help a lot in general. This might be what @dprotaso was getting at in an earlier conversation, and I just didn't understand.
My initial thoughts on a solution for this would look something like:

1. Introduce a "debouncing" builder alongside the caching and rate-limiting builders (the latter corresponds to the `-j` flag). This debouncer can batch together all the `go build` invocations so that we don't shell out quite as much (a rough sketch of this batching follows below). We'll need to have multiple "batches" of builds per environment, since `GOOS` and `GOARCH` need to be different for each `go build` invocation.
2. Do something similar for publishing, via `remote.MultiWrite` (#236). For some publishers (like tarball), we have to bundle things together already, so they get flushed during `Close()`, but we can't reuse that strategy for registry pushing.
3. Cache layers to make the `Build` step faster. This will get us most of the way to fixing https://github.com/google/ko/issues/295 -- and we can make the layer caching stuff configurable, if we want to support offline modes.

One final thing I'm worried about is name collisions. For example, if we have to build `./knative/operator` and `./tekton/operator`, I'm not sure how `go build -o $TMP` handles that. From some basic testing, it seems to produce a single binary, so this would break if we naively batched all builds like this.
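To make the batching idea in point 1 concrete, here's a rough sketch (not ko's code) that groups pending import paths by `GOOS`/`GOARCH` and shells out to a single `go build` per batch. It assumes a Go toolchain where `-o` can name an output directory, uses placeholder import paths, and doesn't solve the name-collision caveat above:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

type platform struct{ goos, goarch string }

// batcher collects import paths per platform so they can be built together.
type batcher struct {
	pending map[platform][]string
}

func (b *batcher) enqueue(p platform, importpath string) {
	if b.pending == nil {
		b.pending = map[platform][]string{}
	}
	b.pending[p] = append(b.pending[p], importpath)
}

// flush runs one `go build` per platform batch instead of one per import path.
func (b *batcher) flush() error {
	for p, importpaths := range b.pending {
		outDir := filepath.Join(os.TempDir(), "ko-batch", p.goos+"_"+p.goarch)
		if err := os.MkdirAll(outDir, 0o755); err != nil {
			return err
		}
		args := append([]string{"build", "-o", outDir}, importpaths...)
		cmd := exec.Command("go", args...)
		cmd.Env = append(os.Environ(), "GOOS="+p.goos, "GOARCH="+p.goarch, "CGO_ENABLED=0")
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("batched go build for %v failed: %w", p, err)
		}
	}
	b.pending = nil
	return nil
}

func main() {
	var b batcher
	b.enqueue(platform{"linux", "amd64"}, "./pilot/cmd/pilot-discovery")
	b.enqueue(platform{"linux", "amd64"}, "./operator/cmd/operator")
	b.enqueue(platform{"linux", "arm64"}, "./istioctl/cmd/istioctl")
	if err := b.flush(); err != nil {
		log.Fatal(err)
	}
}
```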
Oh, one more thing: from inside `ko resolve` or `ko apply`, we are doing multiple concurrent builds, which doesn't map exactly to the example you've shown of three separate `go build` invocations.
There is definitely a lot of "waste" from having multiple `go build` invocations as opposed to having `go` manage these, but I think this is a closer comparison to what `ko` is doing:
```
# Concurrently
$ time sh -c '
for importpath in ./pilot/cmd/pilot-discovery/ ./operator/cmd/operator/ ./istioctl/cmd/istioctl; do
  go build -o /dev/null $importpath &
done;
wait $(jobs -p)
'

real    0m10.079s
user    0m37.146s
sys     0m9.592s
```

vs.

```
# Serially
$ time sh -c '
for importpath in ./pilot/cmd/pilot-discovery/ ./operator/cmd/operator/ ./istioctl/cmd/istioctl; do
  go build -o /dev/null $importpath
done;
'

real    0m26.051s
user    0m44.153s
sys     0m14.593s
```

vs.

```
# Within go build
$ time go build -o /dev/null ./pilot/cmd/pilot-discovery ./operator/cmd/operator ./istioctl/cmd/istioctl

real    0m8.587s
user    0m29.541s
sys     0m5.744s
```
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.
/lifecycle frozen
I think there are some gains to be made - though non-obvious how to approach it atm.
Ok - I have no power here lol
Compiling multiple binaries in a single `go` command is much faster than compiling them individually. For example, both as incremental builds:
Along with https://github.com/google/ko/issues/236, this would improve the speed of `ko` quite a bit, I think.

Currently, building these 3 binaries in parallel and then building and pushing in parallel with `docker buildx bake`, warm builds take ~8s compared to ~23s for `ko`.