Open negz opened 1 month ago
Something that didn't occur to me until today - do we want to back port this change to our currently maintained release branches? If not, we'll be working with two build systems.
Perhaps that's okay while we run the experiment? We could update the release branches if and when we commit to Earthly.
I've seen https://github.com/earthly/earthly/issues/3736 once or twice today (but not before).
It seems like we can run with --disable-remote-registry-proxy
to workaround this if it happens a lot. It's unclear to me what the downside of running without the remote registry proxy is.
Here's an example of a PR with a full remote registry cache hit.
You can see that the lint, check-diff and unit-test jobs all complete in less than a minute. This is because none of the build inputs changed between the latest master build and this PR. Earthly knows there's no point re-running them, so it skips them just like a docker build
would with unchanged inputs.
On the other hand, the codeql, e2e, and publish-artifacts jobs don't benefit much in this run.
I'm pretty confident the publish-artifacts job doesn't benefit from cache because one of its inputs is CROSSPLANE_VERSION
, which includes the current git commit. So any new git commit means the build inputs change.
~It's not clear to me why the codeql and e2e test jobs run, though. It looks like the build step of the e2e tests is cached, but not the step that actually runs the tests. I would have expected them both to be fully cached.~
Edit: I figured out why CodeQL and E2E tests weren't being cached. It was also due to CROSSPLANE_VERSION
. See https://github.com/crossplane/crossplane/pull/5755.
Just realized we need to teach Renovate about earthly
- https://github.com/crossplane/crossplane/pull/5753#issuecomment-2136434340
The E2E tests should be writing e2e-tests.xml
for Buildpulse when they fail, but they don't.
https://github.com/crossplane/crossplane/blob/ed0fb98cc8d19a96229caf460f2b0950d583438f/Earthfile#L62
I think this error has something to do with it:
unlazy force execution: process "/bin/sh -c EARTHLY_DOCKERD_DATA_ROOT=\"/var/earthly/dind/944cc3b666f2b62bc87a5159517f6f8d1619489218402a5931707a886c2d7dea\" EARTHLY_DOCKERD_CACHE_DATA=\"false\" EARTHLY_DOCKER_LOAD_FILES=\"\" EARTHLY_IMAGES_WITH_DIGESTS=\"crossplane-e2e/crossplane:latest@sha256:2427e9fcd06f0f5efde8c7157db825d08402e2239ce69d457272f9c96f3bd182\" EARTHLY_START_COMPOSE=\"false\" EARTHLY_COMPOSE_FILES=\"\" EARTHLY_COMPOSE_SERVICES=\"\" FLAGS='-test.failfast -fail-fast --test-suite ssa-claims' GOLANG_VERSION=1.22.3 GOPATH=/go GOTOOLCHAIN=local GO_VERSION=1.22.3 PATH=/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin /var/earthly/dockerd-wrapper.sh execute /usr/bin/earth_debugger /bin/sh -c 'gotestsum --no-color=false --format testname --junitfile e2e-tests.xml --raw-command go tool test2json -t -p E2E ./e2e -test.v ${FLAGS}'" did not complete successfully: exit code: 1
I notice this happens every time an E2E test fails, but not when other things (e.g. earthly +lint
) fail. I imagine it's something to do with WITH DOCKER
.
Edit: Opened https://github.com/earthly/earthly/issues/4173 for clarification.
If we could somehow inject this version after the binary was built it would help a lot with build caching.
No idea how to do that though. Maybe patch the binary? For the OCI image we could just set an env var or inject a file, but that doesn't help with the binaries, especially crank
which is run outside a container.
If we could somehow inject this version after the binary was built it would help a lot with build caching.
Not quite what you are asking for, but you can move the arg down below the copy and it should save you redoing the COPY when the version changes but not the files. That may never happen, but the general principle of pushing the args down to right before they are used might be helpful in general.
@adamgordonbell Thanks! Unfortunately the bulk of the time is definitely spent in the RUN go build
step, since we compile every PR for all supported platforms. Since we want to compile the version into the binaries, and we want to derive the version from the git commit it seems like a rebuild when the git commit changes is probably unavoidable.
@jbw976 I do wonder whether we really need to compile for every architecture on every PR, though. I presume it's only going to catch fairly rare cases where some change affects only certain platforms.
That said, the publish artifacts job is only really going to hold up the build in cases where nothing has changed and every other job (except fuzz testing 😒) is a no-op. For any PR that actually touches Go code the E2E tests are going to take longer. So maybe not worth optimizing for.
I've seen https://github.com/actions/runner-images/issues/7897 maybe 5-6 times since we made the switch yesterday. Anecdotally mostly on E2E tests, but also once on codeql and once or twice on publish-artifacts. It seems like it might be a symptom of using too many compute resources for the runner. Unclear whether we're computing harder (more parallelism?) with the switch to Earthly, or if it's just coincidental timing and something's going on with GitHub Actions.
Renovate can't run
earthly
I setup our Renovate GitHub Action to use https://github.com/earthly/actions-setup, but noticed Renovate was complaining it couldn't run earthly
. Turns out our Renovate action is running renovate
inside a Docker container.
It's also not going to find an Earthfile
on the release branches. 🙃
As a feedback, I wanted to note that Earthly had once switched from the original MPL-2.0 to BSL, before reversing course (see license history). The design doc mentions the latter, but not the former. I wanted to let everyone know, in case it might affect our decision.
@mergenci Thanks! Yes, that was noted during the proposal: https://github.com/crossplane/crossplane/blob/93610dc7e7877f/design/one-pager-build-with-earthly.md#risks
Not an issue, just surprising: https://docs.earthly.dev/docs/earthfile#options-4 - the _output
binary crank
and crossplane
gets timestamp Apr 16 2020
@chlunde Thanks! I'm not opposed to setting that flag. I wonder if there are any downsides?
Was just looking into installing latest from master
for https://github.com/crossplane/crossplane/issues/5151#issuecomment-2140753505, and made a couple observations @negz - let me know if you'd like me to dig into any of them!
master
channel with helm in https://docs.crossplane.io/latest/software/install/#install-pre-release-crossplane-versions are failing with the following error:
❯ helm install crossplane \
--namespace crossplane-system \
--create-namespace crossplane-master/crossplane \
--devel --debug
install.go:178: [debug] Original chart version: "" install.go:180: [debug] setting version to >0.0.0-0 Error: INSTALLATION FAILED: failed to fetch https://charts.crossplane.io/crossplane-1.17.0-rc.0.175.g93610dc7.tgz : 404 Not Found helm.go:84: [debug] failed to fetch https://charts.crossplane.io/crossplane-1.17.0-rc.0.175.g93610dc7.tgz : 404 Not Found
* https://marketplace.upbound.io/account/crossplane/crossplane has what looks to be PR builds being published to that repo. Not sure if that is new behavior, but in my mind at least I thought we were only publishing builds from `master` to there.
* e.g. `v1.17.0-rc.0.177.g8122f120` corresponds to https://github.com/crossplane/crossplane/commit/8122f120, which says "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."
Thanks @jbw976!
instructions to install from master channel with helm in https://docs.crossplane.io/latest/software/install/#install-pre-release-crossplane-versions are failing
https://charts.crossplane.io/crossplane-1.17.0-rc.0.175.g93610dc7.tgz looks like an incorrect URL, right? I think it should be https://charts.crossplane.io/master/crossplane-1.17.0-rc.0.175.g93610dc7.tgz.
I'm guessing something is wrong with this target:
Maybe the --url
flag for helm repo index
should be ${HELM_REPO_URL}/${CHANNEL}
?
https://marketplace.upbound.io/account/crossplane/crossplane has what looks to be PR builds being published to that repo.
Yeah it's definitely not supposed to be publishing PR builds there, but it seems like it is - e.g. https://github.com/crossplane/crossplane/actions/runs/9299206909/job/25592661840?pr=5764. The "Configure Earthly to Push Artifacts" step should be skipped for PRs. Technically I gated it on the required credentials existing, since I thought no PRs had access to those. I guess some do, though. 🤔
https://github.com/crossplane/crossplane/pull/5765 should fix the Helm issue.
$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "crossplane-stable" chart repository
...Successfully got an update from the "crossplane-master" chart repository
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
$ helm install crossplane \
--namespace crossplane-system \
--create-namespace crossplane-master/crossplane \
--devel
NAME: crossplane
LAST DEPLOYED: Thu May 30 18:29:20 2024
NAMESPACE: crossplane-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Release: crossplane
Chart Name: crossplane
Chart Description: Crossplane is an open source Kubernetes add-on that enables platform teams to assemble infrastructure from multiple vendors, and expose higher level self-service APIs for application teams to consume.
Chart Version: 1.17.0-rc.0.195.g11333cf6
Chart Application Version: 1.17.0-rc.0.195.g11333cf6
Kube Version: v1.27.3
One small downside. Because we're building in a container, and because Earthly (and Docker) won't let you COPY
from a parent directory, you can't use local replace
statements very easily in go.mod
.
For example when developing outside of a container you might use:
replace sig.k8s.io/e2e-framework => ../../kubernetes-sigs/e2e-framework
In theory you could COPY
that directory into e.g. vendor/sig.k8s.io/e2e-framework
and replace
to that, but you can't because you can't copy from above the Earthfile
directory.
As mentioned in https://github.com/crossplane/crossplane/pull/5779#issuecomment-2151200847, I was surprised that a plain earthly +build
did not output the crossplane
/crank
binaries to the _output
directory.
The _output
dir is populated with these binaries if earthly +multiplatform-build
or earthly +go-build
is run, but not with a plain earthly +build
.
A common workflow I've used in the past when working on the Crossplane CLI was just plain make build
and then running the _output/<platform>/crank
binary directly to test the changes. It would be nice if earthly +build
also saved these binary artifacts to the _output
directory just like make build
used to.
As mentioned in #5779 (comment), I was surprised that a plain
earthly +build
did not output thecrossplane
/crank
binaries to the_output
directory.The
_output
dir is populated with these binaries ifearthly +multiplatform-build
orearthly +go-build
is run, but not with a plainearthly +build
.A common workflow I've used in the past when working on the Crossplane CLI was just plain
make build
and then running the_output/<platform>/crank
binary directly to test the changes. It would be nice ifearthly +build
also saved these binary artifacts to the_output
directory just likemake build
used to.
Here is the PR to address this https://github.com/crossplane/crossplane/pull/5782
I observed something today while testing #5786 that I don't understand yet - so I'm not sure it's a problem but was wondering what others thought.
Basically, the crossplane image digest changes across builds if a earthly prune
is run. I'm wondering if that also means that two builds for an identical source commit hash/tag could also result in different image digests.
See the Image digest
section of https://github.com/crossplane/crossplane/pull/5786#pullrequestreview-2124133819 for more details.
I also wonder if we're seeing https://github.com/earthly/earthly/issues/2823 and it is affecting us too:
❯ docker images --digests build-69fd693a/crossplane
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
build-69fd693a/crossplane master <none> 898d135985cb 14 minutes ago 77.4MB
build-69fd693a/crossplane master_linux_arm64 <none> 898d135985cb 14 minutes ago 77.4MB
build-69fd693a/crossplane v0.0.0-1718156806-69fd693a <none> 898d135985cb 14 minutes ago 77.4MB
build-69fd693a/crossplane v0.0.0-1718156806-69fd693a_linux_arm64 <none> 898d135985cb 14 minutes ago 77.4MB
I also wonder if we're seeing earthly/earthly#2823 and it is affecting us too
2823 seems to be fixed by 2853 (probably not closed yet by accident). I believe the issue that we are seeing with the missing digests is https://github.com/earthly/earthly/issues/2851 which is still open
@jbw976 I tried the release-1.15
branch today and actually it has the same behavior: no digest is shown, plus the image ID changes after every build (after cleaning up the cache dirs):
➜ make build
...
11:31:43 [ OK ] docker build build-d52750be/crossplane-amd64
➜ docker images --digests build-d52750be/crossplane-amd64
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
build-d52750be/crossplane-amd64 latest <none> 946742876039 34 seconds ago 54.4MB
➜ rm -rf .cache .work _output
➜ docker rmi 946742876039
➜ docker system prune
➜ make build
...
11:35:10 [ OK ] docker build build-d52750be/crossplane-amd6
➜ docker images --digests build-d52750be/crossplane-amd64
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
build-d52750be/crossplane-amd64 latest <none> e7eb5731e4e0 3 minutes ago 54.4MB
I understand that this is not ideal, but it doesn't seem like a regression of earthly
What happened?
In https://github.com/crossplane/crossplane/pull/5711 we decided to try using https://earthly.dev for a release (i.e. until we ship v1.17).
I expect we'll probably run into issues immediately after cutting over the build system to a new tool, so I'm opening this to track any bugs we find in the new build setup that need to be fixed. General non-bug feedback is also welcome. 😄