Closed ameukam closed 3 months ago
/assign @michelle192837 /sig testing
ci-test-infra-update-slack-oncall
no point migrating this, we'll just shut it down when prow is migrated and instead people can post in #testing-ops in slack.
we should actually probably proactively stop advertising @test-infra-oncall to the broader project.
post-test-infra-upload-testgrid-config
.... uhhhh this one I'm not sure, because we have to be able to publish to testgrid's config bucket .... migrating testgrid is another fun topic
The image publishing jobs we should be able to move over.
re: ci-test-infra-update-slack-oncall: Ah, that's easier then.
re: post-test-infra-upload-testgrid-config: I think this should be doable. I have not gone through the full details, but imo thanks to config merger merging configs for TestGrid from multiple locations, we can stand up a new config upload job in community-owned infra, verify the uploaded config in the new location is the same as the old, and swap the config location used in the TestGrid instance overall.
On the K8s infra side we're going to need a bucket for this to start then, cc @upodroid @ameukam for thoughts.
post-test-infra-push-git post-test-infra-push-git-custom-k8s-auth
Not sure how these didn't wind up getting migrated yet ... looks like this is part of k8s-testimages https://github.com/kubernetes/k8s.io/issues/1523
I don't see evidence that we're actually using these images in Kubernetes and we should probably just delete them.
Prow has built-in known-hosts handling in clonerefs these days, I don't think we need these anymore.
Sorry for the delay, I'm looking into this and some of the other unmigrated jobs today.
In #32808 the list should be clearer now. A lot of these are related to running prow, so that's fine, but some are pushing images and that's concerning; we should either eliminate or migrate them.
File Path | Job | Link |
---|---|---|
config/jobs/kubernetes/test-infra/test-infra-periodics.yaml | job-migration-todo-report | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-autobump-prow-for-auto-deploy | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-autobump-prow | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-update-slack-oncall | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-branchprotector | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-label-sync | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-gencred-refresh-kubeconfig | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-rotate-legacy-default-build-sa-json-key | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-alpine | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-gcloud-terraform | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-git | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-git-custom-k8s-auth | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-deploy-prow | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-reconcile-hmacs | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-misc-images | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-kettle | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-bazel | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-gcb-docker-gcloud | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-test-gubernator | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-gencred | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-gencred-refresh-kubeconfig | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-upload-oncall | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-upload-testgrid-config | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-upload-boskos-config | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-cip-prow | Search Results |
SIG Contribex:
File Path | Job | Link |
---|---|---|
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-community-tempelis-apply | Search Results |
Not trusted cluster, but the other non-migrated jobs with test-infra in the name (there could be more) ...
File Path | Job | Link |
---|---|---|
config/jobs/kubernetes/test-infra/janitors.yaml | maintenance-pull-janitor | Search Results |
config/jobs/kubernetes/test-infra/janitors.yaml | maintenance-ci-aws-janitor | Search Results |
config/jobs/kubernetes/test-infra/janitors.yaml | maintenance-ci-janitor | Search Results |
Janitor jobs: won't be migrated, will be turned down.
post-test-infra-upload-oncall, ci-test-infra-update-slack-oncall: no need, this will be obsolete.
job-migration-todo-report: will be obsolete. Also, this isn't working correctly and we're just manually checking in the tool output; I'll clean this one up.
ci-test-infra-rotate-legacy-default-build-sa-json-key: will be obsolete.
post-test-infra-upload-boskos-config: will be obsolete, we have a different boskos config in github.com/kubernetes/k8s.io for community boskos resources.
post-test-infra-cip-prow: I deleted this in #32812.
post-test-infra-push.* are concerning.
post-test-infra-upload-testgrid-config will need migrating.
I'm guessing reconcile hmacs needs to be considered as part of control plane migration, along with definitely branchprotector.
https://github.com/kubernetes/test-infra/pull/32814 will remove the job-migration-todo-report job.
ci-test-infra-label-sync should be able to migrate to k8s-infra-prow-build-trusted without waiting for the rest of prow, but we might not have the right secrets available yet.
On the K8s infra side we're going to need a bucket for this to start then, cc @upodroid @ameukam for thoughts.
post-test-infra-push-git post-test-infra-push-git-custom-k8s-auth
Not sure how these didn't wind up getting migrated yet ... looks like this is part of k8s-testimages kubernetes/k8s.io#1523
I don't see evidence that we're actually using these images in Kubernetes and we should probably just delete them.
Prow has built-in known-hosts handling in clonerefs these days, I don't think we need these anymore.
These are used as the base images for building Prow images (https://cs.k8s.io/?q=gcr.io%2Fk8s-prow%2Fgit&i=nope&files=&excludeFiles=&repos=). I think we can replace the git image with alpine, but git-custom-k8s-auth might need to stay?
Job | Link | Uses |
---|---|---|
post-test-infra-push-alpine | Search Results | Search Results |
post-test-infra-push-gcloud-terraform | Search Results | Search Results |
post-test-infra-push-git | Search Results | Search Results |
post-test-infra-push-git-custom-k8s-auth | Search Results | Search Results |
post-test-infra-push-misc-images | Search Results | Search Results |
post-test-infra-push-kettle | Search Results | Search Results |
post-test-infra-push-bazel | Search Results | Search Results |
post-test-infra-push-gcb-docker-gcloud | Search Results | Search Results |
post-test-infra-push-test-gubernator | Search Results | Search Results |
post-test-infra-push-gencred | Search Results | Search Results |
Several of these push images that aren't used and should be turned down (post-test-infra-push-test-gubernator, post-test-infra-push-bazel, post-test-infra-push-gcloud-terraform, post-test-infra-push-gencred). (post-test-infra-push-gencred hasn't succeeded, and pushed to k8s-testimages, which is not what jobs are using; jobs use the image pushed to k8s-prow by post-test-infra-push-misc-images.)
Discussed offline: for post-test-infra-push-git and post-test-infra-push-git-custom-k8s-auth, since we'll need to migrate the latter anyways, we can migrate the former at the same time, then see if we can replace the git image base with alpine instead.
then see if we can replace the git image base with alpine instead.
we should probably use something else, we generally prefer to use e.g. debian/distroless for kubernetes base images, for licensing reasons (alpine/busybox) and alignment on patching etc.
Sorry for the late response. I can confirm that git-custom-k8s-auth is used by prow to authenticate to non-GKE clusters (currently it's only EKS).
https://github.com/kubernetes-sigs/prow/blob/main/.ko.yaml
+1 for building a unified base image for prow that has git and the kubectl auth plugins for our cloud vendors.
We can migrate that job to the community cluster and update the .ko.yaml references
We can do something similar to the distroless-iptables image in k/release.
tempelis will be done after #32946
https://github.com/kubernetes/test-infra/pull/32948 will do label sync
File Path | Job | Link | Uses |
---|---|---|---|
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-alpine | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-gcb-docker-gcloud | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-git | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-git-custom-k8s-auth | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-kettle | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-misc-images | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-upload-testgrid-config | Search Results |
With the linked PRs, we should have a canary job for all these jobs. Once these are submitted and we have new images for all of them, I'll switch the relevant uses to use the k8s-staging-test-infra images instead, and turn down the old image pushing jobs.
(The TestGrid config switch is a bit more involved but not much more. I just need to swap what config is referenced in the mergelists after verifying the new is the same as the old, and that config merger has permissions to read from the new bucket. I'll look into that now.)
After today's SIG meeting I eliminated the oncall update jobs (slack, GCS) #33083 #33084
We should probably pre-emptively migrate ci-test-infra-branchprotector to the new trusted cluster.
migrating branch protector looks straightforward, will send a PR in a little bit.
https://github.com/kubernetes/test-infra/pull/33098 takes care of the branch protector.
That leaves:
So when we move prow we'll also have a small list of jobs to disable and we should probably prepare that.
These are the main remaining jobs aside from the following out of scope here:
So we should definitely focus on these while Azure folks work on migrating those.
I've also noticed that we'll have to be careful updating the prow deployment specs for the new cluster, because e.g. we gave the secrets clearer names and a different path for the github token.
IMHO, we can remove post-test-infra-upload-boskos-config. We no longer need to increase the boskos pool, and we potentially need to shut down the GCP projects that are part of it.
Fixing TestGrid upload job today and cleaning up some of the image jobs/references.
IMHO, we can remove post-test-infra-upload-boskos-config. we no longer need to increase the boskos pool and potentially need to shutdown the GCP projects part of it.
agreed, filed https://github.com/kubernetes/test-infra/pull/33121
TestGrid upload progress:
# See https://github.com/GoogleCloudPlatform/testgrid/tree/main/config/print#config-printer for the print utility.
~/go/bin/print gs://k8s-testgrid/configs/k8s/config > k8s-testgrid-config.textproto
~/go/bin/print gs://k8s-testgrid-config/k8s/config > k8s-infra-testgrid-config.textproto
diff k8s-testgrid-config.textproto k8s-infra-testgrid-config.textproto
(And these do have contents):
wc -l k8s-testgrid-config.textproto
519759 k8s-testgrid-config.textproto
wc -l k8s-infra-testgrid-config.textproto
519759 k8s-infra-testgrid-config.textproto
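The manual check above (diff the two dumps, then confirm they are not empty) could be folded into one step. This is only a minimal sketch, assuming both .textproto files have already been produced with the print utility as shown; the function name configs_match is hypothetical and not part of any TestGrid tooling.

```shell
#!/bin/sh
# configs_match OLD NEW  (hypothetical helper, not part of TestGrid tooling)
# Returns 0 only if both files are non-empty and byte-for-byte identical.
# Returns 2 if either file is missing or empty, 1 if the contents differ.
configs_match() {
  old="$1"
  new="$2"
  for f in "$old" "$new"; do
    # -s is true only for an existing file with size greater than zero.
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 2
    fi
  done
  # cmp -s prints nothing; it exits 0 for identical files, 1 for a difference.
  cmp -s "$old" "$new"
}
```

For example: `configs_match k8s-testgrid-config.textproto k8s-infra-testgrid-config.textproto && echo "configs match"`. Checking for empty files first guards against two failed dumps comparing as equal.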
Now following the config merger instructions at https://github.com/kubernetes/test-infra/blob/master/testgrid/merging.md#config-merger. I'll have a few PRs out for those.
Remaining from my list above:
File Path | Job | Link | Uses |
---|---|---|---|
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-alpine | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-git | Search Results | Search Results |
config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-push-misc-images | Search Results | Search Results |
post-test-infra-push-alpine just needs minor cleanup, then it can be deleted.
post-test-infra-push-git can probably be deleted; the remaining use of it is as the base for certain Prow images. I can't switch them over immediately (integration tests fail when switching from the January image to a recent July image), but I believe switching to an image from the old location will have the same problem.
post-test-infra-push-misc-images needs a fix (I think the most recent PR will fix it, but it needs a retrigger to verify that's the case), then the images need to be switched to the new location before the old job is turned down.
(And last bit of cleanup, move all the new image push jobs to the image-pushes dashboard and remove '-canary' from the job name).
post-test-infra-push-misc-images technically passes, but it doesn't seem to be uploading new images? (I think the same is happening for the new prow images push, which does something similar.)
post-test-infra-push-alpine and post-test-infra-push-git I think we can delete for the reasoning above. The minor cleanup isn't blocking removing the old jobs.
lol I lied, the misc-image canary is working fine. I'll switch those uses over today.
I'm still not seeing new Prow images uploaded to the new location though. (https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/post-k8s-infra-prow-images/1818232059856949248, https://storage.googleapis.com/kubernetes-jenkins/logs/post-k8s-infra-prow-images/1818232059856949248/artifacts/build.log for the build log). Since it's doing something similar to the misc-images push job, I might update it to be similar and see if that fixes it.
Sorry about the confusion, the Prow images job has been working the whole time and I was just confused. (More detail in https://github.com/kubernetes-sigs/prow/pull/217#issuecomment-2266208571).
Anyways, remaining updates are:
I'll leave submission of those to Monday, but those should handle the last test-infra jobs that I think we're actually handling?
secrets | path | job |
---|---|---|
[] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-gencred-refresh-kubeconfig |
[] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-rotate-legacy-default-build-sa-json-key |
[] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-deploy-prow |
[] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-gencred-refresh-kubeconfig |
[kubeconfig-prow-services oauth-token] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | post-test-infra-reconcile-hmacs |
[oauth-token k8s-ci-robot-ssh-keys] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-autobump-prow |
[oauth-token k8s-ci-robot-ssh-keys] | config/jobs/kubernetes/test-infra/test-infra-trusted.yaml | ci-test-infra-autobump-prow-for-auto-deploy |
Of those, I think we might need reconcile-hmacs to move along with the new prow deployment?
Otherwise I think the rest should probably be spun down just ahead of migrating prow, and remain in the meantime to keep the legacy instance humming.
https://github.com/kubernetes/test-infra/issues/33129 covers the janitor jobs.
We only have these six left now:
ci-test-infra-gencred-refresh-kubeconfig
post-test-infra-deploy-prow
post-test-infra-gencred-refresh-kubeconfig
ci-test-infra-gencred-refresh-kubeconfig
post-test-infra-reconcile-hmacs
ci-test-infra-autobump-prow
ci-test-infra-autobump-prow-for-auto-deploy
post-test-infra-reconcile-hmacs
- Decision: keep until we're ready to migrate prow control plane, job will not migrate. (cc @cjwagner to confirm)
Yes, that does not need to migrate, assuming that the K8s-Infra Prow is using a GitHub App to manage webhooks (rather than manually configuring them per org or repo). IIRC someone confirmed this in the last SIG-Testing meeting.
The other decisions SGTM as well.
Now done thanks to Ben: https://github.com/kubernetes/test-infra/pull/33352
There are a few jobs running on test-infra-trusted that we should either migrate to k8s-infra-prow-build-trusted or remove: