cc @kubernetes/sig-architecture-leads @kubernetes/sig-release-leads
@dims can we start by having brownouts of the old registry (they should start immediately)?
Let's aim to very clearly communicate a recommended approach (eg: mirror the images that you depend on, or use a pull through cache, or...) and consider the lead time on those comms when we pick a date.
The comms plan does not have to be perfect, it just has to be good enough.
@sftim agree. Recommended approach so far: use registry.k8s.io instead of k8s.gcr.io.
@dims can we start by having brownouts of the old registry (they should start immediately)?
@enj yep, agree. The brownout we had in mind was as Arnaud mentioned here: https://kubernetes.slack.com/archives/CCK68P2Q2/p1677793564552829?thread_ts=1677709804.935919&cid=CCK68P2Q2
@dims I suppose deleting images is one form of brownout... I was more thinking that we have the old registry return 429 errors every day at noon for a few hours. The transient service disruption will get people's attention.
@enj k8s.gcr.io is GCR based and has only a few folks left to take care of it. Last year some helpful folks tried to setup redirects (automatic from k8s.gcr.io to registry.k8s.io) in a small portion and ran into snags, so we can't do much over there other than delete images.
Details are in this thread: https://kubernetes.slack.com/archives/CCK68P2Q2/p1666725317568709
@dims makes sense. One suggestion that may also not be implementable would be to temporarily delete and then recreate image tags to cause pull failures (another form of brownout).
Year to date GCP Billing data, please see here: GCP_Billing_Report-year-to-date.pdf
($682,683.81 year-to-date / 62 days from Jan 1 to March 4) * 365 = $4,019,025.65 (our budget/credits is $3m )
One option we have is to actually delete some images - and then optionally reinstate them per https://github.com/kubernetes/k8s.io/issues/4872#issuecomment-1454951290. A 429 is subject to Google's say-so, but deleting an image is something we can Just Do™. So long as the comms are in place to explain why.
@sftim yes, we will have a list of limited set of images that we will delete ASAP! (and will NOT reinstate them). @hh and folks are coming up with the high traffic / costly image list as the first step. Our comms will depend on what's in that list.
An energetic discussion with @thockin here https://kubernetes.slack.com/archives/CCK68P2Q2/p1678118252030639
I think we can do broad brownouts ahead of any final sunset by toggling the access controls on the 3 backing GCR instances. To make the images public-read, we set the backing GCS bucket to have read permission for allUsers; we could probably invert that and then put it back, on a schedule, gradually increasing the period of total non-availability.
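As a rough sketch of what that toggle could look like (the bucket name below is a placeholder, and the real IAM setup on the production buckets may differ):

```bash
# Hypothetical bucket name; the actual backing buckets are different.
BUCKET=gs://example-k8s-artifacts-prod

# "Brownout": drop anonymous read access for a window...
gsutil iam ch -d allUsers:objectViewer "$BUCKET"

# ...then restore it afterwards.
gsutil iam ch allUsers:objectViewer "$BUCKET"
```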
Doing this is a big deal, and I'm not sure what the time frame should be. We know that users are very slow to migrate, and that doing this will disrupt their base "cloud-native" infrastructure. (E.g. I saw some recent data showing that Kubernetes 1.11, from 2018, is still reasonably popular!)
Some data from @justinsb:
Some good discussion with @TheFoxAtWork here: https://cloud-native.slack.com/archives/CSCPTLTPE/p1678219030800149 on #tag-chairs channel on CNCF slack
This will likely break a lot of clusters and organizations, but it is certainly a good wake-up call to the world that even open source has its costs. I know this is drastic, but we've broken the internet before, and this one at least is better coordinated, with plenty of advance warning. We can't go to everyone personally, so we do our best with the time and energy we have available to us as open source volunteers and community members. Side note: eliminating older versions and forcing upgrades is a huge global security uplift.
I would also recommend (though this is likely already done) to work with the Ambassadors, Marketing Team, and other Foundations.
@dims I want to confirm what I'm looking at from the chart (I understand there is a new one in the works): can you confirm that each colored bar shows who/what is primarily requesting the images? If that is the case, has AWS/Amazon been engaged to redirect the requests they field to registry.k8s.io? Have we done this with other cloud providers? (I know I'm late to the party, trying to understand what has already been completed.)
@dims @rothgar and I are engaging folks on the AWS side.
@TheFoxAtWork yep, there has been a bunch of back and forth.
Has anyone pinged Microsoft? I don't know where Azure stands at the moment.
A single line kubectl command to find images from the old registry:
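One possible form of that one-liner (this checks the container images of running pods for the old registry; initContainers and ephemeralContainers would need extra jsonpath terms):

```bash
kubectl get pods --all-namespaces \
  -o jsonpath="{.items[*].spec.containers[*].image}" \
  | tr -s '[[:space:]]' '\n' | sort | uniq -c | grep k8s.gcr.io
```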
A Kyverno and Gatekeeper policy to help folks!
A kubectl/krew plugin:
FAQ(s) we are getting asked:
I attempted to pull a lot of the details from this ticket into a single LinkedIn post for sharing, in case it helps: https://www.linkedin.com/posts/themoxiefox_action-required-update-references-from-activity-7039245748525256704-IrES
Some good news from @BenTheElder here - https://kubernetes.slack.com/archives/CCK68P2Q2/p1678299674725429
AWS just posted a bulletin on its Stack Overflow collective: https://stackoverflow.com/collectives/aws/bulletins/75676424/important-kubernetes-registry-changes
I chatted with @jeremyrickard at Microsoft. They are all over this.
Question: when the new k8s.gcr.io->registry.k8s.io redirection takes effect, what is likely to fail?
Touching on the topic of network level firewalls or other things causing impact:
This is fairly easily tested - run a pod which uses a "registry.k8s.io" image in your cluster(s). If it is able to pull that image, you're almost certainly OK. If not, debug now before the redirect goes live (next week, we hope).
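For example, a quick way to run that test (pause:3.9 is just an example tag; any image you already consume from registry.k8s.io works the same way):

```bash
# Force a real pull so a cached sandbox image doesn't mask a network problem.
kubectl run registry-k8s-io-check --restart=Never \
  --image=registry.k8s.io/pause:3.9 --image-pull-policy=Always
kubectl get pod registry-k8s-io-check   # Running = OK; ImagePullBackOff = fix your egress/firewall now
kubectl delete pod registry-k8s-io-check
```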
How will the redirect work? Just on DNS level? I have tried this locally myself, but containerd/Docker, obviously and for the right reasons, complains about certificate mismatch between k8s.gcr.io and registry.k8s.io. I solved it then by downloading the ca.crt and installing it locally for containerd/Docker.
Do we have enough bandwidth on registry.k8s.io ?
How will the redirect work? Just on DNS level? I have tried this locally myself, but containerd/Docker, obviously and for the right reasons, complains about certificate mismatch between k8s.gcr.io and registry.k8s.io. I solved it then by downloading the ca.crt and installing it locally for containerd/Docker.
HTTP 3XX redirect, not DNS. No cert changes.
You can test by taking any image you would pull and substituting registry.k8s.io instead of k8s.gcr.io. All images in k8s.gcr.io are in registry.k8s.io.
The only difference between doing this test and the redirect will be your client reaching k8s.gcr.io first and then following the redirect, but presumably k8s.gcr.io was already reachable for you if you're switching, and all production-grade registry clients follow HTTP redirects.
The same existing GCR endpoint will serve the redirect instead of the usual response. Existing GCR image pulls already involve redirects to backing storage, just not redirects to registry.k8s.io
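For example (pause:3.9 is just an example tag; any image you currently pull from k8s.gcr.io works the same way):

```bash
# Same image name, only the registry host is swapped.
docker pull k8s.gcr.io/pause:3.9        # current name, will be redirected after the cutover
docker pull registry.k8s.io/pause:3.9   # new name, already available today
```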
Do we have enough bandwidth on registry.k8s.io ?
We should have more than enough capacity on https://registry.k8s.io, we've looked at traffic levels for k8s.gcr.io and planned accordingly. We aren't hitting bandwidth limits on GCR either, just impractical cost of serving ever-increasing cross-cloud bandwidth.
registry.k8s.io gives us the ability to offload bandwidth-intensive image layer serving to additional hosts securely. We're doing that on GCP (Artifact Registry, Cloud Run) and now AWS (S3) thanks to additional funding from Amazon and we will be serving substantially less expensive egress traffic. In the future it might include additional hosts / sponsors (https://registry.k8s.io#stability).
Just serving AWS traffic (which is the majority) from region-local AWS storage should bring us back within our budgets.
We have a lot more context in the docs (https://registry.k8s.io) and this talk https://www.youtube.com/watch?v=9CdzisDQkjE
@BenTheElder 👍
Experiment results for redirect k8s.gcr.io->registry.k8s.io last october: https://kubernetes.slack.com/archives/CCK68P2Q2/p1666725317568709
this text may get dropped from the blog post being drafted for automatic redirects, so saving it here:
Technical Details
The new registry.k8s.io is a secure blob redirector that allows the Kubernetes project to direct traffic, based on request IP, to the best possible blob storage for the user. If a user makes a request from an AWS region network and pulls a Kubernetes container image, for example, that user will be automatically redirected to pull the image from the closest S3 bucket image layer store. For the current decision tree, refer to the architecture decision tree [2]. To be clear, the new registry.k8s.io implementation allows the upstream project to host registries on more clouds in the future, not just GCP and AWS, which will increase stability, reduce cost, and speed up both downloads and deployments. Please do not rely on the internal implementation details of the new image registry, as these can be changed without notice.
Please note the upstream Kubernetes teams are working to provide additional communication, and the situation around how long the old registry remains is still being discussed.
[1]: https://kubernetes.io/blog/2023/02/06/k8s-gcr-io-freeze-announcement/
[2]: https://github.com/kubernetes/registry.k8s.io/blob/main/cmd/archeio/docs/request-handling.md
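Not part of the blog draft, but one rough way to see the redirector behaviour from the command line (the tag is an example, and the Location value you get back is an internal detail that can change without notice):

```bash
# registry.k8s.io answers /v2/ requests with HTTP redirects to its backing
# stores; for image layers (blobs), clients coming from AWS region IPs may be
# pointed at region-local S3 mirrors instead.
curl -sI "https://registry.k8s.io/v2/pause/manifests/3.9" | grep -iE "^(HTTP|location)"
```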
The first step for minikube will be to start adding --image-repository=registry.k8s.io to the old kubeadm commands. Probably add it for all kubeadm versions before 1.25.0; it shouldn't hurt anything if it is already the default registry...
The second step is to retag all the older preloads with the new registry, so that they keep working air-gapped (it is a rather small download, though).
Some mirrors might still use a "k8s.gcr.io" subdirectory, which is fine, so this change is only for the default registry.
The main issue is that the people who are pulling those older Kubernetes releases also tend to use older versions of minikube.
Or, if we invalidate the old caches, people would pull "new" versions of the same images - but under a different name:
~/.minikube/cache/images/amd64 : k8s.gcr.io/pause_3.6 -> registry.k8s.io/pause_3.6
That would be somewhat counter-productive, so instead the idea is to "upgrade" those old caches in place (by re-tagging the images).
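A sketch of what both steps could look like in practice (the flag exists in minikube today; pause:3.6 is just an example cached image):

```bash
# Step 1: override the registry at start time for older kubeadm defaults.
minikube start --image-repository=registry.k8s.io

# Step 2 (the "upgrade the cache in place" idea): re-tag an already-cached
# image instead of re-downloading the same bits under a new name.
docker tag k8s.gcr.io/pause:3.6 registry.k8s.io/pause:3.6
```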
kubeadm had the default changed in patch releases back to 1.23 (older releases were not accepting any patches), when we published https://kubernetes.io/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/
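For anyone stuck on an unpatched kubeadm release, the default can also be overridden explicitly, for example:

```bash
# The flag predates the default change; the same setting is available as
# imageRepository in the kubeadm ClusterConfiguration.
kubeadm init --image-repository=registry.k8s.io

# Pre-pull with the override (handy for air-gapped or slow-network prep).
kubeadm config images pull --image-repository=registry.k8s.io
```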
So on March 20, we'll be turning on redirects for almost everyone from k8s.gcr.io to registry.k8s.io, details here: https://kubernetes.io/blog/2023/03/10/image-registry-redirect/
So the next question will be: how many folks are still using the underlying content of k8s.gcr.io in other ways:
So we'll then have to watch how much savings we get over time. Assuming about a week of rollout starting March 20, we'll get some concrete data a week or so after that (let's say Monday, April 3rd, given that we have a saw-tooth pattern of usage over the week, with lows on Saturday and Sunday).
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
We did this.
/close
@sftim: Reopened this issue.
@sftim: Closing this issue.
(but feel free to reopen if needed)
Here are the community blogs and announcements so far around k8s.gcr.io
However, we are finding out that the numbers don't add up, and we will use up our entire budget of GCP cloud credits well before Dec 31, 2023. So we need to do something more drastic than just the freeze. Please see the thread in #sig-k8s-infra: https://kubernetes.slack.com/archives/CCK68P2Q2/p1677793138667629?thread_ts=1677709804.935919&cid=CCK68P2Q2
We will need to start by enumerating some of the images that carry the biggest cost (storage + network) and removing them from k8s.gcr.io right away (possibly by the freeze date, April 3rd). Some data is in the thread, but we will need to revisit the logs, come up with a clear set of images based on some criteria, and announce their deletion as well. Note that this specific set of images will still be available in the new registry, registry.k8s.io, so folks will have to fix their Kubernetes manifests / Helm charts etc., as we mentioned in the 3 URLs above.
Thought about a deadline for deletion of k8s.gcr.io: the freeze is on April 3rd 2023 (10 days before 1.27 is released) and we expect to send comms out at KubeCon EU (18 - 21 April). How about we put the marker at the end of June? (So we get 6 months of cost savings.)
Risk: We will end up interrupting clusters that are working right now. Specifically, given the traffic patterns, a bunch of these will be in AWS, but it is very likely to include anyone who has an older working cluster that they haven't touched in a while.
What I have enumerated above is just the beginning of the discussion. Please feel free to add your thoughts below, so we can then draft a KEP around it.