byako opened this issue 1 year ago
Thank you for the explanation! The scalability question does sound tricky: as a cluster administrator I can understand Wojciech's point that it's hard to stay abreast of new features and evaluate in advance whether they'll cause problems. I can also imagine a worst case where some highly-replicated pods are updated to new containers, but the registry is overloaded and, say, trickles 1 byte/second to every node, which could generate on the order of (# nodes * # pods) events every minute.
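For a rough sense of scale, here is a back-of-envelope sketch of that worst case, with made-up cluster sizes (the node and pod counts are purely illustrative):

```python
# Worst-case event volume if every node reports pull progress for every
# stalled pod once per minute. Cluster sizes below are hypothetical.
def events_per_minute(nodes: int, pods_per_node: int) -> int:
    return nodes * pods_per_node

print(events_per_minute(5000, 30))  # large cluster: 150000 events/min
print(events_per_minute(50, 5))     # small cluster: 250 events/min
```

Even modest per-node numbers multiply out quickly on large clusters, which is why the apiserver-load question seems worth settling up front.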
Did you look into whether there is any backpressure mechanism that could protect the apiserver in a case like this? Our clusters are small, so I'm not familiar with these techniques, but API Priority and Fairness seems like it could enforce a cluster-level limit on the rate of progress events, which might address the question about a protection mechanism. However, I don't know where this traffic would belong: the default priority levels seem focused on requests that are important for normal cluster operation, rather than informative log events like this, which are primarily of interest to humans observing the cluster. Maybe we need a new node-low priority level?
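To make the idea of a cluster-level rate limit concrete, here is a minimal token-bucket sketch, the general mechanism behind this kind of request throttling. This is a generic illustration, not the actual API Priority and Fairness implementation, and the rate and burst numbers are made up:

```python
import time

# Generic token-bucket rate limiter: requests consume tokens, tokens
# refill at a fixed rate, and requests beyond the budget are held back.
class TokenBucket:
    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate          # tokens refilled per second
        self.capacity = burst     # maximum stored tokens
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over budget: the event would be queued or dropped

bucket = TokenBucket(rate=10, burst=20)
allowed = sum(bucket.allow() for _ in range(100))
print(allowed)  # roughly the burst size: most of the 100 immediate calls are rejected
```

Whatever mechanism is chosen, the key property is the same: a flood of progress events from many nodes gets smoothed to a bounded rate instead of reaching the apiserver all at once.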
Please ignore this question if it's an unhelpful tangent!
I have not checked any of the available protection measures yet, but I'll have a look at the Priority and Fairness docs, thank you. If I'm not mistaken, the current suggestion is to only publish events when something has subscribed to them. The details are still to be defined.
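The "only publish when subscribed" idea can be sketched roughly like this. The names (`ProgressPublisher`, `subscribe`, `publish`) are illustrative only, not from any actual CRI or kubelet API:

```python
from typing import Callable, List

# Minimal sketch of subscription-gated publishing: progress events are
# only delivered when at least one consumer has registered interest.
class ProgressPublisher:
    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: str) -> None:
        # With no subscribers, the loop body never runs and the event
        # is silently dropped instead of loading the apiserver.
        for cb in self._subscribers:
            cb(event)

pub = ProgressPublisher()
pub.publish("pull 10%")             # dropped: nobody is listening
pub.subscribe(lambda e: print(e))
pub.publish("pull 50%")             # delivered to the subscriber
```

The appeal of this design for the scalability concern above is that the steady-state cost is zero: event volume only grows when someone is actively watching a pull.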
/remove-label lead-opted-in
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
+1
/remove-lifecycle rotten
I for one really feel blind not knowing the progress of an image pull. It would really be useful to have some insight into the pull speed (so I can also troubleshoot network-related problems), some form of percentage progress, and maybe an estimate of completion.
Enhancement Description
- (k/enhancements) update PR(s): KEP-3542: CRI image pulling with progress notification
- (k/k) update PR(s): https://github.com/kubernetes/kubernetes/pull/118326
- (k/website) update PR(s):

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.