/sig instrumentation /sig node /sig scheduling
Hey there @smarterclayton -- 1.19 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating in 1.19?
In order to have this part of the release, the KEP must be merged in an implementable state. The current release schedule is:
If you do, I'll add it to the 1.19 tracking sheet (http://bit.ly/k8s-1-19-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍
Thanks!
I don't think we'll make implementable and merged by Tuesday, so should be targeted for 1.20
Hey @smarterclayton, thanks for confirming the inclusion state. I've marked the Enhancement as Deferred in the Tracker and updated the milestone accordingly.
/milestone v1.20
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale
Hi @smarterclayton !
Enhancements Lead here, do you still intend to target this for alpha in 1.20?
Thanks! Kirsten
Yes, this is targeting alpha for 1.20, assuming we can close the remaining questions in the KEP.
Thanks Clayton!!
As a reminder, by Enhancements Freeze (October 6th), KEPs must be:
Best, Kirsten
I also added the PR link to the issue description; we can update it again once merged.
Hi @smarterclayton :wave:!
I'm one of the Enhancement shadows for the 1.20 release cycle. This is a friendly reminder that the Enhancement Freeze is on the 6th of October; I'm repeating the requirements needed by then:
The KEP must be merged in an implementable state.
The KEP is still provisional at the moment, and I see that there's active work ongoing. Thanks!
Thanks for the reminder, updated those. Will be working with the sig.
The current PR looks complete from an enhancements freeze POV; we'll monitor to see if it merges in time.
Hi @smarterclayton
Enhancements Freeze is now in effect. Unfortunately, your KEP PR has not yet merged. If you wish to be included in the 1.20 Release, please submit an Exception Request as soon as possible.
Best, Kirsten 1.20 Enhancements Lead
An exception request was granted, so please ensure that all required changes are made and your PR is merged by the deadline listed in your request.
Thanks! Kirsten
cc: @mikejoh
Hello @smarterclayton, 1.20 Docs shadow here 👋🏽. Does the enhancement work planned for 1.20 require any new docs or modifications to existing docs?
If so, please follow the steps here to open a PR against the dev-1.20 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Nov 6th.
Also take a look at Documenting for a release to familiarize yourself with the docs requirements for the release. Thank you!
Hi @smarterclayton
The docs placeholder deadline is almost here. Please make sure to create a placeholder PR against the dev-1.20 branch in the k/website repo before the deadline.
Also, please keep in mind the important upcoming dates:
Thank you.
Docs PR is created
Hey @smarterclayton
Does this KEP require both https://github.com/kubernetes/kubernetes/pull/94866 and https://github.com/kubernetes/kubernetes/pull/95839 or just the first?
Thanks! Kirsten
/remove-sig node
/milestone v1.21
@smarterclayton can you please update your KEP to target beta for 1.21? There's a small metadata update and a PRR review that need to happen, I think.
https://github.com/kubernetes/enhancements/pull/2417 is open, thanks for the reminder.
Hey @smarterclayton , enhancements 1.21 shadow here,
Enhancements Freeze is 2 days away, Feb 9th EOD PST
The enhancements team is aware that the KEP update is currently in progress (PR #2417). Please make sure to work on the PRR questionnaire and requirements and get it merged before the freeze. For PRR-related questions or to boost the PR for PRR review, please reach out in Slack on the #prod-readiness channel.
Any enhancements that do not complete the following requirements by the freeze will require an exception.
With PR https://github.com/kubernetes/enhancements/pull/2417 merged, this enhancement meets all the criteria for the Enhancements freeze 👍
Hi @smarterclayton,
Since your Enhancement is scheduled to be in 1.21, please keep in mind the important upcoming dates:
As a reminder, please link all of your k/k PR(s) and k/website PR(s) to this issue so we can track them.
Thanks!
Hi @smarterclayton,
The Enhancements team is marking this enhancement as "At Risk" for the upcoming code freeze, since we do not see any linked k/k PR(s) for this enhancement.
Please make sure to link all k/k PR(s) and k/website PR(s) to this issue so they can be tracked by the release team.
@JamesLaverack we've determined there are no code changes required for graduation to beta. We are updating documentation.
@ehashman Thank you for the clarification. I've now marked this enhancement as "Tracked" and done for 1.21.
/milestone clear
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
After 90d of inactivity, lifecycle/stale is applied.
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied.
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed.
You can:
Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
Canvassing the community to get feedback before GA promotion.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
After 90d of inactivity, lifecycle/stale is applied.
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied.
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed.
You can:
Mark this issue as fresh with /remove-lifecycle stale
Mark this issue as rotten with /lifecycle rotten
Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
After 90d of inactivity, lifecycle/stale is applied.
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied.
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed.
You can:
Mark this issue as fresh with /remove-lifecycle stale
Mark this issue as rotten with /lifecycle rotten
Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
@smarterclayton I just realized this metric has a pod label, which IMO increases the cardinality a lot and puts pressure on the scraper side. Did you hear any concerns/feedback from users? Per the KEP, all the goals can be satisfied by removing the pod dimension, since the metric's primary goal is to give a high-level overview of aggregated pods' requests/limits. A pod-level metric doesn't seem that common. WDYT?
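To illustrate the point, here is a rough sketch (the /metrics/resources endpoint and the kube_pod_resource_request metric name come from the KEP; the scheduler address and the lack of authentication are placeholders) showing that the aggregated view can be produced without keeping a per-pod series:

```python
# Rough sketch, not part of the KEP: scrape the scheduler's /metrics/resources
# endpoint and sum kube_pod_resource_request by (node, resource), dropping the
# per-pod dimension. The URL is a placeholder and auth/TLS handling is omitted.
from collections import defaultdict

import requests
from prometheus_client.parser import text_string_to_metric_families

SCHEDULER_METRICS_URL = "https://localhost:10259/metrics/resources"  # placeholder


def aggregated_requests(url: str = SCHEDULER_METRICS_URL) -> dict:
    text = requests.get(url, verify=False).text  # a real setup needs a bearer token
    totals = defaultdict(float)
    for family in text_string_to_metric_families(text):
        if family.name != "kube_pod_resource_request":
            continue
        for sample in family.samples:
            # Summing over pods is equivalent to dropping the pod label.
            key = (sample.labels.get("node", ""), sample.labels.get("resource", ""))
            totals[key] += sample.value
    return dict(totals)


if __name__ == "__main__":
    for (node, resource), value in sorted(aggregated_requests().items()):
        print(f"{node} {resource}: {value}")
```

If that roll-up is all the dashboards need, the same numbers could come from a recording rule instead of storing one series per pod.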
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
After 90d of inactivity, lifecycle/stale is applied.
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied.
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed.
You can:
Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
@smarterclayton any plans to graduate this to beta?
If there is no one working on this, we will have to deprecate and remove this stuff. Alternatively, we will need to find someone to graduate this.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
After 90d of inactivity, lifecycle/stale is applied.
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied.
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed.
You can:
Mark this issue as fresh with /remove-lifecycle stale
Mark this issue as rotten with /lifecycle rotten
Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle frozen
I think we might need to look for a new owner to drive this work in 1.26.
/lifecycle frozen
@dashpole @logicalhan did you happen to find some volunteers to continue the work?
Oh man, we didn't take this to beta?! This is my fault. Let me talk to @dgrisonnet who pinged me about it a day ago - originally the delay was gathering feedback from admins doing capacity planning, and I had been working with a few people on leveraging it more widely.
The use I was most familiar with was OpenShift, where we replaced the dashboards that had been using the (old, incorrect, incomplete) kube-state-metrics data for this. Among the folks who made the change there was general agreement that the new metrics were superior, and that the cost of cardinality was worth it to replace the generally incorrect metrics from kube-state-metrics (at the time we felt that completely replicating the pod resource model code in ksm was not appropriate, and this was a better solution). The next phase was getting community user input on building metric-based capacity dashboards and on whether the dimensions worked for the audience. I did a few analyses when planning out e2e CI runs and found the metrics provided better human visibility when comparing bulk "used vs requested".
@Huang-Wei re:
I just realized this metric has a pod label, which IMO increase the cardinality a lot and yield a pressure on the scraper side. Did you hear any concern/feedback from the users? Per the KEP, all the goals can be satisfied by removing the pod dimension as in terms of a metric, its primary goal is to give a high-level overview on aggregated pods' reqs/limits. Pod-level metric doesn't seem that common. WDYT?
The original intent was to allow admins to build capacity-planning dashboards, pairing these request metrics with the pod-level resource usage metrics (cpu, memory, etc.). So the intent was very much to have a pod dimension. Do we have a proposal to remove, or make optional, pod-level cpu consumption or memory consumption dimensions? If so, such a change would apply to this metric as well, but note that this is already an optional endpoint for users who are concerned about cardinality.
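As a concrete illustration of the kind of dashboard query that pairing enables (a sketch only: the Prometheus address, the cAdvisor usage metric, and the assumption that kube_pod_resource_request has been scraped into Prometheus are mine, not anything the KEP prescribes):

```python
# Rough sketch: ask Prometheus for per-pod "CPU used vs CPU requested", the
# capacity-planning pairing that the pod dimension makes possible. The URL and
# the cAdvisor metric name are assumptions about the monitoring setup.
import requests

PROMETHEUS_URL = "http://localhost:9090"  # placeholder

QUERY = """
sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))
/ on (namespace, pod)
sum by (namespace, pod) (kube_pod_resource_request{resource="cpu"})
"""


def cpu_used_vs_requested() -> list:
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY})
    resp.raise_for_status()
    return resp.json()["data"]["result"]


if __name__ == "__main__":
    for series in cpu_used_vs_requested():
        labels = series["metric"]
        ratio = series["value"][1]
        print(f'{labels.get("namespace")}/{labels.get("pod")}: {ratio}')
```

Drop the pod label from those queries and you lose exactly this per-workload view, which is what the dashboards were built around.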
To clarify - this is in beta since 1.21 (https://github.com/kubernetes/enhancements/issues/1748#issuecomment-791052241). Was there some belief that it was not beta?
It would be the last step to go to GA; I'm happy to push that over the line with @dgrisonnet.
I also thought this was still in Alpha for some reason even though we have a label marking the stability :sweat_smile:
But let's try to get this over the finish line and gather some feedback from users to learn whether they encountered any issues with these new metrics.
/assign @smarterclayton @dgrisonnet
Do we have a proposal to remove or make optional pod level cpu consumption or memory consumption dimensions? If so, such a change would apply to this metric as well, but as this is already an optional endpoint for users who are concerned about cardinality.
We already have a cardinality protection mechanism in Kubernetes: https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2305-metrics-cardinality-enforcement, so users could already tweak the dimensions if needed. That, plus the fact that the endpoint is optional, makes it sound fairly safe to expose without having to worry about potential cardinality explosions.
Enhancement Description
KEP (k/enhancements) update PR(s):
Code (k/k) update PR(s):
Docs (k/website) update(s):