kubernetes / enhancements

Enhancements tracking repo for Kubernetes
Apache License 2.0

QoS-class resources #3008

Open marquiz opened 3 years ago

marquiz commented 3 years ago

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

kikisdeliveryservice commented 3 years ago

Thanks for opening this @marquiz !! :smile:

To ensure that the SIG is aware of this KEP and that communication about it has begun, please add the mandatory Discussion Link to the Description above. For reference, it is a "link to SIG mailing list thread, meeting, or recording where the Enhancement was discussed before KEP creation".

kad commented 3 years ago

@kikisdeliveryservice the topic was discussed on SIG-Node on 2021-10-19. Meeting minutes: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg

pacoxu commented 2 years ago

/sig node

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

marquiz commented 2 years ago

/remove-lifecycle stale

Priyankasaggu11929 commented 2 years ago

/milestone v1.25

Atharva-Shinde commented 2 years ago

Hello @marquiz :wave:, 1.25 Enhancements team here!

Just checking in as we approach enhancements freeze at 18:00 PST on Thursday, June 16, 2022. Note that this enhancement is targeting stage alpha for the 1.25 release.

Here’s where this enhancement currently stands:

It looks like for this one, we would need to:

Open PR https://github.com/kubernetes/enhancements/pull/3004 addressing ^

For note, the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

Atharva-Shinde commented 2 years ago

/stage alpha

Atharva-Shinde commented 2 years ago

Hello @marquiz 👋, just a quick check-in again.

The enhancements freeze for 1.25 starts this Thursday, June 16, 2022 at 18:00 PT.

Please try to get the above mentioned action-items done before enhancements freeze :)

Note: the current status of the enhancement is still marked at-risk.

marquiz commented 2 years ago

Thanks @Atharva-Shinde for the help!

I now did the following updates:

We'll review this in SIG-Node tomorrow so more updates after that.

Atharva-Shinde commented 2 years ago

Hey @marquiz 👋 Good news! Enhancements Freeze is now extended to next week, until Thursday, June 23, 2022 🚀 So we now have one more week to submit the KEP :)

Priyankasaggu11929 commented 2 years ago

Hello @marquiz 👋, just a quick check-in again, as we approach the 1.25 enhancements freeze.

Please plan to get the open PR https://github.com/kubernetes/enhancements/pull/3004 merged before enhancements freeze on Thursday, June 23, 2022 at 18:00 PT, which is just over 3 days away from now.

For note, the current status of the enhancement is at-risk. Thank you!

Priyankasaggu11929 commented 2 years ago

Hello, 1.25 Enhancements Lead here 👋. With Enhancements Freeze now in effect, this enhancement has not met the criteria for the freeze and has been removed from the milestone.

As a reminder, the criteria for enhancements freeze is:

Feel free to file an exception to add this back to the release. If you plan to do so, please file this as early as possible.

Thanks!

/milestone clear

marquiz commented 2 years ago

Hi @Atharva-Shinde @Priyankasaggu11929 I've retitled the PR (#3004) in order to reduce confusion and misconceptions with respect to some other KEPs and earlier work. Is it ok to retitle this issue as well?

Priyankasaggu11929 commented 2 years ago

Hello @marquiz, retitling the issue is perfectly fine. Thank you! :)

marquiz commented 2 years ago

/retitle QoS-class resources

sftim commented 2 years ago

I have a query: how is this different from the built-in cpu resource? Linux blockio lets you configure controls such as blkio.throttle.read_bps_device and similarly, for CPU you can define requests and limits.

If the blockio case is like the existing cpu approach, then I'm wary of permanently complicating the Kubernetes Pod API to support a particular, vendor-specific technology.

If we want to let different Pods share resources, we should aim to make a much more generic mechanism. For example, allow two different Pods in the same namespace to aggregate their cpu limit, agreeing between those two Pods to co-operate if they are scheduled onto the same node. Once we can share cpu limits, we can look at extending that sharing to other kinds of resource such as an extended resource.

At the very least, I'd like to see the sort of thing I'm proposing clearly called out as an alternative in the KEP, before we merge it.
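
A minimal sketch of the kind of aggregation described above, under stated assumptions: pods opt in through a hypothetical `share_group` key, and their cpu limits are pooled only when they land in the same namespace on the same node. None of these names (`Pod`, `share_group`, `pooled_cpu_limits`) exist in the Kubernetes API; they are invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Pod:
    """Hypothetical, simplified view of a scheduled pod."""
    name: str
    namespace: str
    node: str
    cpu_limit_millicores: int
    share_group: Optional[str] = None  # hypothetical opt-in key, not a real field


def pooled_cpu_limits(pods: List[Pod]) -> Dict[Tuple[str, str, str], int]:
    """Sum the cpu limits of pods that share a namespace, node and group.

    Pods without a share_group keep their individual limit and are skipped.
    """
    pools: Dict[Tuple[str, str, str], int] = defaultdict(int)
    for pod in pods:
        if pod.share_group is None:
            continue
        key = (pod.namespace, pod.node, pod.share_group)
        pools[key] += pod.cpu_limit_millicores
    return dict(pools)


if __name__ == "__main__":
    pods = [
        Pod("a", "team-x", "node1", 500, "batch"),
        Pod("b", "team-x", "node1", 700, "batch"),
        Pod("c", "team-x", "node2", 300, "batch"),  # other node: separate pool
        Pod("d", "team-x", "node1", 200),           # did not opt in
    ]
    for key, total in pooled_cpu_limits(pods).items():
        print(key, total)
```

The point of the sketch is only that such sharing operates on a countable quantity: limits can be added up per group and per node.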

marquiz commented 2 years ago

Hi @sftim, thanks for the review!

> I have a query: how is this different from the built-in cpu resource? Linux blockio lets you configure controls such as blkio.throttle.read_bps_device and similarly, for CPU you can define requests and limits.

Blkio is just one possible usage for this. At least one fundamental difference between blkio and cpu is that the "amount of blkio" is not (ac)countable in any meaningful way. For cpu we know how much there is and there are meaningful controls to allocate a portion of that. For blkio it's more a matter of throttling: there are potentially a multitude of devices, it is hard to predict which ones are actually used by a pod, and potentially all of the different storage devices have different characteristics (parameters); think of SSDs vs. rotational drives, etc.

> If the blockio case is like the existing cpu approach, then I'm wary of permanently complicating the Kubernetes Pod API to support a particular, vendor-specific technology.

There isn't anything vendor-specific in this proposal. One example is an Intel technology, but even that is based on a generic interface in the Linux kernel (resctrlfs) that other vendors' corresponding technologies also use.

> If we want to let different Pods share resources, we should aim to make a much more generic mechanism. For example, allow two different Pods in the same namespace to aggregate their cpu limit, agreeing between those two Pods to co-operate if they are scheduled onto the same node. Once we can share cpu limits, we can look at extending that sharing to other kinds of resource such as an extended resource.

I wouldn't identify this as a resource sharing mechanism between pods. Yes, in some cases they might end up using the same resource, but generally that's not the case. In the case of blockio the class would just specify the throttling/weight parameters for storage devices, but it doesn't state anything about which particular devices are used by a pod. Similarly for RDT, the class might determine what portion of cache a pod can use or how much memory bandwidth it can use, but it doesn't say anything about which CPUs the pod is running on (i.e. which cache IDs it is using). In these cases, two pods belonging to the same class generally means that they have the "same level of throttling".

> At the very least, I'd like to see the sort of thing I'm proposing clearly called out as an alternative in the KEP, before we merge it.

At least for now I think they are two different things.
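
To make the distinction concrete, here is a minimal sketch (not taken from the KEP) contrasting a countable resource such as cpu with a class-style resource such as blockio. Every name in it (`BlockioClass`, `CLASSES`, `fits_cpu`, `params_for`, the class names and the parameter values) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BlockioClass:
    """Hypothetical QoS class: a named bundle of throttling parameters."""
    weight: int          # relative I/O weight (illustrative value)
    read_bps_limit: int  # read throttle in bytes/s, 0 meaning unlimited


# Hypothetical node-level class definitions. Nothing here says which
# devices a pod will actually use, and there is no capacity to subtract from.
CLASSES: Dict[str, BlockioClass] = {
    "high-prio": BlockioClass(weight=400, read_bps_limit=0),
    "throttled": BlockioClass(weight=100, read_bps_limit=50 * 1024 * 1024),
}


def fits_cpu(capacity_millicores: int, requests_millicores: List[int]) -> bool:
    """Countable resource: requests are summed against a known capacity."""
    return sum(requests_millicores) <= capacity_millicores


def params_for(class_name: str) -> BlockioClass:
    """Class resource: a pod names a class and gets that class's parameters."""
    return CLASSES[class_name]


if __name__ == "__main__":
    # cpu: each additional request consumes part of a finite capacity.
    print(fits_cpu(4000, [1500, 1500]))
    print(fits_cpu(4000, [1500, 1500, 1500]))

    # blockio: any number of pods may name the same class; they simply
    # receive the same level of throttling.
    pod_a = params_for("throttled")
    pod_b = params_for("throttled")
    print(pod_a == pod_b)
```

In this sketch, adding a third pod to the "throttled" class changes nothing for the first two, whereas adding a third cpu request can exhaust the node.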

marosset commented 2 years ago

/milestone v1.26
/label lead-opted-in
(I'm doing this on behalf of @ruiwen-zhao / SIG-node)

sftim commented 2 years ago

To clarify why I think blkio is vendor-specific: only Linux nodes have this resource. Windows nodes have CPU and memory but they don't have blkio or a direct equivalent.

I'd like the KEP to make the difference clear to a reader who knows Kubernetes but isn't particularly familiar with any of the QoS mechanisms that we propose to integrate with.

Atharva-Shinde commented 2 years ago

Hey @marquiz 👋, 1.26 Enhancements team here!

Just checking in as we approach Enhancements Freeze at 18:00 PDT on Thursday 6th October 2022.

This enhancement is targeting stage alpha for 1.26.

Here's where this enhancement currently stands:

For this KEP, we would need to:

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you :)

Atharva-Shinde commented 2 years ago

Hello @marquiz 👋, just a quick check-in again, as we approach the 1.26 Enhancements freeze.

Please plan to get the action items mentioned in my comment above done before Enhancements freeze at 18:00 PDT on Thursday 6th October 2022, i.e. tomorrow.

For note, the current status of the enhancement is marked at-risk :)

rhockenbury commented 2 years ago

Hello 👋, 1.26 Enhancements Lead here.

Unfortunately, this enhancement did not meet requirements for enhancements freeze.

If you still wish to progress this enhancement in v1.26, please file an exception request. Thanks!

/milestone clear
/label tracked/no
/remove-label tracked/yes
/remove-label lead-opted-in

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

kad commented 1 year ago

/remove-lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kad commented 1 year ago

/remove-lifecycle stale

The KEP is being actively reviewed and is part of the 1.28 SIG-Node plan.

SergeyKanzhelev commented 1 year ago

/milestone v1.28

SergeyKanzhelev commented 1 year ago

/label lead-opted-in

npolshakova commented 1 year ago

Hi @marquiz 👋, Enhancements team here!

Just checking in as we approach enhancements freeze on 01:00 UTC Friday, 16th June 2023.

This enhancement is targeting stage alpha for 1.28 (correct me if otherwise).

Here's where this enhancement currently stands:

It looks like https://github.com/kubernetes/enhancements/pull/3004 will address most of these issues!

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

npolshakova commented 1 year ago

Hi @marquiz, just reaching out again before the enhancements freeze on 01:00 UTC Friday, 16th June 2023. This enhancement is currently at risk. It looks like https://github.com/kubernetes/enhancements/pull/3004 will address most of the requirements. Let me know if I missed anything. Thanks!

Atharva-Shinde commented 1 year ago

Hello 👋, 1.28 Enhancements Lead here. Unfortunately, this enhancement did not meet requirements for v1.28 enhancements freeze. Feel free to file an exception to add this back to the release tracking process. Thanks!

Atharva-Shinde commented 1 year ago

/milestone clear

AdminTurnedDevOps commented 1 year ago

Hey @marquiz

1.28 Docs Shadow here.

Does the enhancement work planned for 1.28 require any new docs or modifications to existing docs?

If so, please follow the steps here to open a PR against the dev-1.28 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thursday 20th July 2023.

Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

Rishit-dagli commented 1 year ago

@AdminTurnedDevOps This was removed from the 1.28 release, so I have marked it as "removed from release" and there is no need for a 1.28 docs PR.

npolshakova commented 1 year ago

Saw this was removed from milestone, will update the enhancement tracking!

npolshakova commented 1 year ago

/milestone clear

SergeyKanzhelev commented 1 year ago

/milestone v1.29

npolshakova commented 1 year ago

Hello @marquiz 👋, 1.29 Enhancements team here!

Just checking in as we approach enhancements freeze on 01:00 UTC, Friday, 6th October, 2023.

This enhancement is targeting stage alpha for 1.29 (correct me if otherwise).

Here's where this enhancement currently stands:

For this KEP, it looks like https://github.com/kubernetes/enhancements/pull/3004 will address most of these issues. Please update the latest-milestone to v1.29 and the alpha milestone to 1.29 in this PR.

The status of this enhancement is marked as at risk for enhancement freeze. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

npolshakova commented 1 year ago

Hi @marquiz, just checking in once more as we approach the 1.29 enhancement freeze deadline this week on 01:00 UTC, Friday, 6th October, 2023. The status of this enhancement is marked as at risk for enhancement freeze.

It looks like https://github.com/kubernetes/enhancements/pull/3004 will address most of the requirements. Let me know if I missed anything. Thanks!

npolshakova commented 1 year ago

Hello 👋, 1.29 Enhancements Lead here. Unfortunately, this enhancement did not meet requirements for v1.29 enhancements freeze. Feel free to file an exception to add this back to the release tracking process. Thanks!

/milestone clear

salehsedghpour commented 10 months ago

/remove-label lead-opted-in

SergeyKanzhelev commented 9 months ago

/stage alpha
/milestone v1.30

salehsedghpour commented 9 months ago

Hello @marquiz , 1.30 Enhancements team here! Is this enhancement targeting 1.30? If it is, can you follow the instructions here to opt in the enhancement and make sure the lead-opted-in label is set so it can get added to the tracking board? Thanks!

salehsedghpour commented 9 months ago

/milestone clear

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kad commented 6 months ago

/remove-lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kundan2707 commented 1 month ago

/remove-lifecycle stale