kubernetes / enhancements

Enhancements tracking repo for Kubernetes
Apache License 2.0

Topology Aware Routing #2433

Open robscott opened 3 years ago

robscott commented 3 years ago

Enhancement Description

/sig network

robscott commented 3 years ago

/assign

kendallroden commented 3 years ago

Hi @robscott, since your enhancement is scheduled to be in 1.21, please keep in mind the important upcoming dates:

Thanks!

JornShen commented 3 years ago

@robscott What's the development progress of this feature? I am interested in it. Is there anything I can help with? :)

robscott commented 3 years ago

Hey @JornShen thanks for checking in! If you have time, I'd really appreciate any review or testing of my related PR: https://github.com/kubernetes/kubernetes/pull/99522.

kendallroden commented 3 years ago

Hey @robscott! I know you are aware based on other enhancement work that code freeze is coming up on March 9th EOD PST, and if any PRs are not merged by the deadline, you'll have to request an exception! Please also keep in mind that if this enhancement requires new docs or modification to existing docs, you'll need to follow the steps in the Open a placeholder PR doc to open a PR against the k/website repo by March 16th EOD PST. Thanks!

annajung commented 3 years ago

Hi @robscott, with https://github.com/kubernetes/kubernetes/pull/99522 merged in, we will mark this as code complete for 1.21 release.

reylejano commented 3 years ago

Hello @robscott, 1.21 Docs lead here. Does the enhancement work planned for 1.21 require any new docs or modification to existing docs? If so, please follow the steps here to open a PR against the dev-1.21 branch in the k/website repo. This PR can be just a placeholder at this time and must be created by March 16 EOD PST. Also take a look at Documenting for a release to familiarize yourself with the docs requirements for the release. Thank you!

robscott commented 3 years ago

Hey @reylejano, thanks for checking in! This will require some significant docs additions. I'll make sure to have a docs PR ready before that deadline.

JamesLaverack commented 3 years ago

/milestone v1.22
/stage beta

JamesLaverack commented 3 years ago

Hey @robscott, 1.22 Enhancements Lead here,

I have a few questions about your enhancement, as it is currently targeted for 1.22.

I'm assuming you're targeting beta for this release (if you're not please let me know!) in which case we will also require:

We require these merged into master by enhancements freeze at 23:59:59 on Thursday 13th May, or else require an exception after this deadline.

Please do reach out if you have any questions.

robscott commented 3 years ago

Hey @JamesLaverack, sorry I missed this! The KEP updates for 1.22 have merged. The PRR was actually already completed for the alpha release.

Relevant KEP PRs that have merged:

JamesLaverack commented 3 years ago

Great, thanks Rob. As soon as https://github.com/kubernetes/enhancements/pull/2714 merges I think you're good for enhancements freeze.

robscott commented 3 years ago

Thanks for the help @JamesLaverack, I think that means this one should be good to go now.

JamesLaverack commented 3 years ago

Great. This is all set for 1.22 enhancements freeze. :)

JamesLaverack commented 3 years ago

Hey @robscott can you confirm what k/k PRs are in-flight or merged for 1.22? I did see https://github.com/kubernetes/kubernetes/pull/100807 but I'm unsure if it's for this enhancement or not, or if there are other PRs I've missed.

robscott commented 3 years ago

Hey @JamesLaverack, thanks for checking in! Although I'm hoping to get a number of the beta requirements in for this cycle, including e2e tests, I don't think I'll have the capacity to get everything in. That means this enhancement will likely be stuck in alpha for one more cycle.

JamesLaverack commented 3 years ago

Thanks for the update @robscott. We can keep an eye on this as we approach code freeze.

JamesLaverack commented 3 years ago

Hey all, friendly reminder that code freeze is tomorrow at 18:00 PDT on the 8th of July. At the moment this enhancement is at risk because https://github.com/kubernetes/kubernetes/pull/100807 has not yet merged. This PR must be approved by the deadline for this enhancement to remain in v1.22.

I'm aware that you said it was unlikely to make it @robscott. If you think that's still the case we can remove this from the milestone (to avoid further pinging).

Additionally, please remember that the docs placeholder PR deadline is the day after code freeze.

robscott commented 3 years ago

Thanks for checking in @JamesLaverack! I also had been working on a branch that was similar to @aojea's PR linked above. Unfortunately this ended up being more complicated than expected so I think this enhancement will not see any significant changes in this cycle 😞. I'm hoping to get these tests in earlier in the next cycle though.

JamesLaverack commented 3 years ago

Thank you for the update @robscott. I'll remove this from the v1.22 milestone if you're not going to be pushing any changes for this cycle.

/milestone clear

thockin commented 3 years ago

This is currently specced to GA in 1.25, so no change for 1.23

thockin commented 3 years ago

My bad. Shooting for BETA in 23

salaxander commented 3 years ago

/milestone v1.23

salaxander commented 3 years ago

Hi @robscott! 1.23 Enhancements team here. Just checking in as we approach enhancements freeze at 11:59pm PST on Thursday 09/09. Here's where this enhancement currently stands:

Looks like we're all set for enhancements freeze!

Thanks!

jlbutler commented 3 years ago

Hi @robscott 👋 1.23 Docs team here.

This enhancement is marked as 'None Needed' for docs for the 1.23 release, but as it's graduating to beta we likely need a PR for the move from alpha. If I'm mistaken, just let me know! Otherwise...

Please follow the steps detailed in the documentation to open a PR against the dev-1.23 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thu November 18, 11:59 PM PDT.

Also, if needed take a look at Documenting for a release to familiarize yourself with the docs requirement for the release.

Thanks!

/cc @nate-double-u

salaxander commented 3 years ago

Hi @robscott - Just a quick reminder that today is code freeze. Looks like we're still waiting on https://github.com/kubernetes/kubernetes/pull/106433 merging?

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

robscott commented 2 years ago

/remove-lifecycle stale
/lifecycle frozen

robscott commented 2 years ago

@thockin I think the milestone on this needs to be bumped to 1.26 since we only enabled it by default in 1.24.

thockin commented 2 years ago

Hopefully GA in 26

dudicoco commented 2 years ago

It seems that there is no documentation currently on how to use this feature.

The docs state:

> Cluster components such as the kube-proxy can then consume those hints, and use them to influence how the traffic is routed (favoring topologically closer endpoints).

How do we use these hints to route to services? Is this feature related to the kube-proxy ipvs-scheduler flag?

robscott commented 2 years ago

@dudicoco I'm not sure I understand what you're asking. You said "there is no documentation currently on how to use this feature" and then in the next sentence link to the docs for this feature. Is there something specific that we should add?

> How do we use these hints to route to services?

I think we want to be at least somewhat vague here to leave room for the underlying implementation to change, but I think the docs for how kube-proxy interprets these hints are relatively helpful.

> Is this feature related to the kube-proxy ipvs-scheduler flag?

No. Is there anything we can clean up/remove that suggests a connection?
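
For readers following along, the hint handling described above can be sketched roughly as follows. This is a minimal illustration, not kube-proxy's actual code: the `Endpoint` class and `endpoints_for_zone` function are hypothetical names, and the fallback rules (ignore hints when any endpoint lacks them, or when no endpoint is hinted for the local zone) are an assumption based on the safeguards described in the Topology Aware Hints docs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical, simplified stand-in for an EndpointSlice endpoint."""
    address: str
    # Zones this endpoint is hinted for; None means no hints were set.
    zone_hints: Optional[List[str]] = None


def endpoints_for_zone(endpoints: List[Endpoint], zone: str) -> List[Endpoint]:
    """Pick the endpoints a proxy running in `zone` might route to."""
    # Assumed safeguard: if any endpoint is missing hints, ignore hints entirely.
    if any(ep.zone_hints is None for ep in endpoints):
        return list(endpoints)
    # Keep only the endpoints hinted for the local zone.
    matching = [ep for ep in endpoints if zone in ep.zone_hints]
    # Assumed safeguard: fall back to all endpoints if none match this zone.
    return matching or list(endpoints)
```

With two endpoints hinted for `zone-a` and `zone-b`, a proxy in `zone-a` would keep only the first, while a proxy in an unhinted `zone-c` would fall back to both.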

dudicoco commented 2 years ago

@robscott thanks for the info.

I guess my confusion derives from the previous feature - https://kubernetes.io/docs/concepts/services-networking/service-topology/. With Topology-aware traffic routing you could control the service routing via the service object, so I expected Topology-aware hints to work similarly.

So with the new feature, kube-proxy automatically routes to the endpoint in the same zone, without any configuration of the service objects?

In addition, with the previous feature you could route to endpoints on the same host, is this functionality now gone? Only zone routing is taking place?

Regarding the kube-proxy ipvs-scheduler flag, nothing in the docs suggests that there is a connection; it's just that the shortest expected delay mode seems to provide similar functionality. I have never used that mode though, and I'm not sure if it even works out of the box without additional components reporting the latency to kube-proxy.
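
As a rough illustration of the configuration question above: in the releases discussed in this thread, the docs describe opting a Service in through an annotation rather than through a spec field. The sketch below builds such a manifest as a plain Python dict. The helper name `service_with_hints` and the example Service name, selector, and port are hypothetical, and the annotation key reflects the docs for those releases, so it may differ in other versions.

```python
import json

# Annotation key as documented for the releases discussed in this thread
# (an assumption for any other version).
HINTS_ANNOTATION = "service.kubernetes.io/topology-aware-hints"


def service_with_hints(name: str, selector: dict, port: int) -> dict:
    """Build a Service manifest (as a dict) that opts in to topology hints."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {HINTS_ANNOTATION: "auto"},
        },
        "spec": {"selector": selector, "ports": [{"port": port}]},
    }


if __name__ == "__main__":
    # Hypothetical example values, printed as JSON (which kubectl also accepts).
    print(json.dumps(service_with_hints("my-app", {"app": "my-app"}, 80), indent=2))
```

Note that there is no per-Service choice of topology key here, which is the main difference from the older `topologyKeys` field.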

sftim commented 2 years ago

It would be nice to add a tutorial page to explain topology aware hints (maybe on a local Minikube cluster with a small number of simulated nodes?) - however, it's OK for the feature to graduate without that tutorial being started.

thockin commented 2 years ago

Candidate for GA in 1.26

rhockenbury commented 2 years ago

/label tracked/yes
/remove-label tracked/no
/stage stable

rhockenbury commented 2 years ago

Hello @robscott 👋, 1.26 Enhancements team here.

Just checking in as we approach enhancements freeze at 18:00 PDT on Thursday 6th October 2022.

This enhancement is targeting stage stable for 1.26 (correct me if otherwise).

Here's where this enhancement currently stands:

For this KEP, please plan to open a PR to update the KEP yaml, KEP readme and PRR file.

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

robscott commented 2 years ago

Hey @rhockenbury, thanks for checking in! I've created https://github.com/kubernetes/enhancements/pull/3572 to update the KEP.

rhockenbury commented 2 years ago

To meet the requirements, we would also need to update the KEP readme in #3572 to use latest test plan template.

robscott commented 2 years ago

Thanks for catching that @rhockenbury, I've updated the KEP to include the latest test plan template. Unfortunately https://storage.googleapis.com/k8s-triage/index.html was not working for me. Regardless of the input I tried, I got "0 clusters of 0 failures out of 0 builds from Invalid Date to Invalid Date."

rhockenbury commented 2 years ago

We would still need to add the test update agreement, which needs to be checked and included in the KEP readme.

rhockenbury commented 2 years ago

Thanks! Marked as tracked for v1.26.

jonathon2nd commented 2 years ago

> In addition, with the previous feature you could route to endpoints on the same host, is this functionality now gone? Only zone routing is taking place?

I am also trying to see about using Topology Aware Hints because Topology-aware traffic routing with topology keys has been deprecated. We were using it to ensure optimal routing for high bandwidth services, to keep traffic on the same node if possible.

Is there a doc that lists the different hints that could be set? I am hoping that there is forHosts or something of the like.

Thanks in advance, and sorry if I am missing something obvious.

jonathon2nd commented 2 years ago

Wait the only hint is ForZone? https://pkg.go.dev/k8s.io/api/discovery/v1#EndpointHints

That makes this not really accurate, right? Or am I missing something obvious?

> Note: This feature, specifically the alpha topologyKeys API, is deprecated since Kubernetes v1.21. Topology Aware Hints, introduced in Kubernetes v1.21, provide similar functionality.
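
To make the shape of the hint concrete, here is a small sketch of an EndpointSlice as a Python dict, mirroring the `hints.forZones[].name` field from the `EndpointHints` type linked above. The slice name, addresses, zone names, and node names are made up for illustration, and `hinted_zones` is a hypothetical helper rather than part of any Kubernetes client.

```python
# Illustrative EndpointSlice; all values below are invented for the example.
endpoint_slice = {
    "apiVersion": "discovery.k8s.io/v1",
    "kind": "EndpointSlice",
    "metadata": {"name": "example-abc"},
    "addressType": "IPv4",
    "endpoints": [
        {
            "addresses": ["10.1.2.3"],
            "zone": "zone-a",
            "nodeName": "node-1",
            "hints": {"forZones": [{"name": "zone-a"}]},
        },
        {
            "addresses": ["10.1.2.4"],
            "zone": "zone-b",
            "nodeName": "node-2",
            # No hints set on this endpoint.
        },
    ],
}


def hinted_zones(endpoint: dict) -> list:
    """Return the zone names an endpoint is hinted for (empty if none)."""
    return [z["name"] for z in endpoint.get("hints", {}).get("forZones", [])]
```

The hint is expressed only in terms of zones; the `nodeName` field is present on the endpoint, but there is no per-node hint in this structure.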

sftim commented 2 years ago

@jonathon2nd we could perhaps stress that “similar” is not “equivalent”. This enhancement issue specifically tracks Topology Aware Hints.

You could raise an issue against k/website if you'd like to work on improving the explanation, or want to encourage other contributors to do so.

robscott commented 2 years ago

@jonathon2nd Do you mind sharing your use cases for same node routing on https://github.com/kubernetes/enhancements/pull/3293?

mickeyboxell commented 2 years ago

@robscott Are additional docs required for this KEP? If so, will there be a new PR for k/website?

Atharva-Shinde commented 2 years ago

Hey @robscott 👋,

Checking in as we approach 1.26 code freeze at 17:00 PDT on Tuesday 8th November 2022.

Please ensure the following items are completed:

As always, we are here to help should questions come up. Thanks :)

mickeyboxell commented 2 years ago

Hi @robscott 👋 This enhancement is marked as Needs Docs for 1.26 release. Please follow the steps detailed in the documentation to open a PR against dev-1.26 branch in the k/website repo. This PR can be just a placeholder at this time. It must be created by November 9. For more information, take a look at Documenting for a release to familiarize yourself with the docs requirement for the release.

robscott commented 2 years ago

Hey @Atharva-Shinde and @mickeyboxell, we are working on some incremental updates for this KEP in the 1.26 cycle, but the required doc updates should be minimal. Unfortunately we don't have sufficient capacity to graduate to GA in this cycle. The first PR related to this KEP is https://github.com/kubernetes/kubernetes/pull/113556. I think we'll have at least one more PR to k/k in this cycle, plus one small k/website PR.