knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

Add Support for EndpointSlices #7701

Open robscott opened 4 years ago

robscott commented 4 years ago

/area networking /kind feature

Describe the feature

It would be great if Knative could support EndpointSlices. This is a new Kubernetes resource that provides a scalable and extensible alternative to Endpoints. I've noticed that Knative manually creates Endpoints in some cases, and there may be more places where this pattern is used. With the EndpointSliceProxying feature gate enabled, kube-proxy reads exclusively from EndpointSlices, so it disregards any Endpoints that have no corresponding EndpointSlices, which effectively breaks parts of Knative. I've been the main developer of this feature in Kubernetes and I'm happy to answer any questions or help anyone who's interested in adding support for it.

/cc @Cynocracy

tcnghia commented 4 years ago

cc @vagababov @markusthoemmes

vagababov commented 4 years ago

/assign

It's quite interesting that this was the exact question I asked in Barcelona during the presentation of the API (what will happen to the APIs that manually create endpoints) and it was completely ignored by the presenters :-)

vagababov commented 4 years ago

Now to the actual problem at hand: @robscott, do we basically have to create both Endpoints and EndpointSlices, mirroring the contents?

robscott commented 4 years ago

@vagababov Yep, that's what would be needed. We're also looking into ways we can mirror custom Endpoints to EndpointSlices automatically, but any automatic approach will be imperfect. As an example, EndpointSlices have topology fields that can't be derived from Endpoints resources. On the other hand, if you're not interested in anything specific with EndpointSlices, this potential automatic mirroring approach may suffice. I'll follow up here with more details as I have them.
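The manual mirroring discussed here can be sketched roughly as follows. This is only an illustration on plain dictionaries shaped like the Kubernetes manifests, not Knative's or Kubernetes' actual implementation. The function name `mirror_endpoints`, the `MANAGED_BY` value, the slice naming scheme, the `max_per_slice` default, and the fixed `IPv4` address type are all assumptions made for the example. It also uses the `discovery.k8s.io/v1` field layout, which became available after the releases discussed in this thread.

```python
from typing import Any, Dict, List

# Hypothetical value for the managed-by label; not a name used by Knative.
MANAGED_BY = "example.knative.dev/endpoints-mirror"


def mirror_endpoints(endpoints: Dict[str, Any], max_per_slice: int = 100) -> List[Dict[str, Any]]:
    """Build EndpointSlice-shaped dicts that mirror one Endpoints-shaped dict.

    Only IPs, readiness, and ports are copied. Topology information is left
    out because it cannot be derived from an Endpoints resource.
    """
    if max_per_slice < 1:
        raise ValueError("max_per_slice must be at least 1")

    meta = endpoints["metadata"]
    slices: List[Dict[str, Any]] = []

    for subset in endpoints.get("subsets", []):
        ports = [
            {
                "name": p.get("name", ""),
                "port": p["port"],
                "protocol": p.get("protocol", "TCP"),
            }
            for p in subset.get("ports", [])
        ]

        # Ready and not-ready addresses both become endpoints, distinguished
        # by the "ready" condition.
        eps = [
            {"addresses": [a["ip"]], "conditions": {"ready": True}}
            for a in subset.get("addresses", [])
        ] + [
            {"addresses": [a["ip"]], "conditions": {"ready": False}}
            for a in subset.get("notReadyAddresses", [])
        ]

        # Split large subsets across several slices.
        for start in range(0, len(eps), max_per_slice):
            slices.append(
                {
                    "apiVersion": "discovery.k8s.io/v1",
                    "kind": "EndpointSlice",
                    "metadata": {
                        # Illustrative naming scheme only.
                        "name": f"{meta['name']}-{len(slices)}",
                        "namespace": meta["namespace"],
                        "labels": {
                            # Ties the slice to the Service of the same name.
                            "kubernetes.io/service-name": meta["name"],
                            "endpointslice.kubernetes.io/managed-by": MANAGED_BY,
                        },
                    },
                    "addressType": "IPv4",  # assumption: IPv4 addresses only
                    "ports": ports,
                    "endpoints": eps[start:start + max_per_slice],
                }
            )
    return slices
```

A real controller would also need to update and delete slices as the Endpoints object changes, which this sketch does not attempt.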

vagababov commented 4 years ago

Yeah, I am not sure we have any use for topology just yet. It might be interesting in the future with NLS work or to pick activators from the same zone as nodes, etc., but right now we would just copy the IP addresses.

markusthoemmes commented 4 years ago

I wondered about this too, especially looking at #7260. We could potentially have a more efficient solution there, one that doesn't rely on subset ordering in the Endpoints object, if we had an EndpointSlice just for the activator IPs.

It's only beta in 1.17 though, what's the availability of the API currently?

robscott commented 4 years ago

The API itself is enabled by default in 1.17, the controller is enabled by default in 1.18, and kube-proxy is not configured to use EndpointSlices by default yet in any version. For now this will only come up for users that intentionally enable the EndpointSliceProxying feature gate in 1.18 or the EndpointSlice feature gate on kube-proxy in 1.16-1.17. As I've looked into this further, it looks like we should be able to solve this issue automatically with some kind of mirroring in 1.19 (before this feature is enabled by default).

It may be worth adding some kind of note in the documentation here that Knative is currently incompatible with the EndpointSliceProxying feature gate. I'll also be updating EndpointSlice docs to note the issue, and hope to have a more automatic mirroring solution in place for 1.19. I'm happy to close this issue for now as well since it sounds like you actually shouldn't need to do anything for EndpointSlice integration.

vagababov commented 4 years ago

Let's keep it open anyway, even if no integration ends up being required, since, as Markus noted, we might have some interesting uses for slices ourselves.

vagababov commented 4 years ago

Thanks for the help, Rob.

robscott commented 4 years ago

Quick update on this feature. A new EndpointSliceMirroring controller has been added as part of the 1.19 release and along with that kube-proxy now reads from EndpointSlices on Linux by default.

vagababov commented 4 years ago

So basically this is a noop for us when ES are enabled by default?

robscott commented 4 years ago

Yep, this should be a noop. It does mean there will be a slightly longer delay since the new mirroring controller watches Endpoints resources and then creates EndpointSlices. It also only supports mirroring up to 1000 IPs per subset.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

vagababov commented 3 years ago

/reopen /lifecycle frozen

knative-prow-robot commented 3 years ago

@vagababov: Reopened this issue.

vagababov commented 3 years ago

/unassign

evankanderson commented 3 years ago

As I read the issue:

  1. It's a no-op (and working post 1.19 with the mirroring controller)
  2. There's no one currently working on anything here
  3. We have some ideas in the future, but haven't written any of them down.

I think we should

/close

this issue, and open new ones when we have concrete proposals where using EndpointSlices directly might help.

knative-prow-robot commented 3 years ago

@evankanderson: Closing this issue.

dprotaso commented 1 month ago

/reopen

Users are hitting the 1000 endpoints limit so mirroring isn't working properly - https://cloud-native.slack.com/archives/C04LMU0AX60/p1726054917101579

I think switching over to EndpointSlices won't work with clusters using kube-dns (which I believe GKE is using).

@robscott are there any plans for GKE to move to CoreDNS or Cloud DNS by default?

knative-prow[bot] commented 1 month ago

@dprotaso: Reopened this issue.

dprotaso commented 1 month ago

We may be able to work around this by creating multiple subsets; see the docs here:

https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#endpointslice-mirroring
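As a rough illustration of the multiple-subsets idea, the sketch below splits a flat list of IPs into Endpoints-style subsets that each stay at or under the 1000-address mirroring limit mentioned earlier in this thread. The helper name `split_into_subsets` and the dictionary layout are assumptions for the example, and whether subsets with identical ports are actually mirrored independently has not been verified here.

```python
from typing import Any, Dict, List


def split_into_subsets(
    ips: List[str],
    ports: List[Dict[str, Any]],
    max_per_subset: int = 1000,  # mirroring limit quoted in this thread
) -> List[Dict[str, Any]]:
    """Split IPs into Endpoints-style subsets of at most max_per_subset addresses.

    Every subset carries the same ports; address order is preserved.
    """
    if max_per_subset < 1:
        raise ValueError("max_per_subset must be at least 1")

    return [
        {
            "addresses": [{"ip": ip} for ip in ips[start:start + max_per_subset]],
            # Copy each port dict so subsets do not share mutable state.
            "ports": [dict(p) for p in ports],
        }
        for start in range(0, len(ips), max_per_subset)
    ]
```

For example, 2500 addresses would be split into three subsets of 1000, 1000, and 500 addresses.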

robscott commented 1 month ago

> @robscott are there any plans for GKE to move to CoreDNS or Cloud DNS by default?

We recommend using Cloud DNS in GKE, which does support EndpointSlices and is the default in at least Autopilot clusters (not sure about other modes).