kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Enhance Egress Configuration for Finer Grain Control Over APIService and Webhook traffic #119002

Open cheftako opened 1 year ago

cheftako commented 1 year ago

What would you like to be added?

The HyperShift project would love the ability to control, at a finer level in the egress config, which APIService and webhook traffic goes through the konnectivity proxy and which goes direct.

The environment to think about is a split control plane/data plane environment where a set of webhooks/APIServices are colocated on the same network as the kube-apiserver (and therefore the kube-apiserver has direct IP connectivity to those webhooks/APIServices). However, there is a subset of "in cluster/data plane" APIServices/webhooks whose traffic needs to proxy through konnectivity in order to reach the appropriate endpoint (in-cluster services like the metrics APIService, in-cluster webhooks like Istio, etc.).

Right now it is an all-or-nothing choice whether webhook and APIService traffic goes through konnectivity or direct. To work around this, a user has to deploy konnectivity agents in both networks: the agents in the "control plane network" advertise the set of IPs for the colocated webhook traffic, so that traffic has to go kube-apiserver -> konnectivity server -> konnectivity agent -> webhook backend even though it could optimally go kube-apiserver -> webhook backend.
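For context, a minimal sketch of what that all-or-nothing selection looks like today (the socket path is a placeholder): the single `cluster` egress selection sends every webhook/APIService dial through the konnectivity server, with no per-destination carve-outs.

```yaml
apiVersion: apiserver.k8s.io/v1beta1
kind: EgressSelectorConfiguration
egressSelections:
# All "cluster" traffic (webhooks, aggregated APIServices, connections to nodes)
# is dialed through the konnectivity server; there is no per-destination choice.
- name: cluster
  connection:
    proxyProtocol: GRPC
    transport:
      uds:
        udsName: /etc/kubernetes/konnectivity-server/konnectivity-server.socket
```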

The current workaround is that the konnectivity agents in the control plane network advertise the "subset" of IPs that exist in the control plane, and the suite of konnectivity agents deployed in the cluster advertise a "default route" for all other IPs.

Considerations: Typically in these modes a "provider" (a provider offering a managed Kubernetes service) controls the services within the same network as the kube-apiserver. Therefore I think the definition of services within that network should be controlled by the configuration file fed to the kube-apiserver rather than by something "dynamic" a client could do to an APIService or webhook (like a label, etc.). I think it would be safe to assume that anything not in that explicit "allowlist" proceeds to be processed by konnectivity and the associated system.

General things to consider: What would a possible "allowlist" look like? What context does the kube-apiserver have before calling a webhook/APIService (which is then processed against the egress config)? How can that context be matched against an egress config to ultimately decide whether to "send direct" or "send through the proxy"?
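Purely to make the first question concrete, one hypothetical shape for such an allowlist inside the egress selector configuration might look like the sketch below. None of the new fields exist today; `directDestinations`, `cidrs`, and the example CIDR are invented for illustration only.

```yaml
apiVersion: apiserver.k8s.io/v1beta1
kind: EgressSelectorConfiguration
egressSelections:
- name: cluster
  connection:
    proxyProtocol: GRPC
    transport:
      uds:
        udsName: /etc/kubernetes/konnectivity-server/konnectivity-server.socket
  # HYPOTHETICAL field: destinations matching this allowlist would be dialed
  # directly by the kube-apiserver instead of going through konnectivity.
  # Anything not matched falls through to the proxy as it does today.
  directDestinations:
    cidrs:
    - 198.51.100.0/24   # placeholder: control-plane VIPs fronting colocated APIServices/webhooks
```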

In case it helps to know the workaround typically used today: APIServices typically point to a "Service within the cluster" that is a "headless" Service with no actual endpoint IP. An endpoint IP is then defined manually as the "VIP" that fronts that APIService in the control plane network. From there, the kube-apiserver looks up that endpoint IP and issues the request to konnectivity addressed to that VIP. A konnectivity agent in the control-plane network advertises that it is the specific agent to handle traffic for that IP, so the request is proxied to that agent within the same control plane network, which then simply sends the traffic on to the VIP. All traffic not destined to that explicit list of VIPs goes to konnectivity agents in the cluster that advertise a "default route", meaning they handle all traffic not destined to agents advertising specific IPs. The default-route agents all live in the "data plane" network, which is separate from the control plane network.
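A rough sketch of that layout, assuming the kube-apiserver resolves the APIService via endpoints (for example with `--enable-aggregator-routing`); every name, namespace, and the 203.0.113.10 VIP are placeholders:

```yaml
# Aggregated API registered against an in-cluster Service name.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.example.io
spec:
  group: metrics.example.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true   # placeholder; a real setup would set caBundle instead
  service:
    name: metrics-backend
    namespace: example-system
    port: 443
---
# Headless Service with no selector, so no endpoints are generated automatically.
apiVersion: v1
kind: Service
metadata:
  name: metrics-backend
  namespace: example-system
spec:
  clusterIP: None
  ports:
  - port: 443
    targetPort: 443
---
# Hand-maintained Endpoints pinning the Service to the control-plane VIP;
# the konnectivity agent in the control-plane network advertises this IP.
apiVersion: v1
kind: Endpoints
metadata:
  name: metrics-backend
  namespace: example-system
subsets:
- addresses:
  - ip: 203.0.113.10
  ports:
  - port: 443
```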

Why is this needed?

Allows finer-grained control of KAS (kube-apiserver) egress traffic.

cheftako commented 1 year ago

Originates from https://github.com/kubernetes-sigs/apiserver-network-proxy/issues/496

cheftako commented 1 year ago

/cc @relyt0925 @deads2k @jpbetz @fedebongio @jkh52

cheftako commented 1 year ago

/sig api-machinery

cheftako commented 1 year ago

https://github.com/kubernetes/enhancements/pull/4048

jiahuif commented 1 year ago

/triage accepted

tallclair commented 11 months ago

I deduped another issue to this one, but the presented use case is slightly different so I'll paste it here:

Currently, the egress network selection logic always uses the cluster network when the webhook is specified by service, and always uses the control plane network when the webhook is specified by url.

This can cause problems with webhooks that are hosted externally to both the cluster and the control plane. For example, suppose that, for security reasons, you want to allow egress in the control plane network only to a very limited set of host:port destinations.
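To illustrate the current behavior described above (all names and the URL are placeholders): a webhook whose clientConfig uses `service` is always dialed via the `cluster` egress selection, while one that uses `url` is always dialed via the `controlplane` selection, with no per-webhook way to override either choice.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-webhooks
webhooks:
- name: in-cluster.example.com
  clientConfig:
    # Specified by service: always dialed via the "cluster" egress selection.
    service:
      namespace: example-system
      name: example-webhook
      port: 443
  admissionReviewVersions: ["v1"]
  sideEffects: None
- name: external.example.com
  clientConfig:
    # Specified by url: always dialed via the "controlplane" egress selection,
    # even when the backend is hosted outside both the cluster and the control plane.
    url: https://hooks.example.com/validate
  admissionReviewVersions: ["v1"]
  sideEffects: None
```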

briankhoi commented 2 months ago

/assign