kubernetes-sigs / gateway-api

Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
https://gateway-api.sigs.k8s.io
Apache License 2.0
1.78k stars 461 forks source link

GEP: Standard Mechanism to Merge Multiple Gateways #1713

Open dprotaso opened 1 year ago

dprotaso commented 1 year ago

What would you like to be added:

Support for merging Gateways should have standard & documented mechanic.

Why this is needed:

Prior Discussions: https://github.com/kubernetes-sigs/gateway-api/discussions/1248, https://github.com/kubernetes-sigs/gateway-api/discussions/1246

Knative generates on demand per-service certificates using HTTP-01 challenges. There can be O(1000) Knative Services in the cluster which means we have O(1000) distinct certificates. The Gateway Resource is a contention point since is the only place to attach listeners to gateways with certificates.

The spec currently has language to indicate implemenations MAY merge Gateways resources but the mechanic isn't defined. https://github.com/kubernetes-sigs/gateway-api/blob/541e9fc2b3c2f62915cb58dc0ee5e43e4096b3e2/apis/v1beta1/gateway_types.go#L76-L78

Secondly, given similar use-cases/needs for control planes to manage listeners it might make sense that merging Gateways is part of extended conformance.

arkodg commented 1 year ago

+1 to this proposal.

howardjohn commented 1 year ago

FWIW how we do this in Istio is https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#manual-deployment. Basically the same way to apply 1 Gateway to a manually deployed Deployment/Service instead of automatically deployed one can be used to apply N Gateways and merge them.

If they want automated deployment and merging, they would make one "root" Gateway and attach all others. Actually that is another use case for https://github.com/kubernetes-sigs/gateway-api/pull/1596

arkodg commented 1 year ago

@howardjohn conformance tests dont pass when the implementation merges the listeners, https://github.com/envoyproxy/gateway/issues/349 has more context

howardjohn commented 1 year ago

I don't think something implicitly deciding to collapse or not is a great idea which is the root cause of that. In the Istio way users have to explicitly opt into merging, which the conformance tests don't do, so we don't have an issue.

arkodg commented 1 year ago

I don't think something implicitly deciding to collapse or not is a great idea which is the root cause of that. In the Istio way users have to explicitly opt into merging, which the conformance tests don't do, so we don't have an issue.

yeah, Im hoping the intent to merge is defined by Gateway API, else implementations will add their own

dprotaso commented 1 year ago

@shaneutt curious what does the label priority/awaiting-more-evidence mean and what's the lifecycle of issues in this sub project?

shaneutt commented 1 year ago

@dprotaso we're in the midst of trying to better organize things based on priority. Given that 3 people are interested in this one, priority/awaiting-more-evidence was not really the right fit so I've changed it to priority/important-longterm. Please note that while these labels can be helpful for us to organize work and releases, they aren't necessarily indicative of the actual order in which things will be done. It's reasonable for someone with keen interest to champion an issue like this and take it straight into in-progress if they feel strongly about it.

dprotaso commented 1 year ago

@arkodg

The spec says compatible listeners can be collapsed, but since Gateways can be applied dynamically it is bound to cause traffic disruption in the data plane if the implementation collapses or separates listener implementations in runtime.

Knative uses multiple Kubernetes Services to resolve port conflicts.

Our custom Istio installation has a single k8s deployment serving public traffic and internal traffic. This is because we create one K8s Service with type=LoadBalancer and the other being type=ClusterIP pointing to the same deployment but targetting different ports.

Thus if the Gateway resource had a scope property to define whether gateways were public (to the internet) and local to the cluster then I could forsee merging of

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-internal
spec:
  gatewayClassName: acme-lb
  group: knative
  scope: Internal
  listeners:  
  - protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-web
spec:
  gatewayClassName: acme-lb
  group: knative
  scope: Public
  listeners:  
  - protocol: HTTP
    port: 80

to be done by creating the following K8s Services

apiVersion: v1
kind: Service
metadata:
   name: prod-internal
spec:
  type: ClusterIP
  ports:
    - name: http2
      port: 80
      targetPort: 8081 # Dynamically allocated port
---
apiVersion: v1
kind: Service
metadata:
   name: prod-web
spec:
  type: LoadBalancer
  ports:
    - name: http2
      port: 80
      targetPort: 80 # Dynamically allocated port

Each gateway could return a unique IP in the status.addresses block.

arkodg commented 1 year ago

@dprotaso above case helps merge K8s deployments (improving efficiency), there is also a use case where some users might want to further merge gateways within a single K8s service, reducing the number of cloud LBs needed (ratio of cloud LB to gateways is 1: N)

dprotaso commented 1 year ago

there is also a use case where some users might want to further merge gateways within a single K8s service

@arkodg Yeah we would benefit from that as well. Curious if you have any details/issues to link about user expectations around this feature?

arkodg commented 1 year ago

@dprotaso end users prefer lesser Cloud LBs / Services to reduce the management / operational complexity for managing the IPs, security, observailibity of these components

I think your proposed solution of introducing scope does solve the problem of merging Services if multiple Gateways have the same value of scope set.

Now if scope is of type string it is allowing the platform admins to create hints for the underlying implementation to group & isolate Gateway resources into domains in the infrastructure. Isnt that isolation handled at the GatewayClass level today :) ?

dprotaso commented 1 year ago

I think my example of scope in https://github.com/kubernetes-sigs/gateway-api/issues/1713#issuecomment-1439184112 shows that the same infrastructure could serve different scoped Gateways.

Now if scope is of type string it is allowing the platform admins to create hints for the underlying implementation to group & isolate Gateway resources into domains in the infrastructure. Isnt that isolation handled at the GatewayClass level today :) ?

I think a type string makes sense cause then you can have standard scopes and vendored-prefixed ones.

it is allowing the platform admins to create hints for the underlying implementation to group & isolate Gateway resources into domains in the infrastructure. Isnt that isolation handled at the GatewayClass level today :) ?

I think there are two aspects to this issue

  1. When creating a Gateway how do you hint it should be merged with another (this helps scale Gateway/listener management)
  2. How do you map Gateways onto infrastructure.

I can see 1. being a distinct property on a Gateway - In my example I was using the property group so Gateways would declare themselves part of a set.

For 2. I think this would need to be part of the GatewayClass as it's implementation specific.

arkodg commented 1 year ago

yeah thinking more on this, I'm a +1 on either group or scope to allow users to bucket Gateways into domains. This field can be made more powerful than a Gateway annotation if the implementation / validation webhook in Gateway API ensures all Gateway listeners within a specific group are compatible

youngnick commented 1 year ago

I'm very -1 on some sort of group or scope field, that's explicitly what GatewayClass is supposed to be doing. It's intended that an implementation be able to reconcile multiple GatewayClasses with slightly different settings, that's why we used the controllerName field rather than explicitly specifying a single GatewayClass. Any time you think "we need a way to bucket these Gateways", we should be thinking of a GatewayClass thing first.

I think that @howardjohn's ideas in GEP-1762 (#1757) about using Address seem like a pretty viable way to handle both the manually-managed infrastructure use case, and the "how to specify a set of merged Gateways" use case. I had originally anticipated that "merge these Gateways" would be specified as a GatewayClass setting, but it seems like there may be too much shenanigans possible if we do it that way. I'm going to think more about it though.

dprotaso commented 1 year ago

Related - @youngnick take a look at the discussion in the GEP - https://github.com/kubernetes-sigs/gateway-api/pull/1653

I have yet to update GEP contents will do that later tonight based on the discussion

candita commented 1 year ago

@dprotaso is this a separate GEP from #1653? Why so much discussion here before someone is assigned?

youngnick commented 1 year ago

It is a separate GEP, yes. But I think the eventual solution will be a combination of pieces from #1651 (PR #1653), and #1762 (PR #1757), rather than its own GEP.

dprotaso commented 1 year ago

@dprotaso is this a separate GEP from https://github.com/kubernetes-sigs/gateway-api/pull/1653?

Yup - this GEP was to help solve the issue of managing/sharding a massive list of spec.listeners - see the prior discussions here - https://github.com/kubernetes-sigs/gateway-api/discussions/1248, https://github.com/kubernetes-sigs/gateway-api/discussions/1246

I think there are two ways to look at this problem - and they might warrant two GEPs

  1. As an end-user how do I logically merge different Gateways
  2. As an end-user how do I map Gateways to infrastructure

This GEP is taclking the first and I think #1762 is tackling the second (I have yet to go through it)

k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

youngnick commented 8 months ago

/remove-lifecycle stale

/lifecycle frozen

dprotaso commented 8 months ago

FYI - the GEP PR has been updated to support cross-namespace. From my comment https://github.com/kubernetes-sigs/gateway-api/pull/1863#issuecomment-1927794180

I now have a use case that would warrant having a parent in a different namespace. Imagine the cluster operator persona is split into two parts - the cluster operator and the application operator.

The cluster operator provisions a 'parent' Gateway in some system namespace. They're responsible for installing/upgrading the underlying package that implements the Gateway API. They want to ensure costs are kept low so they want their Gateway instance to be shared with app operators. Thus there's only a single LB for the cluster.

The application operator owns a namespace and sets up the namespace to be used by app developers. The app operator will want to define a child Gateway with the right listeners and attaches the 'child' Gateway to 'parent' Gateway mentioned above.

The application developer creates HTTPRoute and attaches them to the child gateway in their namespace.

Some implementations have expressed interest:

arkodg commented 8 months ago

thanks for restarting work on this GEP @dprotaso. This has been a popular ask for users in Envoy Gateway, and we finally ended up adding a knob (in our previous v0.6.0 release ) at the GatewayClass level (via the parametersRef custom resource called EnvoyProxy) called MergeGateways . This is primarily because there is a 1:1 mapping between GatewayClass and data plane infrastructure in Envoy Gateway today as all infrastructure specific fields like resources and replicas can only be configured via the custom resource thats attached to the GatewayClass

shaneutt commented 6 months ago

We do need this, especially as several implementations are already doing "something" in this regard.

/triage accepted

However we are targeting early April for v1.1.0, and in review we don't expect that this will be ready or have had enough soak time for that timeline, so we're looking at v1.2.0 for this currently.

dprotaso commented 4 months ago

/assign @dprotaso