dprotaso opened 1 year ago
+1 to this proposal.
This has been asked by users of Envoy Gateway to make their runtime deployments more efficient (currently a Gateway resource maps to a unique Data Plane (Envoy Proxy) deployment).
The spec says compatible listeners can be collapsed, but since Gateways can be applied dynamically, this is bound to cause traffic disruption in the data plane if the implementation collapses or separates listener implementations at runtime. If there were a knob that sets constraints on the L3/L4 fields of the Listener, such as ensuring a port must be unique across all Listeners within all Gateways within a GatewayClass, statically collapsing listeners in implementations should be doable.
Below is the current spec
https://github.com/kubernetes-sigs/gateway-api/blob/0177808c57566502f8fb697abd6d42c06008ff6a/apis/v1beta1/gateway_types.go#L72
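The constraint proposed above can be sketched in a few lines. This is illustrative only, not an actual Gateway API feature: before collapsing Gateways in a GatewayClass into one data plane, check that each port is claimed by at most one Listener across all Gateways in the class. The `Gateway`/`Listener` classes below are minimal stand-ins for the real API types.

```python
from dataclasses import dataclass, field

@dataclass
class Listener:
    name: str
    port: int
    protocol: str

@dataclass
class Gateway:
    name: str
    listeners: list = field(default_factory=list)

def find_port_conflicts(gateways):
    """Return {port: ["gateway/listener", ...]} for ports claimed more than once."""
    claims = {}
    for gw in gateways:
        for listener in gw.listeners:
            claims.setdefault(listener.port, []).append(f"{gw.name}/{listener.name}")
    # Only ports with multiple claimants block a static collapse.
    return {port: names for port, names in claims.items() if len(names) > 1}
```

An empty result would mean the Gateways can be collapsed without port conflicts; a non-empty result names the offending listeners.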
FWIW how we do this in Istio is https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#manual-deployment. Basically the same way to apply 1 Gateway to a manually deployed Deployment/Service instead of automatically deployed one can be used to apply N Gateways and merge them.
If they want automated deployment and merging, they would make one "root" Gateway and attach all others. Actually that is another use case for https://github.com/kubernetes-sigs/gateway-api/pull/1596
@howardjohn conformance tests don't pass when the implementation merges the listeners; https://github.com/envoyproxy/gateway/issues/349 has more context
I don't think something implicitly deciding to collapse or not is a great idea which is the root cause of that. In the Istio way users have to explicitly opt into merging, which the conformance tests don't do, so we don't have an issue.
Yeah, I'm hoping the intent to merge is defined by Gateway API, else implementations will add their own.
@shaneutt curious, what does the label priority/awaiting-more-evidence mean, and what's the lifecycle of issues in this subproject?
@dprotaso we're in the midst of trying to better organize things based on priority. Given that 3 people are interested in this one, priority/awaiting-more-evidence was not really the right fit, so I've changed it to priority/important-longterm. Please note that while these labels can be helpful for us to organize work and releases, they aren't necessarily indicative of the actual order in which things will be done. It's reasonable for someone with keen interest to champion an issue like this and take it straight into in-progress if they feel strongly about it.
@arkodg
The spec says compatible listeners can be collapsed, but since Gateways can be applied dynamically it is bound to cause traffic disruption in the data plane if the implementation collapses or separates listener implementations in runtime.
Knative uses multiple Kubernetes Services to resolve port conflicts.
Our custom Istio installation has a single K8s Deployment serving both public and internal traffic. This is because we create one K8s Service with type=LoadBalancer and another with type=ClusterIP, both pointing to the same Deployment but targeting different ports.
Thus, if the Gateway resource had a scope property to define whether a gateway is public (to the internet) or local to the cluster, then I could foresee merging of
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-internal
spec:
  gatewayClassName: acme-lb
  group: knative
  scope: Internal
  listeners:
  - protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-web
spec:
  gatewayClassName: acme-lb
  group: knative
  scope: Public
  listeners:
  - protocol: HTTP
    port: 80
```
to be done by creating the following K8s Services
```yaml
apiVersion: v1
kind: Service
metadata:
  name: prod-internal
spec:
  type: ClusterIP
  ports:
  - name: http2
    port: 80
    targetPort: 8081 # Dynamically allocated port
---
apiVersion: v1
kind: Service
metadata:
  name: prod-web
spec:
  type: LoadBalancer
  ports:
  - name: http2
    port: 80
    targetPort: 80
```
Each gateway could return a unique IP in the status.addresses block.
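To make that last point concrete, here is an illustrative-only sketch of the status each Gateway might then report, with prod-internal surfacing the ClusterIP Service address and prod-web the cloud load balancer address (both IPs below are made up):

```yaml
# Hypothetical status blocks for the two Gateways above
status:
  addresses:
  - type: IPAddress
    value: 10.96.14.20     # prod-internal (ClusterIP Service)
---
status:
  addresses:
  - type: IPAddress
    value: 203.0.113.7     # prod-web (LoadBalancer Service)
```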
@dprotaso the above case helps merge K8s deployments (improving efficiency); there is also a use case where some users might want to further merge gateways within a single K8s Service, reducing the number of cloud LBs needed (the ratio of cloud LBs to gateways becomes 1:N)
there is also a use case where some users might want to further merge gateways within a single K8s service
@arkodg Yeah we would benefit from that as well. Curious if you have any details/issues to link about user expectations around this feature?
@dprotaso end users prefer fewer cloud LBs / Services to reduce the management/operational complexity of handling the IPs, security, and observability of these components.
I think your proposed solution of introducing scope does solve the problem of merging Services if multiple Gateways have the same value of scope set.
Now if scope is of type string, it allows platform admins to create hints for the underlying implementation to group and isolate Gateway resources into domains in the infrastructure. Isn't that isolation handled at the GatewayClass level today :) ?
I think my example of scope in https://github.com/kubernetes-sigs/gateway-api/issues/1713#issuecomment-1439184112 shows that the same infrastructure could serve differently scoped Gateways.
Now if scope is of type string, it allows platform admins to create hints for the underlying implementation to group and isolate Gateway resources into domains in the infrastructure. Isn't that isolation handled at the GatewayClass level today :) ?
I think a type of string makes sense because then you can have standard scopes and vendor-prefixed ones.
it allows platform admins to create hints for the underlying implementation to group and isolate Gateway resources into domains in the infrastructure. Isn't that isolation handled at the GatewayClass level today :) ?
I think there are two aspects to this issue.
I can see 1. being a distinct property on a Gateway; in my example I was using the property group so Gateways would declare themselves part of a set.
For 2., I think this would need to be part of the GatewayClass, as it's implementation specific.
Yeah, thinking more on this, I'm a +1 on either group or scope to allow users to bucket Gateways into domains.
This field can be made more powerful than a Gateway annotation if the implementation / validation webhook in Gateway API ensures all Gateway listeners within a specific group are compatible.
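A sketch of the group-wide compatibility check mentioned above: treat all listeners of the Gateways in one group as a single set and require every (port, protocol, hostname) combination to be distinct, mirroring the per-Gateway distinctness rule in the spec. Plain tuples stand in for real Listener objects; this is illustrative, not the actual validation webhook.

```python
def listeners_compatible(listeners):
    """listeners: iterable of (port, protocol, hostname) tuples.

    Returns True when every combination is distinct, i.e. the group of
    Gateways could be collapsed without listener conflicts."""
    seen = set()
    for combo in listeners:
        if combo in seen:
            return False
        seen.add(combo)
    return True
```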
I'm very -1 on some sort of group or scope field; that's explicitly what GatewayClass is supposed to be doing. It's intended that an implementation be able to reconcile multiple GatewayClasses with slightly different settings; that's why we used the controllerName field rather than explicitly specifying a single GatewayClass. Any time you think "we need a way to bucket these Gateways", we should be thinking of a GatewayClass thing first.
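The GatewayClass-first approach argued above could look roughly like this: one controller (same controllerName) reconciles two classes, and a parametersRef could carry the merge setting. All names and the AcmeConfig kind are made up for illustration.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: GatewayClass
metadata:
  name: acme-lb-shared        # Gateways of this class get merged
spec:
  controllerName: acme.io/gateway-controller
  parametersRef:              # hypothetical vendor config carrying the merge knob
    group: acme.io
    kind: AcmeConfig
    name: merge-gateways
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: GatewayClass
metadata:
  name: acme-lb-dedicated     # one data plane per Gateway
spec:
  controllerName: acme.io/gateway-controller
```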
I think that @howardjohn's ideas in GEP-1762 (#1757) about using Address seem like a pretty viable way to handle both the manually-managed infrastructure use case, and the "how to specify a set of merged Gateways" use case. I had originally anticipated that "merge these Gateways" would be specified as a GatewayClass setting, but it seems like there may be too much shenanigans possible if we do it that way. I'm going to think more about it though.
Related - @youngnick take a look at the discussion in the GEP - https://github.com/kubernetes-sigs/gateway-api/pull/1653
I have yet to update the GEP contents; will do that later tonight based on the discussion.
@dprotaso is this a separate GEP from #1653? Why so much discussion here before someone is assigned?
It is a separate GEP, yes. But I think the eventual solution will be a combination of pieces from #1651 (PR #1653), and #1762 (PR #1757), rather than its own GEP.
@dprotaso is this a separate GEP from https://github.com/kubernetes-sigs/gateway-api/pull/1653?
Yup - this GEP was to help solve the issue of managing/sharding a massive list of spec.listeners - see the prior discussions here: https://github.com/kubernetes-sigs/gateway-api/discussions/1248, https://github.com/kubernetes-sigs/gateway-api/discussions/1246
I think there are two ways to look at this problem - and they might warrant two GEPs.
This GEP is tackling the first, and I think #1762 is tackling the second (I have yet to go through it).
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
FYI - the GEP PR has been updated to support cross-namespace. From my comment https://github.com/kubernetes-sigs/gateway-api/pull/1863#issuecomment-1927794180
I now have a use case that would warrant having a parent in a different namespace. Imagine the cluster operator persona is split into two parts - the cluster operator and the application operator.
The cluster operator provisions a 'parent' Gateway in some system namespace. They're responsible for installing/upgrading the underlying package that implements the Gateway API. They want to ensure costs are kept low so they want their Gateway instance to be shared with app operators. Thus there's only a single LB for the cluster.
The application operator owns a namespace and sets it up to be used by app developers. The app operator will want to define a child Gateway with the right listeners and attach the 'child' Gateway to the 'parent' Gateway mentioned above.
The application developer creates HTTPRoutes and attaches them to the child Gateway in their namespace.
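The parent/child split described above might look roughly like the sketch below. The attachment mechanism was still being designed in the GEP PR, so the commented attachTo stanza is purely hypothetical, not a real Gateway API field; the class and namespace names are also made up.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: parent-gateway
  namespace: infra-system        # owned by the cluster operator
spec:
  gatewayClassName: acme-lb      # hypothetical class name
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: app-gateway
  namespace: team-a              # owned by the application operator
spec:
  gatewayClassName: acme-lb
  # attachTo:                    # hypothetical: reference the parent Gateway
  #   name: parent-gateway
  #   namespace: infra-system
  listeners:
  - name: app
    protocol: HTTPS
    port: 8443
```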
Some implementations have expressed interest:
thanks for restarting work on this GEP @dprotaso. This has been a popular ask from users of Envoy Gateway, and we finally ended up adding a knob (in our previous v0.6.0 release) at the GatewayClass level (via the parametersRef custom resource called EnvoyProxy) called MergeGateways. This is primarily because there is a 1:1 mapping between GatewayClass and data plane infrastructure in Envoy Gateway today, as all infrastructure-specific fields like resources and replicas can only be configured via the custom resource that's attached to the GatewayClass.
We do need this, especially as several implementations are already doing "something" in this regard.
/triage accepted
However, we are targeting early April for v1.1.0, and in review we don't expect that this will be ready or have had enough soak time for that timeline, so we're looking at v1.2.0 for this currently.
/assign @dprotaso
What would you like to be added:
Support for merging Gateways should have a standard & documented mechanic.
Why this is needed:
Knative generates on-demand per-service certificates using HTTP-01 challenges. There can be O(1000) Knative Services in the cluster, which means we have O(1000) distinct certificates. The Gateway resource is a contention point since it is the only place to attach listeners with certificates to gateways.
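One way around that contention point, assuming merging were standardized, would be to shard the O(1000) certificate listeners across N child Gateways by hashing the hostname, so no single Gateway object accumulates every listener. `shard_for` below is a hypothetical helper for illustration, not Knative code.

```python
import hashlib

def shard_for(hostname, num_shards):
    """Deterministically map a certificate's hostname to a Gateway shard index.

    Hashing (rather than round-robin) keeps the assignment stable across
    controller restarts, so listeners don't migrate between Gateways."""
    digest = hashlib.sha256(hostname.encode()).hexdigest()
    return int(digest, 16) % num_shards
```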
The spec currently has language to indicate implementations MAY merge Gateway resources, but the mechanic isn't defined: https://github.com/kubernetes-sigs/gateway-api/blob/541e9fc2b3c2f62915cb58dc0ee5e43e4096b3e2/apis/v1beta1/gateway_types.go#L76-L78
Secondly, given similar use-cases/needs for control planes to manage listeners, it might make sense that merging Gateways is part of extended conformance.