Open guydc opened 5 months ago
ideally prefer if this was in upstream relates to https://github.com/kubernetes-sigs/gateway-api/discussions/2304 & https://github.com/kubernetes-sigs/gateway-api/issues/1802
if we add into field into non xRoute backendRefs, we may need to pluraize existing instantiations to backendRefs
if we add into field into non xRoute backendRefs, we may need to pluralize existing instantiations to
backendRefs
Yes, also ext-proc. I don't object, and this is in-line with gateway-api. Do you think that we should consider a different intermediate approach?
I'd vote to pluraize the backendRefs for non xRoute asap while we are in alpha, which can be a good place to test the priority
/ standby use case
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
This could be used to solve https://github.com/envoyproxy/gateway/issues/1909, though it may require further refinement. Specifically, the Envoy Gateway should allocate distinct priority levels to the same upstream endpoint, varying by the Envoy Pod’s location in different availability zones. For instance, for an identical endpoint in zone A, the Envoy Pod situated in zone A should be assigned priority 0, whereas Envoy Pods in zones B and C should be assigned priority 1.
Do you think Envoy Gateway will handle this? Or leave it to the user?
Do you think Envoy Gateway will handle this? Or leave it to the user?
We briefly discussed this scenario in the community meeting. It seems like this will require pushing different config to different envoy proxies based on zone information. For the time being, we'll focus on supporting the simpler use case described above.
As mentioned in https://github.com/envoyproxy/gateway/issues/1909#issuecomment-2126782250, Envoy in different zones may use same backend endpoint as different priority to reduce cross-zone cost.
It's a required feature for production adoption. Does community have roadmap on this feature ?
https://github.com/envoyproxy/gateway/issues/1909#issuecomment-2128073855 says:
thanks for outlining the steps @aoledk ! we currently have #3055 open to get explicit priority per backendRef and program that into the xds cluster resource.
In the future, we can use this issue to make sure we track the auto priority work, the field in k8s
preferClose
could be the knob for users to say they want to opt in to this feature
@arkodg One backendRef (mainly Service) will refer to endpoints from multiple zones, priority should not be applied on backendRef, unless we use selector to collect endpoints in same zone into single backendRef.
Besides, same endpoint will have different priority in Envoy.
So maybe we can't assign a fixed priority for backendRef, codes will be needed to assign right priority for specific Envoy.
there are two use cases for priority
zones
represent physical locations (done for HA purposes at the node level) so location aware routing reduces overall latency as well as cost (cross zone or region can be billed by cloud provider)@arkodg Can we add a new xDS Hook API, like PostEndpointModify(ClusterLoadAssignment, Node)
which allow extension server to assign endpoint priority for individual envoy. Leave these tasks to EG adopters themselves.
hey @aoledk thanks for driving this work, before we implement this, we as a community need to align on what the feature is, is it
Auto Zone Aware Routing where a user sets a global field to enable this, and envoy proxy prioritizes endpoints in the same zone based on its zone ? Open Questions:
- is this granular enough ? there are only 2 levels here -
local
or not local- Priorites for BackendRefs - Prioritizes one backendRef over another backendRef within the same HTTP Rule
@arkodg I've quoted your https://github.com/envoyproxy/gateway/issues/3538#issuecomment-2153387620 here to continue discussion about priority. I'll freeze #3542 until we reach agreement.
Assuming two cases demonstrated in image below.
Case A: priority isn't in gateway-api, EG will generate #1
, #2
LocalityLbEndpoints.
Case B: priority is in gateway-api, EG will generate #3
, #4
, #5
, #6
LocalityLbEndpoints.
Code in xds/cache
will be used to modify priority for individual envoy based on envoy's running topology.
We can have 2 options to enable topology aware routing ( NOT envoy zone aware routing ) based on priority.
Add knob into BackendTrafficPolicy to enable topology aware routing for affected BackendRefs.
If we enable in Case A, #1
, #2
LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache
like this:
#1
P=0#2
P=1If we enable in Case B, #3
, #4
, #5
, #6
LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache
like this:
#3
P=0#4
P=1#5
P=2#6
P=3But this option has several challenges.
The new knob is accessible in xds/translator
, but xds/translator
only pass xds
to xds/server
, then to xds/cache
, a new communication channel need to be setup to pass this particular knob to xds/cache
. It will add additional dependency between those modules.
Should we allow user to enable topology aware routing for BackendRefs specified with priority ?
If yes, that means xds/cache
will modify LocalityLbEndpoints's priority specified by user.
Add knob into EnvoyGateway to enable topology aware routing for all BackendRefs.
If enabled, #1
, #2
, #3
, #4
, #5
, #6
LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache
like this:
#1
P=0#2
P=1#3
P=0#4
P=1#5
P=2#6
P=3Same to Option 1's Challenge 2.
is this granular enough ? there are only 2 levels here -
local
or not local
LocalityLbEndpoints can be set with region
/ zone
/ sub_zone
:
zone
can be retrieved from EndpointSlice directly.region
can be retrieved like this: use nodeName in EndpointSlice to query Node, retrieve region from Node topology.kubernetes.io/region
label.Envoy can be set with region
/ zone
/ sub_zone
.
region
and zone
can be retrieved like this: use nodeName via Pod downward API to query Node, retrieve region and zone from Node topology.kubernetes.io/region
and topology.kubernetes.io/zone
labels.sub_zone
can't be retrieved.So region
can be used as level bigger than zone
technically, we can discuss whether we need to use region
on topology aware routing.
thanks for the detailed analysis @aoledk ! 1) looks like the "priority is in gateway api backendRef" case is simpler and straightforward and doesnt need zone based endpoint aggregation
Auto Zone Aware Routing 2) if we okay with 2 level priorities then envoy's inherent zone aware routing feature is the way to go Pros
region
/ zone
/ sub zone
, we'll have to key EDS o/p in the control plane based on envoy locality and start maintaining this extra dimension
ProsAs next steps, here is what I propose
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
thanks for the detailed analysis @aoledk !
- looks like the "priority is in gateway api backendRef" case is simpler and straightforward and doesnt need zone based endpoint aggregation
Auto Zone Aware Routing 2) if we okay with 2 level priorities then envoy's inherent zone aware routing feature is the way to go Pros
- easier to build / less complex to maintain and debug Cons
- only 2 levels of priority - local to envoy or not
- if we want multi level priority based on
region
/zone
/sub zone
, we'll have to key EDS o/p in the control plane based on envoy locality and start maintaining this extra dimension Pros
- more levels of priority Cons
- need to maintain per locality endpoint cache
As next steps, here is what I propose
- Decide on which feature we want to prioritize first or both ? This is a great roadmap item for v1.2 (Oct 2024)
- For Zone Aware Routing gather more feedback on how many levels of priority / locality we want to support
1 sounds straight forward and can bring value, we can extend this struct with an endpoint priority field: https://github.com/envoyproxy/gateway/blob/a43cc6c96ac147ce58db9714283e77d237c3d1e3/api/v1alpha1/shared_types.go#L471
Similar to how gwapiv1.BackendObjectReference has "weight". Then, translate this value to an envoy endpoint priority: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/endpoint/v3/endpoint_components.proto#envoy-v3-api-msg-config-endpoint-v3-localitylbendpoints
We can also have an E2E test for this, to demonstrate that when a high-prio backend is down, traffic is steered to the low prio backend.
This will only work for "technical" backends at the moment, like ext-auth/proc/observability sinks ...
If the direction sounds good, I can get started with adding it?
hey @alexwo thanks for lookiing into this !
im hoping we can weigh in pros/cons of an explicit priority
field vs adding another field like backupBackendRef
note for @aoledk who is looking into Zone Aware Routing, priority routing is mutually exclusive to it, and EG should be able to do zone aware routing for P=0 Hosts
GCP expresses failover using a bool failover
field inside backend definition
https://cloud.google.com/compute/docs/reference/rest/v1/backendServices
GCP expresses failover using a
bool failover
field inside backend definition https://cloud.google.com/compute/docs/reference/rest/v1/backendServices
Sure, we can simplify and offer the same level of flexibility as done in GCP. So that via the API layer we expose just a boolean and manage as priority at the translation layer, allowing for a priority of: primary / backup (0,1).
nginx has a backup
keyword It uses https://nginx.org/en/docs/http/ngx_http_upstream_module.html
I'm a +1 for the boolean option
also @alexwo another design decision is around when does an endpoint get removed from the load balancing pool and when does it get re added ? is the user responsible for setting up healthCheck
(active or passive) which results in endpoints being removed/readded from the pool ?
+1 for a boolean as well.
when does an endpoint get removed from the load balancing pool and when does it get re added
- I imagine that K8s service endpoints would be added/removed automatically based on K8s pod probes and status. Users can additionally opt-in and define active/passive healthchecks using backend traffic policy. This is an envoy recommendation for EDS: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/service_discovery#on-eventually-consistent-service-discovery
- For non-service backends (Unix Socket, FQDN, ...) - users must to define active/passive healthchecks in order for priorities to actually have an impact.
Sounds good @arkodg / @guydc,
I have updated: https://github.com/envoyproxy/gateway/pull/4033 to reflect the “designated failover backend” boolean approach.
Description: Envoy supports routing priorities:
This allows users to configure a failover mechanism from primary to secondary endpoints.
Envoy Gateway currently configures a default priority for all endpoints: https://github.com/envoyproxy/gateway/blob/f9409e477e866a012757bb9eb4cbc9ac35c7f3a7/internal/xds/translator/cluster.go#L408
LB Priority can be implemented by extending existing Gateway-API resource:
BackendRefs
may support a priority field, similar to the existing weight field.BackendLBPolicy
resource may be extended to support additional LB considerations, such as priorty.Other options:
BackendTrafficPolicy
can be used to assign priorities. For example, the policy will hold a map of BackendRefs to priorities. Or, the policy can attach directly to Service resources.