Closed sunjayBhatia closed 1 year ago
@sunjayBhatia thanks for sharing all the research on this topic! The EG community too needs to decide on which extension point to use for supporting advanced API Gateway features. I'm sharing some options here, so the community can voice their preference.
1. Use HTTPRoute Filters for all API Gateway features
kind: RateLimit
apiVersion: gateway.envoyproxy.io/v1alpha1
metadata:
  name: rl-example
spec:
---
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    .....
  hostnames:
  - "example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - kind: Service
      name: example
      port: 8080
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: RateLimit
        name: rl-example
2. Use PolicyAttachment for all API Gateway features
kind: RateLimit
apiVersion: gateway.envoyproxy.io
metadata:
  name: rl-policy
  namespace: demo
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: example
  override:
    ....
  default:
    ...
@skriss @danehans @youngnick @AliceProxy @LukeShu @Xunzhuo can you please share your thoughts and preferences on this issue? Please also consider sharing alternative options.
Specifically targeting #1, this comment is relevant: https://github.com/envoyproxy/gateway/pull/529#discussion_r1012164693
Not all configuration can be translated to Envoy config from per-Route configuration, due to the nature of particular Envoy filters and how Envoy Gateway's xDS translation logic may set up Listeners/Virtualhosts/Routes.
In addition to thinking about how things will be implemented, this decision should consider the UX we intend, which involves what resources/personas/scopes we want to target:
Prefer 2 because
- the notion of default and override is powerful, addresses persona hierarchy
- [...] Gateway but Filters cannot be applied to the Gateway
Imho 3 adds cognitive load on the user and 1 cannot be applied to Gateway as shared above
This has been discussed in the authn policy document as well in this comment. As a result of a brief community call we agreed on something like 2. We might want to add some sort of support with PolicyAttachment to select routes in an HTTPRoute though.
Also note that Filters in HTTPRoute do not directly translate to Envoy HTTP filters, as they are configured at a different scope: Envoy HTTP filters are created per HCM filter chain, not all of them support per-route configuration, and the level of support varies.
@arkodg I feel like https://github.com/envoyproxy/gateway/issues/675#issuecomment-1301376173 is a repeat of https://github.com/envoyproxy/gateway/issues/677. Can you please clarify?
the notion of default and override is powerful, addresses persona hierarchy
As stated upstream, this can become very complicated which leads to poor UX.
From the draft Gateway API patterns: Policy Attachment and Metaresources doc authored by @youngnick :
Policy Attachment: It’s all about the defaults and overrides
The current Policy Attachment docs list three primary components of the Policy Attachment pattern:
- A standardized means of attaching policy to resources.
- Support for configuring both default and override values within policy resources.
- A hierarchy to illustrate how default and override values should interact.
I think that we should focus “Policy Attachment” around the latter two bullet points
If Policy Attachment should not be used as a standardized means of attaching a policy to resources, then a different mechanism should be used. Let's refer to #529 as a concrete example. A dev creates an HTTPRoute to route traffic to two different backend services. The dev now wishes to attach different JWT AuthN policies to the backend services. A resource should be used to attach these policies. First, the dev creates the AuthN resource, for example:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: AuthN
metadata:
  name: svc1-jwt-auth
spec:
  authType: JWT
  jwtConfig:
    <JWT config for service 1>
The dev repeats the above for service 2 and then attaches the AuthN resource to the HTTPRoute as an implementation-specific filter:
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: pol-attach-example
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /svc1
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: AuthN
        name: svc1-jwt-auth
    backendRefs:
    - name: svc1
      port: 80
  - matches:
    - path:
        type: PathPrefix
        value: /svc2
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: AuthN
        name: svc2-jwt-auth
    backendRefs:
    - name: svc2
      port: 80
Traffic is now authN'd to the two backend services. However, the SecOps team from example corp has more stringent JWT authN requirements. The cluster admin is now required to enforce minimum JWT authN settings and creates an AuthnPolicy that overrides the settings for all AuthN kinds. For example:
apiVersion: gateway.envoyproxy.io
kind: AuthnPolicy
metadata:
  name: svc1-jwt-auth
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: AuthN
  override:
    jwtConfig:
      <INSERT_OVERRIDES>
The JWT authN config for both backend services is now overridden.
Although this approach requires two kinds, AuthN and AuthnPolicy, it does provide support for multiple personas inherent in Gateway API. RBAC is properly configured so that Devs are free to configure AuthN kinds for traffic routing and infra admins can configure AuthnPolicy to override traffic routing.
@danehans thanks for raising this case about using two separate CRDs here - Authn and AuthnPolicy. Here are some follow up questions
- If the App Dev had the RBAC to apply AuthnPolicy to the HTTPRoute, shouldn't we be able to achieve the same outcome with 1 CRD type AuthnPolicy?
relevant discussion topic for this as well, given some of the documentation in the Policy Attachment reference: https://github.com/kubernetes-sigs/gateway-api/discussions/1503
currently Filters and Policies aren't "supposed" to overlap, but as you point out and I think others have realized as well, this pattern seems useful
The poor UX issue will be addressed upstream, so I think that option 2 sounds about right here.
If the App Dev had the RBAC to apply AuthnPolicy to the HTTPRoute, shouldn't we be able to achieve the same outcome with 1 CRD type AuthnPolicy
Providing App Devs the ability to CRUD AuthNPolicy would defeat the purpose of separating concerns. App Devs should not be allowed to set or override default policies, e.g. AuthNPolicy. If we need the ability for App Devs to define authN for API endpoints and cluster admins to override/default authN, then separate resources are required.
assigning myself this issue, so that I can help moderate this discussion to a decision.
Raising some further questions around users and features to help drive the debate
Assumptions
We are deciding extensions for these features
Questions around Capability
Do we want these features to have overrides by other personas?
Questions around Experience
Okay, I think I wasn't clear enough in my answer before, so let me clarify.
When talking about these features, we have a few extension points that can be used. For Auth and Ratelimiting, it conceptually makes sense that those be a filter that modifies an HTTPRoute match before it's sent on to the backends. That allows for very specific targeting of routes, headers, methods, etc. for these filters.
But, the downside of using a filter in that way is that then, the filter currently has to be specified for every place where it needs to be used. So if the cluster admin/Envoy Gateway owner wants to specify that all routes must have a global ratelimit, or a certain type of auth, then not only can they not do that, it's on the owner of each HTTPRoute to actually remember to.
So, what I'm going to be proposing in upstream is that there should be a class of Policy Attachment objects that modify the way these custom filters work, either defaulting or overriding the settings.
So, in this way, as a Gateway owner, I could attach a RatelimitingPolicy object to the Gateway, and set the default to be to use a specific ratelimit policy extensionRef CRD, just as if every HTTPRoute rule attached to the Gateway specified the policy. This would mean that HTTPRoute owners would be able to override the default by specifying their own extensionRef CRD.
Or, as a Gateway owner, I could attach a JWTAuthPolicy to the Gateway that sets an override to use a specific JWTAuth CRD extensionRef, which an HTTPRoute owner cannot override by setting their own auth settings. (Would you actually want to do this for a large Gateway? Probably not, but that's what it's for).
So, that's how we could combine a custom filter and Policy objects to give both flexibility and control to users. Note that both of these examples require us to build a custom Filter first, then the Policy object to allow it to be overridden.
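As a rough illustration of the default/override behaviour described above, here is a minimal Python sketch that resolves which extensionRef ends up applying to a single HTTPRoute rule. `ExtensionRef`, `GatewayPolicy` and `effective_ref` are hypothetical names made up for this example (they are not Gateway API or Envoy Gateway types), and the precedence shown is only one reading of the proposal: a Gateway override wins, then the route's own filter, then the Gateway default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExtensionRef:
    """Reference to an implementation-specific filter resource (hypothetical model)."""
    group: str
    kind: str
    name: str


@dataclass(frozen=True)
class GatewayPolicy:
    """Hypothetical Gateway-attached policy carrying a default and/or an override."""
    default: Optional[ExtensionRef] = None
    override: Optional[ExtensionRef] = None


def effective_ref(policy: Optional[GatewayPolicy],
                  route_ref: Optional[ExtensionRef]) -> Optional[ExtensionRef]:
    """Pick the extensionRef that applies to one HTTPRoute rule."""
    # An override set by the Gateway owner cannot be replaced by the route owner.
    if policy is not None and policy.override is not None:
        return policy.override
    # Otherwise a filter set on the route itself takes precedence over the default.
    if route_ref is not None:
        return route_ref
    # Fall back to the Gateway default, if a policy is attached at all.
    return policy.default if policy is not None else None
```

With this model, a route that sets its own filter replaces a Gateway default but not a Gateway override, which matches the RatelimitingPolicy and JWTAuthPolicy cases sketched in the comment.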
Alternatively, we could _not_ use the extensionRef filters, and only define these functions in terms of Policy objects, and instead of having the ability to define a filter inline in the HTTPRoute YAML, just have these settings attached via a Policy.
Both of the Policy approaches do have the UX limitation that it's very difficult to know if a Policy is augmenting the settings at all - part of the Policy rewrite I'm working on upstream is about trying to design a status section that should make this more clear. (It's a bit tricky though).
So, overall, I'm for Option 3 - which is "use both where necessary", but that's not very helpful right now.
I think that in terms of experimenting and trying things out, custom filters for Ratelimiting and Auth are both less design work and less conceptual overhead for users to understand, at the cost of increased verbosity. I'd recommend using those for now, which actually puts me partially in the Option 1 camp, for now. I think that as I get these changes designed and pushed into upstream, this whole situation should hopefully become clearer.
@youngnick are there any plans on also supporting extensionRef in the Gateway API? Else, from a capability perspective, EG will be unable to provide support for extended features at a connection / listener level with just extensionRef filters at the HTTPRoute level.
But, the downside of using a filter in that way is that then, the filter currently has to be specified for every place where it needs to be used. So if the cluster admin/Envoy Gateway owner wants to specify that all routes must have a global ratelimit, or a certain type of auth, then not only can they not do that, it's on the owner of each HTTPRoute to actually remember to.
Thanks for clarifying. From reviewing your Policy Attachment doc, this was my understanding and why #529 is designed as an implementation-specific filter. I anticipate EG supporting an AuthenticationPolicy resource in the future that would allow admins to override and/or set default settings defined by an Authentication resource.
Both of the Policy approaches do have the UX limitation that it's very difficult to know if a Policy is augmenting the settings at all - part of the Policy rewrite I'm working on upstream is about trying to design a status section that should make this more clear. (It's a bit tricky though).
This is one of my concerns with Policy Attachment, and it was highlighted by @sunjayBhatia in https://github.com/kubernetes-sigs/gateway-api/discussions/1503.
here's my current take based on three pillars
- [...] Gateway as well as HTTPRoute, allowing attributes to be applied at the Layer 4-6 (connection) level as well as the Layer 7 (request) level
- [...] Gateway
- [...] Gateway and can also be applied by the application developer on the HTTPRoute
- [...] HTTPRoute
- [...] HTTPRoute, it is applied to all rules within that route, which might not be the intent. The workaround is to create multiple small HTTPRoutes instead; not ideal, but not the end of the world either.
- [...] Gateway and HTTPRoute, the final Policy output will look like O/P = [Merge (left to right)] Gateway defaults + HTTPRoute defaults + HTTPRoute override + Gateway override. The final O/P might not be trivial for the HTTPRoute owner / application developer to compute to help them write intent, so implementors are left to surface such O/P in the Status field (thanks for outlining this here @sunjayBhatia). Again this should be fine, but standardizing this upstream would be great!
- [...] rule within HTTPRoute, not ideal, but not the end of the world either.
Based on this, I am revoting for option 3 (case by case basis). I think Policy should be considered as the default choice of extension, and a Filter should be considered if ALL of the below are true
- The feature is not privileged (anything relating to security should be considered privileged) and does not need to be applied by a privileged persona / platform admin on a privileged resource / Gateway.
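To make the left-to-right merge order mentioned above (Gateway defaults + HTTPRoute defaults + HTTPRoute override + Gateway override) a bit more concrete, here is a small Python sketch. It assumes policies are flat key/value settings and that "merge" means a later layer replaces a key set by an earlier one; the function name `merge_policies` and the setting names in the usage note are hypothetical and not taken from any Envoy Gateway API.

```python
from typing import Any, Dict, Optional

Settings = Dict[str, Any]


def merge_policies(gateway_defaults: Optional[Settings] = None,
                   route_defaults: Optional[Settings] = None,
                   route_overrides: Optional[Settings] = None,
                   gateway_overrides: Optional[Settings] = None) -> Settings:
    """Merge policy layers left to right; later layers win on conflicting keys."""
    merged: Settings = {}
    # Order matters: Gateway overrides are applied last, so they always win.
    for layer in (gateway_defaults, route_defaults, route_overrides, gateway_overrides):
        if layer:
            # Shallow merge (assumption): nested settings are replaced, not combined.
            merged.update(layer)
    return merged
```

For example, with Gateway defaults of `{"requests": 100, "unit": "Minute"}`, HTTPRoute defaults of `{"requests": 50}`, an HTTPRoute override of `{"unit": "Second"}` and a Gateway override of `{"requests": 10}`, the merged result is `{"requests": 10, "unit": "Second"}`. Even in this tiny case the outcome is not obvious from the HTTPRoute alone, which is why surfacing the computed result in Status would help.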
We have already heard feedback from @jcheld for the desire to allow App devs to configure auth for endpoints (HTTPRoute filter) but allow cluster admins to override (Policy Attachment). @jcheld please correct me if I misunderstood your input on this subject. Maybe we need to collect authentication requirements from potential Envoy Gateway users so we can design the API appropriately.
That use case is exactly the sort of thing I was talking about when I meant option 3 - I just have to finish the upstream work to change the wording of Policy Attachment - but I haven't got any objections to that part yet, so I'm reasonably confident it will go ahead.
@danehans, yes, there is typically a "standard" path for auth that is adhered to by the various API producers that I would want to enforce by default from the gateway admin perspective, though there does need to be an exception process by which other auth measures can be applied in cases where the auth being provided to the gateway cannot be adjusted.
As an example, if JWT is your standard auth then you would want to apply that across the various HTTPRoutes by default; however, on a specific HTTPRoute you may need to do an HMAC signature validation because it is a vendor webhook sending a request and they can only send an HMAC signature for validation. As a result, the ability to override a default auth behavior is key - though only as an exception, requiring a special privilege. Other types of overrides and/or other applied Policies/Filters may not need a special privilege to be applied. This may complicate user experience, but some flexibility in configuration is required I think.
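The "default auth with privileged exceptions" requirement above could be sketched roughly as follows. This is only an illustration in Python: `select_auth` and the `has_exception_privilege` flag are hypothetical, standing in for whatever RBAC or policy mechanism would actually gate the exception.

```python
from typing import Optional


def select_auth(default_auth: str,
                requested_auth: Optional[str],
                has_exception_privilege: bool) -> str:
    """Return the auth method for a route, allowing deviations only with privilege."""
    # No route-specific request, or a request for the standard method: use the default.
    if requested_auth is None or requested_auth == default_auth:
        return default_auth
    # Deviating from the default is an exception that needs a special privilege.
    if not has_exception_privilege:
        raise PermissionError(
            f"overriding default auth {default_auth!r} with {requested_auth!r} "
            "requires an exception privilege"
        )
    return requested_auth
```

In the webhook scenario, `select_auth("JWT", "HMAC", True)` yields `"HMAC"`, while the same request without the privilege is rejected and every other route keeps `"JWT"`.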
Overall, Policies do seem to be the correct way to go, with limited Filter use cases as mentioned by @arkodg, so I'd have to go with option 3 as well.
Overall I would prefer Policies over Filters from an end-user perspective if given the choice.
During today's community meeting, maintainers reached a consensus on this issue. The consensus is:
Unless any objections are received by @arkodg @skriss @youngnick @AliceProxy @LukeShu @Xunzhuo in 24 hours, I will close this issue.
updating this thread to mention that an ExtensionFilter was used for RateLimiting instead of the proposed PolicyAttachment decision because
Due to the above points, it made more sense to go ahead with a filter implementation that has a simpler user experience.
Description: Policy Attachment and HTTPRouteFilters represent two different extension points in Gateway API to implement features outside the core/extended APIs provided by the upstream project.
Envoy Gateway is moving forward with implementing more advanced features on top of the core Gateway API and part of this is developing guidelines for implementers/users on how the project will use Policies and Filters.
In this document I wrote up a design/discussion for Contour on differences between using Policies and Filters: https://github.com/projectcontour/contour/pull/4749 which should be relevant to this project as well. In particular the differences between implementing the case-study spikes for rate limiting as a Policy vs. Filter.
Some other key points to discuss that may or may not be included in the above document: