envoyproxy / gateway

Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
https://gateway.envoyproxy.io
Apache License 2.0

Gateway API extensions: Usage of Policy Attachment vs. Filters #675

Closed sunjayBhatia closed 1 year ago

sunjayBhatia commented 1 year ago

Description: Policy Attachment and HTTPRouteFilters represent two different extension points in Gateway API to implement features outside the core/extended APIs provided by the upstream project.

Envoy Gateway is moving forward with implementing more advanced features on top of the core Gateway API and part of this is developing guidelines for implementers/users on how the project will use Policies and Filters.

I wrote up a design/discussion document for Contour on the differences between using Policies and Filters: https://github.com/projectcontour/contour/pull/4749. It should be relevant to this project as well, in particular the differences between implementing the case-study spikes for rate limiting as a Policy vs. a Filter.

Some other key points to discuss that may or may not be included in the above document:

arkodg commented 1 year ago

@sunjayBhatia thanks for sharing all the research on this topic! The EG community also needs to decide which extension point to use for supporting advanced API Gateway features. I'm sharing some options here, so the community can voice their preference.

  1. Use HTTPRoute Filters for all API Gateway features

    kind: RateLimit
    apiVersion: gateway.envoyproxy.io/v1alpha1
    metadata:
      name: rl-example
    spec:
    ---
    kind: HTTPRoute
    apiVersion: gateway.networking.k8s.io/v1beta1
    spec:
      parentRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
      .....
      hostnames:
      - "example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /
        backendRefs:
        - kind: Service
          name: example
          port: 8080
        filters:
        - type: ExtensionRef
          extensionRef:
            group: gateway.envoyproxy.io
            kind: RateLimit
            name: rl-example
  2. Use PolicyAttachment for all API Gateway features

    kind: RateLimit
    apiVersion: gateway.envoyproxy.io/v1alpha1
    metadata:
      name: rl-policy
      namespace: demo
    spec:
      targetRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: example
      override:
        ....
      default:
        ...
  3. On a case-by-case basis, decide which extension to choose for a specific API Gateway feature

arkodg commented 1 year ago

@skriss @danehans @youngnick @AliceProxy @LukeShu @Xunzhuo can you please share your thoughts and preferences on this issue? Please also consider sharing alternative options.

sunjayBhatia commented 1 year ago

Specifically targeting option 1, this comment is relevant: https://github.com/envoyproxy/gateway/pull/529#discussion_r1012164693

Not all configuration can be translated to Envoy config from per-Route configuration, due to the nature of particular Envoy filters and how Envoy Gateway's xDS translation logic may set up Listeners/Virtualhosts/Routes

sunjayBhatia commented 1 year ago

In addition to thinking about how things will be implemented, this decision should account for the UX we intend to offer, which involves what resources/personas/scopes we want to target:

arkodg commented 1 year ago

Prefer 2 because

IMHO, option 3 adds cognitive load for the user, and option 1 cannot be applied to a Gateway, as shared above.

lizan commented 1 year ago

This has been discussed in the authn policy document as well in this comment. As a result of a brief community call we agreed on something like 2. We might want to add some sort of support with PolicyAttachment to select routes in an HTTPRoute though.

Also note that Filters in an HTTPRoute do not translate directly to Envoy HTTP filters, because the two are configured at different scopes: Envoy HTTP filters are created per HCM filter chain, not every Envoy filter supports per-route configuration, and the level of support varies.
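To make the scope difference concrete, here is a rough sketch of a fragment of an Envoy `HttpConnectionManager` configuration using the JWT authn filter. It is hand-written for illustration and is not what Envoy Gateway's xDS translation emits; the provider, requirement, and cluster names (`svc1_provider`, `svc1_jwt`, `jwks_cluster`, `svc1`) and the issuer URL are placeholders.

```yaml
# Illustrative HttpConnectionManager fragment; all names and URLs are placeholders.
stat_prefix: ingress_http
http_filters:
# The filter is configured once for the whole HCM filter chain.
- name: envoy.filters.http.jwt_authn
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
    providers:
      svc1_provider:
        issuer: https://issuer.example.com
        remote_jwks:
          http_uri:
            uri: https://issuer.example.com/.well-known/jwks.json
            cluster: jwks_cluster
            timeout: 5s
    requirement_map:
      svc1_jwt:
        provider_name: svc1_provider
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
route_config:
  virtual_hosts:
  - name: example
    domains: ["www.example.com"]
    routes:
    - match:
        prefix: /svc1
      route:
        cluster: svc1
      # A route can only select from what the chain-level filter exposes per route.
      typed_per_filter_config:
        envoy.filters.http.jwt_authn:
          "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.PerRouteConfig
          requirement_name: svc1_jwt
```

The per-route block only references a requirement defined at the filter-chain level, which is why an HTTPRoute filter cannot always be mapped one-to-one onto Envoy configuration.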

danehans commented 1 year ago

@arkodg I feel like https://github.com/envoyproxy/gateway/issues/675#issuecomment-1301376173 is a repeat of https://github.com/envoyproxy/gateway/issues/677. Can you please clarify?

> the notion of default and override is powerful, addresses persona hierarchy

As stated upstream, this can become very complicated which leads to poor UX.

danehans commented 1 year ago

From the draft Gateway API patterns: Policy Attachment and Metaresources doc authored by @youngnick :

Policy Attachment: It’s all about the defaults and overrides

The current Policy Attachment docs list three primary components of the Policy Attachment pattern:

  • A standardized means of attaching policy to resources.
  • Support for configuring both default and override values within policy resources.
  • A hierarchy to illustrate how default and override values should interact.

I think that we should focus “Policy Attachment” around the latter two bullet points

If Policy Attachment should not be used as a standardized means of attaching a policy to resources, then a different mechanism should be used. Let's refer to #529 as a concrete example. A dev creates an HTTPRoute to route traffic to two different backend services. The dev now wishes to attach different JWT AuthN policies to the backend services. A resource should be used to attach these policies. First, the dev creates the AuthN resource, for example:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: AuthN
metadata:
  name: svc1-jwt-auth
spec:
  authType: JWT
  jwtConfig:
    <JWT config for service 1>

The dev repeats the above for service 2 and then attaches the AuthN resource to the HTTPRoute as an implementation-specific filter:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: pol-attach-example
spec:
  parentRefs:
    - name: eg
  hostnames:
    - www.example.com
  rules:
    - matches:
      - path:
          type: PathPrefix
          value: /svc1
      filters:
      - type: ExtensionRef
        extensionRef:
          group: gateway.envoyproxy.io
          kind: AuthN
          name: svc1-jwt-auth
      backendRefs:
      - name: svc1
        port: 80
    - matches:
      - path:
          type: PathPrefix
          value: /svc2
      filters:
      - type: ExtensionRef
        extensionRef:
          group: gateway.envoyproxy.io
          kind: AuthN
          name: svc2-jwt-auth
      backendRefs:
      - name: svc2
        port: 80

Traffic is now authN'd to the two backend services. However, the SecOps team from example corp has more stringent JWT authN requirements. The cluster admin is now required to enforce minimum JWT authN settings and creates an AuthnPolicy that overrides the settings for all AuthN kinds. For example:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: AuthnPolicy
metadata:
  name: svc1-jwt-auth
spec:
  targetRef:
    # AuthN is defined in the gateway.envoyproxy.io group (see the AuthN resource above)
    group: gateway.envoyproxy.io
    kind: AuthN
  override:
    jwtConfig:
      <INSERT_OVERRIDES>

The JWT authN config for both backend services is now overridden.

Although this approach requires two kinds, AuthN and AuthnPolicy, it does provide support for multiple personas inherent in Gateway API. RBAC is properly configured so that Devs are free to configure AuthN kinds for traffic routing and infra admins can configure AuthnPolicy to override traffic routing.

arkodg commented 1 year ago

@danehans thanks for raising this case about using two separate CRDs here, AuthN and AuthnPolicy. Here are some follow-up questions

sunjayBhatia commented 1 year ago

relevant discussion topic for this as well, given some of the documentation in the Policy Attachment reference: https://github.com/kubernetes-sigs/gateway-api/discussions/1503

currently Filters and Policies aren't "supposed" to overlap, but as you point out and I think others have realized as well, this pattern seems useful

youngnick commented 1 year ago

The poor UX issue will be addressed upstream. I think that option 2 sounds about right here.

danehans commented 1 year ago

> If the App Dev had the RBAC to apply AuthnPolicy to the HTTPRoute, shouldn't we be able to achieve the same outcome with 1 CRD type AuthnPolicy

Providing App Devs the ability to CRUD AuthnPolicy would defeat the purpose of separating concerns. App Devs should not be allowed to set or override default policies, e.g. AuthnPolicy. If we need the ability for App Devs to define authN for API endpoints and cluster admins to override/default authN, then separate resources are required.
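As a rough illustration of that separation, standard Kubernetes RBAC could grant App Devs write access to AuthN while keeping AuthnPolicy writes with infra admins. This is only a sketch: the resource plurals `authns` and `authnpolicies`, the role names, and the `demo` namespace are assumptions, since the CRDs discussed here are still proposals.

```yaml
# Sketch only: resource plurals, role names, and namespace are assumed.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-dev-authn
  namespace: demo
rules:
# App Devs can manage their own AuthN resources...
- apiGroups: ["gateway.envoyproxy.io"]
  resources: ["authns"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# ...but can only read the policies that default or override them.
- apiGroups: ["gateway.envoyproxy.io"]
  resources: ["authnpolicies"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: infra-admin-authnpolicy
rules:
# Infra admins own the AuthnPolicy resources.
- apiGroups: ["gateway.envoyproxy.io"]
  resources: ["authnpolicies"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

Each role would still need a RoleBinding or ClusterRoleBinding to the relevant users or groups.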

arkodg commented 1 year ago

assigning myself this issue, so that I can help moderate this discussion to a decision.

Raising some further questions around users and features to help drive the debate

Assumptions

We are deciding extensions for these features

Questions around Capability

youngnick commented 1 year ago

Okay, I think I wasn't clear enough in my answer before, so let me clarify.

When talking about these features, we have a few extension points that can be used. For Auth and Ratelimiting, it conceptually makes sense that those be a filter that modifies an HTTPRoute match before it's sent on to the backends. That allows for very specific targeting of routes, headers, methods, etc. for these filters.

But, the downside of using a filter in that way is that then, the filter currently has to be specified for every place where it needs to be used. So if the cluster admin/Envoy Gateway owner wants to specify that all routes must have a global ratelimit, or a certain type of auth, then not only can they not do that, it's on the owner of each HTTPRoute to actually remember to.

So, what I'm going to be proposing in upstream is that there should be a class of Policy Attachment objects that modify the way these custom filters work, either defaulting or overriding the settings.

So, in this way, as a Gateway owner, I could attach a RatelimitingPolicy object to the Gateway, and set the default to be to use a specific ratelimit policy extensionRef CRD, just as if every HTTPRoute rule attached to the Gateway specified the policy. This would mean that HTTPRoute owners would be able to override the default by specifying their own extensionRef CRD.

Or, as a Gateway owner, I could attach a JWTAuthPolicy to the Gateway, that sets an override to use a specific JWTAuth CRD extensionRef, which an HTTPRoute owner cannot override by setting their own auth settings. (Would you actually want to do this for a large Gateway? Probably not, but that's what it's for).

So, that's how we could combine a custom filter and Policy objects to give both flexibility and control to users. Note that both of these examples require us to build a custom Filter first, then the Policy object to allow it to be overridden.
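A sketch of the two Gateway-owner scenarios above might look like the following. Everything here is hypothetical: `RatelimitingPolicy` and `JWTAuthPolicy` are the names used in this comment, not existing kinds, and the `extensionRef` field under `default`/`override`, the referenced `RateLimit`/`JWTAuth` kinds, and all object names are assumptions made for illustration.

```yaml
# Hypothetical kinds and fields; nothing here is a released API.
# Default: applies unless an HTTPRoute rule names its own extensionRef filter.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: RatelimitingPolicy
metadata:
  name: gateway-default-ratelimit
  namespace: demo
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: example
  default:
    extensionRef:
      group: gateway.envoyproxy.io
      kind: RateLimit
      name: rl-example
---
# Override: wins even if an HTTPRoute rule sets its own auth filter.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: JWTAuthPolicy
metadata:
  name: gateway-required-jwt
  namespace: demo
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: example
  override:
    extensionRef:
      group: gateway.envoyproxy.io
      kind: JWTAuth
      name: corp-jwt
```

In both cases the Policy only points at a filter CRD, which is why the custom Filter has to exist before the Policy that defaults or overrides it.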

Alternatively, we could not use the extensionRef filters, and only define these functions in terms of Policy objects, and instead of having the ability to define a filter inline in the HTTPRoute YAML, just have these settings attached via a Policy.

Both of the Policy approaches do have the UX limitation that it's very difficult to know if a Policy is augmenting the settings at all - part of the Policy rewrite I'm working on upstream is about trying to design a status section that should make this more clear. (It's a bit tricky though).

So, overall, I'm for Option 3 - which is "use both where necessary", but that's not very helpful right now.

I think that in terms of experimenting and trying things out, custom filters for Ratelimiting and Auth are both less design work and less conceptual overhead for users to understand, at the cost of increased verbosity. I'd recommend using those for now, which puts me partially in the Option 1 camp for the time being. I think that as I get these changes designed and pushed into upstream, this whole situation should hopefully become clearer.

arkodg commented 1 year ago

@youngnick are there any plans to also support extensionRef in the Gateway API? Otherwise, from a capability perspective, EG will be unable to provide support for extended features at a connection / listener level with just extensionRef filters at the HTTPRoute level.

danehans commented 1 year ago

> But, the downside of using a filter in that way is that then, the filter currently has to be specified for every place where it needs to be used. So if the cluster admin/Envoy Gateway owner wants to specify that all routes must have a global ratelimit, or a certain type of auth, then not only can they not do that, it's on the owner of each HTTPRoute to actually remember to.

Thanks for clarifying. From reviewing your Policy Attachment doc, this was my understanding and why #529 is designed as an implementation-specific filter. I anticipate EG supporting an AuthenticationPolicy resource in the future that would allow admins to override and/or set default settings defined by an Authentication resource.

> Both of the Policy approaches do have the UX limitation that it's very difficult to know if a Policy is augmenting the settings at all - part of the Policy rewrite I'm working on upstream is about trying to design a status section that should make this more clear. (It's a bit tricky though).

This is one of my concerns with Policy Attachment, and it was highlighted by @sunjayBhatia in https://github.com/kubernetes-sigs/gateway-api/discussions/1503.

arkodg commented 1 year ago

here's my current take based on three pillars

Based on this, I am re-voting for option 3 (case-by-case basis). I think Policy should be considered the default choice of extension, and a Filter should be considered if ALL of the below are true

danehans commented 1 year ago

> The feature is not privileged (anything relating to security should be considered privileged) that does not need to be applied by a privileged persona / platform admin on a privileged resource / Gateway.

We have already heard feedback from @jcheld about the desire to allow App Devs to configure auth for endpoints (HTTPRoute filter) while allowing cluster admins to override it (Policy Attachment). @jcheld please correct me if I misunderstood your input on this subject. Maybe we need to collect authentication requirements from potential Envoy Gateway users so we can design the API appropriately.

youngnick commented 1 year ago

That use case is exactly the sort of thing I was talking about when I meant option 3 - I just have to finish the upstream work to change the wording of Policy Attachment - but I haven't got any objections to that part yet, so I'm reasonably confident it will go ahead.

jcheld commented 1 year ago

@danehans, yes, there is typically a "standard" path for auth that is adhered to by the various API producers that I would want to enforce by default from the gateway admin perspective, though there does need to be an exception process by which other auth measures can be applied in cases where the auth being provided to the gateway cannot be adjusted.

As an example, if JWT is your standard auth then you would want to apply that across the various HTTPRoutes by default. However, on a specific HTTPRoute you may need to do an HMAC signature validation because it is a vendor webhook sending a request and they can only send an HMAC signature for validation. As a result, the ability to override a default auth behavior is key, though only as an exception, requiring a special privilege. Other types of overrides and/or other applied Policies/Filters may not need a special privilege to be applied. This may complicate user experience, but some flexibility in configuration is required, I think.
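To picture the exception case, the vendor webhook route might carry its own auth filter while every other route inherits the Gateway-wide JWT default. This is a sketch under assumptions: `HMACAuth` is a hypothetical kind that does not exist in Envoy Gateway, and the route, path, and backend names are made up for illustration.

```yaml
# Hypothetical: HMACAuth and all object names are illustrative only.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: vendor-webhook
spec:
  parentRefs:
  - name: eg
  hostnames:
  - www.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /webhooks/vendor
    # Route-level exception to the default JWT auth set by the gateway admin.
    filters:
    - type: ExtensionRef
      extensionRef:
        group: gateway.envoyproxy.io
        kind: HMACAuth
        name: vendor-hmac
    backendRefs:
    - name: webhook-svc
      port: 80
```

Whether this route-level filter is honored would depend on the gateway admin having set JWT as a default rather than an override, and on who holds the privilege to create the exception.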

Overall, Policies do seem to be the correct way to go, with limited Filter use cases as mentioned by @arkodg, so I'd have to go with option 3 as well.

AliceProxy commented 1 year ago

Overall I would prefer Policies over Filters from an end-user perspective if given the choice.

danehans commented 1 year ago

During today's community meeting, maintainers reached a consensus on this issue. The consensus is:

Unless any objections are received by @arkodg @skriss @youngnick @AliceProxy @LukeShu @Xunzhuo in 24 hours, I will close this issue.

arkodg commented 1 year ago

updating this thread to mention that an ExtensionFilter was used for RateLimiting instead of the proposed PolicyAttachment decision because

Due to the above points, it made more sense to go ahead with a filter implementation that has a simpler user experience.