fluxninja / aperture

Rate limiting, caching, and request prioritization for modern workloads
https://docs.fluxninja.com
Apache License 2.0
643 stars 25 forks source link

Enable ext_authz body buffering without breaking client-streaming APIs #479

Open krdln opened 2 years ago

krdln commented 2 years ago

Ext_authz body buffering breaks client-streaming APIs, such as eg. GRPC client-streaming method or bidirectional-streaming methods.

Figure out how to enable body buffering without breaking such client-streaming APIs.

This is the root cause of https://github.com/fluxninja/cloud/issues/5862

Desired Outcomes

  1. Find out if it's possible to ignore streaming connections using match rules
  2. If not, document how we can exclude such traffic (based on destination IP ranges and ports) using Istio's exclude rules.
krdln commented 2 years ago

Pausing work on this for now.

Status: Spent a couple days on researching following questions:

  1. What are options in istio/envoy on how to configure filters conditionally,
  2. What option do we recommend in our docs,
  3. Is there a "sane default" configuration which we could recommend, that would satisfy both of following:
    • classification works for "obviously simple cases", like pulling some fields from json request;
    • body buffering is disabled for potentially streaming apis (like with application/grpc).

More details:

3. Sane default

It's possible to configure ext authz filter based on headers (using header to metadata filter and filter enabled metadata field). I explored two possibilities:

  1. content-length presence: Having content-length set is a good heuristic for being "body buffering eligible" – streaming APIs won't set this field. This is promising, but turns out that this field is frequently omitted in HTTP/2, so abandonded this idea.
  2. content-type whitelisting/blacklisting: There are some content types that are used for streaming APIs – (application/grpc (might be streaming or not), application/stream+json, application/json-seq, application/x-ndjson, ...), so such kind of blacklist initially sounded promising. Unfortunately, application/json is also used for json-streaming in the wild (despite RFC defining it's as single object/array). An example of application/json mimetype being used for streaming is in gRPC-Gateway

1. Configuring filters conditionally in istio

Btw, we're using envoy.filters.http.ext_authz as the name for our filter. Looks like the limitation of filter name matching extension's name was lifted some time ago (along with introduction of typed_config). We should use some more unique name here, like aperture.ext_authz – this is a prerequisite for all the filter overrides and route-specific config described below.

AuthorizationPolicy

Learned that istio provides a specific CRD for external authorization – it's a simpler way to define ext authz, compared to EnvoyFilter. After some reading though, this doesn't seem to fit our usecase – it's supposed to define "the external authorization", but what we want is to "just reuse ext_authz plugin for our purposes", without affecting any existing authz – and istio allows only a single ext authz provider.

Per-namespace/per-workload configuration

This is the simplest and least error-prone way of conditionally reconfiguring envoy-filter, applies on a pod-level.

Per-listener/per-route configuration

Listeners and routes in envoy are managed by istio, and while it's possible to apply a filter only for a specific listener (or provide route-specific config for ext authz), listener and route management in istio is a complex topic. Needs a bit more research though.

header-to-metadata + filter_enabled_metadata

Allows for an ad-hoc header-based rules. It's more of a hack than a proper solution, and allows only very simple rules based on a single-header.

ExtensionWithMatcher

Clean solution (if I understood it correctly), alpha though.

Custom metadata-setting filter

This solution seems a bit crazy, especially since we've abandoned the idea of WASM plugin, but perhaps it's nice solution? The plugin would be responsible for determining which configuration authz should use, and all the actual logic would be done by authz.

2. What do we recommend

Not sure. Per-namespace/per-workload configuration is simplest, although very coarse-grained – cannot achieve per-api whitelisting.

Per-route configuration is the most elegant from envoy POV, although how istio configures routes for envoy is kind of istio's internal detail, thus this solution is a bit brittle. Probably, user should need to apply a VirtualService.

tanveergill commented 2 years ago

@krdln, Can we make it off by default and the user can selectively turn on per route?

The Istio Helm chart for Aperture integration can take a parameter to turn on body buffering. Also, need configurability of routes. cc: @kklimonda-fn, @hdkshingala

krdln commented 2 years ago

@tanveergill The per-route story is a bit complicated, as route and listeners mgmt in istio is not trivial, it would require adding routes via a separate crd, ensuring we don't break any pre-existing route setup, etc.

What we can do as initial step would be to have per-workload config. So eg. in our config we could have sth like:

defaultBodyBuffering: false
enableBodyBufferingOnWorkloads:
- bookinfo/frontend
- default/xyz

if you think this is worthwhile, I can guide @hdkshingala on how to do it, as I was researching it.