kumahq / kuma

🐻 The multi-zone service mesh for containers, Kubernetes and VMs. Built with Envoy. CNCF Sandbox Project.
https://kuma.io/install
Apache License 2.0
3.54k stars 327 forks source link

Rework Mesh*Route matching #10247

Open lahabana opened 2 weeks ago

lahabana commented 2 weeks ago

Description

With MeshService policy matching (https://github.com/kumahq/kuma/pull/10152) and policy on namespace (https://github.com/kumahq/kuma/pull/10148) we're getting a good story for producer/consumer policies and how to "attach a policy to a service"

The choice was made for spec.to[].targetRef.kind: MeshService this goes a little against how MeshHTTPRoute matches which was decided in MADR-028. The strong argument for going with top level targetRef in MADR-28 was matching dataplanes that don't have the route. In this case it's simple enough we just ignore it.

This current implementation has made things a little tricky because policy matching is convoluted where the top level MeshHTTPRoute gets converted to a "fake to":

https://github.com/kumahq/kuma/blob/55fa983fc12bdd1a5ea7c1cfe75901320870372b/pkg/plugins/policies/core/rules/rules.go#L346-L401

suggested changes

  1. Allow spec.to[].targetRef.kind: Mesh*Route working like MeshService (If the service has this route then we apply the matcher on it). It's also great because we can add validation like (disallow TCP parameters on MeshHTTPRoute matches).
  2. Deprecate MeshService and MeshHTTPRoute as a toplevel targetRef (top level targetRef should identify entire proxies) [^1]
  3. Make topLevel targetRef optional (in practice people will most often use targetRef: kind: Mesh). Having it optional creates less confusion to users about "where they should match stuff".

benefits

open questions

xref

examples

A producer attached policy (all consumers everywhere use this):

kind: MeshTimeout
metadata:
  name: timeout-for-service
spec:
  to:
    - targetRef:
         kind: MeshHTTPRoute
         name: my-cool-route-in-the-same-namespace
       default:
         http:
            requestTimeout: 5s

A consumer policy:

kind: MeshTimeout
metadata:
  name: timeout-for-service
spec:
  to:
    - targetRef:
         kind: MeshHTTPRoute
         labels: 
           kuma.io/display-name: my-cool-route-in-the-same-namespace
           k8s.kuma.io/namespace: some-ns
           kuma.io/zone: zone-a
       default:
         http:
            requestTimeout: 5s

A producer policy applying to just some proxies:

kind: MeshTimeout
metadata:
  name: timeout-for-service
spec:
  targetRef:
    kind: MeshSubset
    tags:
       type: "canaries"
  to:
    - targetRef:
         kind: MeshHTTPRoute
         name: my-cool-route-in-the-same-namespace
       default:
         http:
            requestTimeout: 5s

(the consumer ones work the same).

[^1]: Mesh, MeshSubset and MeshGateway would be the only topLevel targetRef allowed. We talked about remove MeshSubset but that's a parallel discussion. MeshGateway can identify a specific listener but in practice Each Gateway listener is very similar to being its own proxy so I'm not concerned here.

jakubdyszkiewicz commented 2 weeks ago

I like this change. I still hold my opinion to treat MeshHTTPRoute just like MeshService, so to forbid it in top-level target ref.

lahabana commented 2 weeks ago

I like this change. I still hold my opinion to treat MeshHTTPRoute just like MeshService, so to forbid it in top-level target ref.

Agreed which is why I suggest starting by deprecating it.