aws / aws-application-networking-k8s

A Kubernetes controller for Amazon VPC Lattice
https://www.gateway-api-controller.eks.aws.dev/
Apache License 2.0
175 stars 50 forks source link

Tls Passthrough support #643

Closed zijun726911 closed 5 months ago

zijun726911 commented 6 months ago

Initial version cherry picked from: https://github.com/liwenwu-amazon/aws-application-networking-k8-publics/tree/tls-route-support-mar20 User experiences about TLSRoute, TargetGroupPolicy of this PR are same with the liwenwu-amazon:tls-route-support-mar20 branch.

main branch to liwenwu-amazon:tls-route-support-mar20 branch diff: https://github.com/zijun726911/aws-application-networking-k8s/pull/3

TLS Passthrough Behaviors Summary

Changes

The controller logic changes:

e2e test changes

CRDs && Image build && helm chart related changes

Testing

[SynchronizedAfterSuite]
/Volumes/workspace/aws-application-networking-k8s/test/suites/integration/suite_test.go:72
{“level”:“info”,“ts”:“2024-06-04T01:32:32.262-0700",“caller”:“test/framework.go:264",“msg”:“Deleting objects: *v1.Gateway/test-gateway, *v1.Pod/grpc-runner”}
{“level”:“info”,“ts”:“2024-06-04T01:32:32.312-0700",“caller”:“test/framework.go:282",“msg”:“Waiting for NotFound, objects: *v1.Gateway/test-gateway, *v1.Pod/grpc-runner”}
[SynchronizedAfterSuite] PASSED [31.091 seconds]
------------------------------
Ran 66 of 69 Specs in 2537.133 seconds
SUCCESS! -- 66 Passed | 0 Failed | 0 Pending | 3 Skipped

Changes for the new revision: https://github.com/aws/aws-application-networking-k8s/pull/643#discussion_r1629937229

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

erikfuller commented 5 months ago

One other question around TLSRouteRule/BackendRef from the spec: https://github.com/kubernetes-sigs/gateway-api/blob/main/apis/v1alpha2/tlsroute_types.go#L103

    // BackendRefs defines the backend(s) where matching requests should be
    // sent. If unspecified or invalid (refers to a non-existent resource or
    // a Service with no endpoints), the rule performs no forwarding; if no
    // filters are specified that would result in a response being sent, the
    // underlying implementation must actively reject request attempts to this
    // backend, by rejecting the connection or returning a 500 status code.
    // Request rejections must respect weight; if an invalid backend is
    // requested to have 80% of requests, then 80% of requests must be rejected
    // instead.

Just curious what our current approach does with invalid BackendRefs. I think this is probably a less vital part of the spec to honour, especially given it's still experiemental, but would be good to describe the expected behaviour here.

zijun726911 commented 5 months ago

One other question around TLSRouteRule/BackendRef from the spec: https://github.com/kubernetes-sigs/gateway-api/blob/main/apis/v1alpha2/tlsroute_types.go#L103

  // BackendRefs defines the backend(s) where matching requests should be
  // sent. If unspecified or invalid (refers to a non-existent resource or
  // a Service with no endpoints), the rule performs no forwarding; if no
  // filters are specified that would result in a response being sent, the
  // underlying implementation must actively reject request attempts to this
  // backend, by rejecting the connection or returning a 500 status code.
  // Request rejections must respect weight; if an invalid backend is
  // requested to have 80% of requests, then 80% of requests must be rejected
  // instead.

Just curious what our current approach does with invalid BackendRefs. I think this is probably a less vital part of the spec to honour, especially given it's still experiemental, but would be good to describe the expected behaviour here.

I will document this part. actually the current behavior is, for example for this TLSRoute:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: rate-tls-passthrough
spec:
  hostnames:
    - tls-rate.my-test.com
  parentRefs:
    - name: my-hotel-tls-passthrough
      sectionName: tls
  rules:
    - backendRefs:
        - name: tls-rate1
          kind: Service
          port: 443
          weight: 10
        - name: tls-rate2
          kind: ServiceImport
          port: 443
          weight: 90

If the tls-rate2 targetGroup is not ready, it will block the traffic for both tls-rate1 and tls-rate2

( base on the ResolveRuleTgIds() logic https://github.com/aws/aws-application-networking-k8s/blob/595fff8ab63ef901b81b7ce0d63a2461be4d1865/pkg/deploy/lattice/target_group_manager.go#L459 )

zijun726911 commented 5 months ago

The reason I'd like to change is that right now we have a few comment lines explaining why the code should drop through to the else for TLS passthrough. A multi-line explanation in comments can be a warning sign we don't have the code structured correctly. I'd like us to rather take an approach that is explicit, intuitive to follow, and needs no justifying or explanatory comments. I think what I proposed (or something like it) would do the trick.

Ideally we would redefine the interfaces so there were separate L4 and L7 interfaces, and there would be no Matches() at all on TCPRoute. This would be a larger change and is probably OK to skip for now, but we should not be relying on the empty list behavior here.

Changed in the new revision. please check: https://github.com/aws/aws-application-networking-k8s/pull/643#discussion_r1629937229

Should we just drop this Expect() then since we don't really care?

Removed assertion for the tg port in the new revision.