aws / aws-application-networking-k8s

A Kubernetes controller for Amazon VPC Lattice
https://www.gateway-api-controller.eks.aws.dev/
Apache License 2.0
157 stars 48 forks source link

Conditionally inject pod readiness gates #629

Closed erikfuller closed 2 months ago

erikfuller commented 2 months ago

What type of PR is this? bug fix

Which issue does this PR fix:

623

What does this PR do / Why do we need it: Validates the pod requires the readiness gate before adding it. Readiness gates are added to pods where

  1. The pod is referenced by a service's selector
  2. That service is referenced by a HTTPRoute, GRPCRoute, or ServiceExport
  3. The route has a parentRef to a VPC Lattice gateway

Under these conditions, it is assumed the pod will be a member of a Lattice target group and the readiness gate is added. All other pods will not receive the readiness gate.

Other minor issues addressed:

Testing done on this change: Added extensive unit tests on the logic.

Automation added to e2e: Updated the webhook e2e tests to include a full test where we check the readiness gate is added and transitions to "ready". Note this test takes a little under 2m to run.

Also Ran the regular e2e-tests just to make sure my small refactoring didn't break anything.

=== RUN   TestIntegration
Running Suite: WebhookIntegration - /.../aws-application-networking-k8s/test/suites/webhook
===============================================================================================================
Random Seed: 1714689615

Will run 3 of 3 specs
------------------------------
[SynchronizedBeforeSuite] 
/.../aws-application-networking-k8s/test/suites/webhook/suite_test.go:25
{"level":"info","ts":"2024-05-02T15:40:15.310-0700","caller":"test/framework.go:186","msg":"Creating objects: *v1.Gateway/test-gateway"}
{"level":"info","ts":"2024-05-02T15:40:16.598-0700","caller":"webhook/suite_test.go:37","msg":"Expecting VPC vpc-037b0d82a657521a0 and service network sn-009776704d1ed9cdb association"}
[SynchronizedBeforeSuite] PASSED [1.441 seconds]
------------------------------
Readiness Gate Inject create deployment in untagged namespace, no readiness gate
/.../aws-application-networking-k8s/test/suites/webhook/readiness_gate_inject_test.go:43
{"level":"info","ts":"2024-05-02T15:40:16.794-0700","caller":"test/framework.go:286","msg":"Waiting for NotFound, objects: *v1.Namespace/webhook-e2e-test-no-tag, *v1.Namespace/webhook-e2e-test-tagged"}
{"level":"info","ts":"2024-05-02T15:40:16.808-0700","caller":"test/framework.go:186","msg":"Creating objects: *v1.Namespace/webhook-e2e-test-no-tag, *v1.Namespace/webhook-e2e-test-tagged"}
{"level":"info","ts":"2024-05-02T15:40:16.926-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:untagged-test-pod]"}
{"level":"info","ts":"2024-05-02T15:40:27.007-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:untagged-test-pod]"}
• [10.272 seconds]
------------------------------
Readiness Gate Inject create deployment in tagged namespace, but no gateway/route reference, no readiness gate
/.../aws-application-networking-k8s/test/suites/webhook/readiness_gate_inject_test.go:64
{"level":"info","ts":"2024-05-02T15:40:27.092-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-nope-pod]"}
• [0.089 seconds]
------------------------------
Readiness Gate Inject create deployment in tagged namespace, gate injected and transitions to healthy
/.../aws-application-networking-k8s/test/suites/webhook/readiness_gate_inject_test.go:85
{"level":"info","ts":"2024-05-02T15:40:27.225-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:40:37.274-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:40:47.318-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:40:57.362-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:41:07.408-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:41:17.452-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:41:27.496-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:41:37.558-0700","caller":"test/pod_manager.go:68","msg":"deployment.Spec.Selector.MatchLabels: map[app:tagged-yes-pod]"}
{"level":"info","ts":"2024-05-02T15:41:37.575-0700","caller":"test/framework.go:268","msg":"Deleting objects: *v1.Namespace/webhook-e2e-test-no-tag, *v1.Namespace/webhook-e2e-test-tagged"}
{"level":"info","ts":"2024-05-02T15:41:37.594-0700","caller":"test/framework.go:286","msg":"Waiting for NotFound, objects: *v1.Namespace/webhook-e2e-test-no-tag, *v1.Namespace/webhook-e2e-test-tagged"}
• [100.222 seconds]
------------------------------
[SynchronizedAfterSuite] 
/.../aws-application-networking-k8s/test/suites/webhook/suite_test.go:55
{"level":"info","ts":"2024-05-02T15:42:07.333-0700","caller":"test/framework.go:268","msg":"Deleting objects: *v1.Gateway/test-gateway"}
{"level":"info","ts":"2024-05-02T15:42:07.355-0700","caller":"test/framework.go:286","msg":"Waiting for NotFound, objects: *v1.Gateway/test-gateway"}
[SynchronizedAfterSuite] PASSED [1.055 seconds]
------------------------------

Ran 3 of 3 Specs in 113.080 seconds
SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestIntegration (113.08s)
PASS
ok      github.com/aws/aws-application-networking-k8s/test/suites/webhook   114.240s

Will this PR introduce any new dependencies?: No

Will this break upgrades or downgrades. Has updating a running cluster been tested?: No

Does this PR introduce any user-facing change?: No, bug fix.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

erikfuller commented 2 months ago

For large cluster, to improve the performance and we can annotate or add finalizer to the service used by Lattice HTTPRoute/GRPCRoute/ServiceExport. With annotation or finalizer, the controller do not need loop through them during creation of a Pod

Thanks for bringing this up. I've opened https://github.com/aws/aws-application-networking-k8s/issues/631 to track these improvements.