aws / aws-application-networking-k8s

A Kubernetes controller for Amazon VPC Lattice
https://www.gateway-api-controller.eks.aws.dev/
Apache License 2.0
157 stars 48 forks source link

Pod readiness gates are always injected #623

Closed xonstone closed 2 months ago

xonstone commented 2 months ago

Apparently this controller adds the pod readiness gates indifferent to wether or not a pod is actually targeted by a VPC Lattice TargetGroup. This is counter-intuitive and diverges from the way the AWS LoadBalancer Controller implements this and results in pods never being ready when they are not used in a service.

For the AWS LoadBalancer Controller to inject the gates a pod needs to fulfil the following requirements:

I would suggest to do something similar here.

zijun726911 commented 2 months ago

Hi xonstone, thanks for point out this issue. The current pod readiness gate injector webhook has default behaviour that inject readiness gates for all pods under the namespece with application-networking.k8s.aws/pod-readiness-gate-inject=enabled. However you are able to change this injection criteria by changing the mutatingwebhookconfigurations. You could check this doc for more detail: https://github.com/aws/aws-application-networking-k8s/blob/553422e886713c9ddf2aed29143db1dfe2e840e7/docs/guides/pod-readiness-gates.md?plain=1#L80-L103

Could this way solve your use case?

zijun726911 commented 2 months ago

Alternatively, you could disable the whole injector webhook and add readinessGates in you pod yaml definition by yourself, for example:

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate

...
kind: Pod
...
spec:
  readinessGates:
    - conditionType: "application-networking.k8s.aws/pod-readiness-gate"
...
xonstone commented 2 months ago

Hi

I’ve implemented the object selector but this places the burden on the developer to always provide the correct labels which is quite error-prone.

In my opinion the original problem is not solved this way and it should be solved more akin to the way the ALB controller works

On Thu, 25 Apr 2024 at 07:38, Zijun Wang @.***> wrote:

Alternatively, you could disable the whole webook and add readinessGates in you pod yaml definition by yourself, for example:

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate

... kind: Pod ... spec: readinessGates:

  • conditionType: "application-networking.k8s.aws/pod-readiness-gate" ...

— Reply to this email directly, view it on GitHub https://github.com/aws/aws-application-networking-k8s/issues/623#issuecomment-2076408006, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEGD2QZBPTENYARK6OWYOMDY7CJE5AVCNFSM6AAAAABGVKS7E6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANZWGQYDQMBQGY . You are receiving this because you authored the thread.Message ID: @.***>

erikfuller commented 2 months ago

To address this I think what makes sense is to use logic similar to what's in the event handler service.go. Then we can check the route to validate it's one the controller cares about like in route_controller.go.

This way we wouldn't have to make API calls to Lattice, just to local k8s API. Unit tests to check this should be pretty straightforward. With this change we'll need to update the webhook integration test to include a bit more infrastructure than just a service/pod (route, etc), and we can also validate the pod transitions to ready as expected.

erikfuller commented 2 months ago

Just a quick update here. I'm nearing a solution here but am hitting some issues with the current approach to version handling for the gateway APIs. Hoping to have a PR ready by EOW, but may be next week.