gardener / gardener-extension-provider-aws

Gardener extension controller for the AWS cloud provider (https://aws.amazon.com).
https://gardener.cloud
Apache License 2.0

aws-custom-route-controller events are wrongly categorized as `ERR_RETRYABLE_INFRA_DEPENDENCIES` instead of `ERR_INFRA_UNAUTHENTICATED` #1137

Open ialidzhikov opened 1 week ago

ialidzhikov commented 1 week ago

How to categorize this issue?

/area ops-productivity /kind bug /platform aws

What happened: The Shoot credentials became invalid. The ControlPlane then became unhealthy with:

status:
  conditions:
  - codes:
    - ERR_RETRYABLE_INFRA_DEPENDENCIES
    lastTransitionTime: "2024-11-20T10:37:28Z"
    lastUpdateTime: "2024-11-20T10:49:01Z"
    message: "[aws-custom-route-controller] RoutesUpdateFailed: (combined from similar
      events): AuthFailure: AWS was not able to validate the provided access credentials\n\tstatus
      code: 401, request id: <id>."
    reason: HealthCheckUnsuccessful
    status: "False"
    type: ControlPlaneHealthy

IMO, the ERR_RETRYABLE_INFRA_DEPENDENCIES code is wrong here. It should be ERR_INFRA_UNAUTHENTICATED.

In https://github.com/gardener/gardener-extension-provider-aws/blob/95bfe29a9df3ee638a7df0ceb02c63f5b19c4deb/pkg/apis/aws/helper/error_codes.go#L14, the error string AuthFailure is already marked as ERR_INFRA_UNAUTHENTICATED.

The custom handling for the events is here: https://github.com/gardener/gardener-extension-provider-aws/blob/95bfe29a9df3ee638a7df0ceb02c63f5b19c4deb/pkg/controller/healthcheck/customroutecontrollerhealth.go#L62-L67

What you expected to happen: AuthFailure error to be flagged with ERR_INFRA_UNAUTHENTICATED, not with ERR_RETRYABLE_INFRA_DEPENDENCIES.
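
To illustrate the expected behavior, here is a minimal Go sketch of message-based classification. It is not the extension's actual code: `classifyEventMessage`, `unauthenticatedRegexp`, and the two string constants are hypothetical names made up for this example. The only things taken from this issue are the two error code names and the fact that `AuthFailure` is already treated as an unauthenticated error in `error_codes.go`. The fallback to the retryable code is an assumption that mirrors what is currently observed.

```go
package main

import (
	"fmt"
	"regexp"
)

// Error code names as they appear in this issue. These are plain strings
// for illustration; the real constants are defined elsewhere.
const (
	codeUnauthenticated            = "ERR_INFRA_UNAUTHENTICATED"
	codeRetryableInfraDependencies = "ERR_RETRYABLE_INFRA_DEPENDENCIES"
)

// unauthenticatedRegexp is a hypothetical pattern. The issue only confirms
// that the string "AuthFailure" should map to the unauthenticated code.
var unauthenticatedRegexp = regexp.MustCompile(`AuthFailure`)

// classifyEventMessage is a hypothetical helper that inspects an event
// message and returns the error code it should be flagged with.
func classifyEventMessage(msg string) string {
	if unauthenticatedRegexp.MatchString(msg) {
		return codeUnauthenticated
	}
	// Assumption: anything not recognized keeps the retryable code,
	// which is what the health check reports today.
	return codeRetryableInfraDependencies
}

func main() {
	msg := "RoutesUpdateFailed: (combined from similar events): " +
		"AuthFailure: AWS was not able to validate the provided access credentials"
	fmt.Println(classifyEventMessage(msg)) // prints ERR_INFRA_UNAUTHENTICATED
}
```

With a check like this applied to the event message, the condition from the status above would carry ERR_INFRA_UNAUTHENTICATED, while unrelated route update failures would stay retryable.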

How to reproduce it (as minimally and precisely as possible): See above.

Anything else we need to know?: N/A

Environment: