gardener / dashboard

Web-based GUI for Gardener installations.
Apache License 2.0

No operator action required for `CRDsWithProblematicConversionWebhooks` #2108

Open acumino opened 1 month ago

acumino commented 1 month ago

What would you like to be added: Currently, clusters on which the CRDsWithProblematicConversionWebhooks condition is true are not marked as "no operator action required". If this condition is present on a cluster, the cluster should be marked as "no operator action required".

Why is this needed: This condition occurs when a user has CRDs that don't follow best practices, and the operator can't do anything to fix it. To keep the dashboard clean, clusters with the CRDsWithProblematicConversionWebhooks condition set to true should be marked as "no operator action required".

petersutter commented 1 month ago

Currently, the dashboard checks the error codes under status.lastErrors, where there is also ERR_PROBLEMATIC_WEBHOOK, which we flag as a user error. What lastError is present when the CRDsWithProblematicConversionWebhooks condition is true?
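For illustration, a check of this kind could look roughly like the sketch below. It is not the dashboard's actual code: the `LastError` and `ShootStatus` shapes are simplified, `USER_ERROR_CODES` and `isUserError` are hypothetical names, and the rule that every entry must carry a user error code is an assumption.

```typescript
// Hypothetical, simplified shapes; the real shoot status has more fields.
interface LastError {
  description: string;
  codes?: string[];
}

interface ShootStatus {
  lastErrors?: LastError[];
}

// Assumption: the set of codes treated as user errors. Only
// ERR_PROBLEMATIC_WEBHOOK is named in this thread.
const USER_ERROR_CODES: ReadonlySet<string> = new Set([
  "ERR_PROBLEMATIC_WEBHOOK",
]);

// Returns true when there is at least one lastError and every lastError
// carries at least one user error code (assumed rule, see lead-in).
function isUserError(status: ShootStatus): boolean {
  const errors = status.lastErrors ?? [];
  return (
    errors.length > 0 &&
    errors.every((e) => (e.codes ?? []).some((c) => USER_ERROR_CODES.has(c)))
  );
}
```

Under this rule a constraint alone never marks a cluster, because constraints are not listed under status.lastErrors.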

petersutter commented 1 month ago

ping @acumino

acumino commented 1 day ago

CRDsWithProblematicConversionWebhooks is just a constraint in the shoot status and not an error. It should not be considered for "no operator action required", since that can lead to other errors being ignored.

  constraints:
    - type: CRDsWithProblematicConversionWebhooks
      status: 'False'
      lastTransitionTime: ''
      lastUpdateTime: ''
      reason: CRDsWithProblematicConversionWebhooks
      message: >-
        Some CRDs in your cluster have multiple stored versions present and have
        a conversion webhook configured: <webhook-name>. Please see
        https://github.com/gardener/gardener/blob/master/docs/usage/shoot/shoot_status.md#constraints
        for more details.

acumino commented 1 day ago

As of now, when the clusters in the dashboard are sorted by "Issue since", this constraint is also taken into account. It would be better to exclude it, because it gives the impression that the cluster has had an issue for a very long time, even if the actual error is a transient one lasting only a few seconds.
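
A rough sketch of the proposed behavior, not the dashboard's actual implementation: the `StatusEntry` shape, the `IGNORED_TYPES` set, and the `issueSince` helper are hypothetical, and treating every entry whose status is not 'True' as an issue is an assumption.

```typescript
// Hypothetical, simplified shape of a condition or constraint entry.
interface StatusEntry {
  type: string;
  status: string; // e.g. 'True', 'False'
  lastTransitionTime: string; // ISO 8601 timestamp
}

// Assumption: entry types that should not influence "Issue since".
const IGNORED_TYPES: ReadonlySet<string> = new Set([
  "CRDsWithProblematicConversionWebhooks",
]);

// Returns the earliest transition time (ms since epoch) among entries that
// count as an issue, or undefined when no such entry exists.
function issueSince(entries: StatusEntry[]): number | undefined {
  const times = entries
    // Assumed rule: any status other than 'True' counts as an issue.
    .filter((e) => e.status !== "True" && !IGNORED_TYPES.has(e.type))
    .map((e) => Date.parse(e.lastTransitionTime))
    // Skip entries with an empty or unparsable timestamp.
    .filter((t) => !Number.isNaN(t));
  return times.length > 0 ? Math.min(...times) : undefined;
}
```

With this filter, a cluster whose only long-standing entry is the constraint sorts by its actual, possibly transient, error instead.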