fluxcd / flux2

Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
https://fluxcd.io
Apache License 2.0

kstatus Unable to Health Check Objects with String Typed status.observedGeneration #4632

Open isugimpy opened 6 months ago

isugimpy commented 6 months ago

Describe the bug

When reconciling resources whose status.observedGeneration is a string rather than an int64, health checking fails because the field cannot be parsed. An example is reported in https://github.com/fluxcd/flux2/discussions/1476, and the problem is also mentioned on the Istio repo. I'm currently hitting this when using Argo Rollouts with Flux.

Steps to reproduce

  1. Configure a Kustomization with spec.wait: true that contains a rollouts.argoproj.io/v1alpha1 resource
  2. Wait for the Kustomization to reconcile
  3. Observe that the wait fails with a message like the one in the attached screenshot.

Expected behavior

Flux should find that the field exists and attempt to convert the string to an int64, only failing if it cannot be parsed as an integer.
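
As a rough illustration of the behavior described above, here is a minimal Go sketch of a lenient lookup. It is not the actual kstatus code or the patch mentioned in this issue: the helper name observedGeneration and the plain map[string]interface{} input are assumptions made for the example, and only the Go standard library is used.

```go
package main

import (
	"fmt"
	"strconv"
)

// observedGeneration is a hypothetical helper (not part of kstatus).
// It reads status.observedGeneration from an unstructured object and
// accepts either a numeric value or a string that parses as an integer.
func observedGeneration(obj map[string]interface{}) (gen int64, found bool, err error) {
	status, ok := obj["status"].(map[string]interface{})
	if !ok {
		return 0, false, nil // no status block: treat the field as absent
	}
	raw, ok := status["observedGeneration"]
	if !ok {
		return 0, false, nil
	}
	switch v := raw.(type) {
	case int64:
		return v, true, nil
	case float64:
		// encoding/json decodes JSON numbers into float64.
		if v != float64(int64(v)) {
			return 0, true, fmt.Errorf("status.observedGeneration %v is not an integer", v)
		}
		return int64(v), true, nil
	case string:
		// The case from this issue: the field is present but typed as a string.
		n, parseErr := strconv.ParseInt(v, 10, 64)
		if parseErr != nil {
			return 0, true, fmt.Errorf("status.observedGeneration %q is not an integer: %w", v, parseErr)
		}
		return n, true, nil
	default:
		return 0, true, fmt.Errorf("status.observedGeneration has unsupported type %T", raw)
	}
}

func main() {
	// An object whose status reports the generation as a string.
	rollout := map[string]interface{}{
		"status": map[string]interface{}{"observedGeneration": "3"},
	}
	gen, found, err := observedGeneration(rollout)
	fmt.Println(gen, found, err) // prints: 3 true <nil>
}
```

With this approach the health check only fails when the value is present but cannot be read as an integer, for example "abc"; a missing field is reported as not found rather than as an error.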

Screenshots and recordings

image

OS / Distro

N/A

Flux version

N/A

Flux check

► checking prerequisites
✔ Kubernetes 1.27.10+k3s2 >=1.26.0-0
► checking version in cluster
✔ distribution: flux-v2.0.1
✔ bootstrapped: true
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.35.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.0.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.0.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.0.1
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

N/A

Container Registry provider

N/A

Additional context

I've tested this on 2.0.1 and 2.2.3 and observed the same behavior on both (though, as I recall, 2.0.1 may not be using kstatus). The fix appears to be a simple patch to https://github.com/fluxcd/cli-utils/blob/5af6753e42af4622cd7d6e16ffe1fb2f946a2103/pkg/kstatus/status/generic.go#L73-L99, which I've written and tested locally and am willing to submit if it would be accepted.

isugimpy commented 6 months ago

Loosely related to this, would Flux maintainers be open to adding more known objects to kstatus beyond simply core and generic?

souleb commented 6 months ago

> Loosely related to this, would Flux maintainers be open to adding more known objects to kstatus beyond simply core and generic?

There is an RFC in the works that will address this: https://github.com/fluxcd/flux2/pull/4528