fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

gotk_reconcile_condition "{status="False", type="Ready"} 1" when reconciling, should be "Unknown" #890

Closed by dkulchinsky 9 months ago

dkulchinsky commented 9 months ago

Describe the bug

While a HelmRelease is reconciling, the Ready status of the HelmRelease resource is Unknown, as expected:

❯ flux get hr -n platform-logging
NAME                REVISION    SUSPENDED   READY   MESSAGE
fluentd-plat-usc1   1.8.0       False       Unknown Reconciliation in progress

However, at the same time, the gotk_reconcile_condition metric exposes the following time series:

{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="False", type="Ready"} 1
{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="True", type="Ready"} 0
{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="Unknown", type="Ready"} 0

We expected {status="Unknown", type="Ready"} to be 1.
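To make the expectation concrete, here is a minimal Python sketch of the one-series-per-status layout shown above, where only the series matching the resource's current Ready status is 1. The function names `expected_condition_series` and `render` are hypothetical and used only for illustration; this is not the controller's actual implementation.

```python
# Illustrative only: models the one-series-per-status layout from the report.
# The function names are hypothetical, not part of helm-controller.
STATUSES = ("True", "False", "Unknown")


def expected_condition_series(kind, name, namespace, ready_status):
    """Return {label tuple: value}, with 1 only for the current Ready status."""
    if ready_status not in STATUSES:
        raise ValueError(f"unexpected Ready status: {ready_status!r}")
    series = {}
    for status in STATUSES:
        labels = (
            ("kind", kind),
            ("name", name),
            ("namespace", namespace),
            ("status", status),
            ("type", "Ready"),
        )
        series[labels] = 1 if status == ready_status else 0
    return series


def render(series):
    """Format each series the way it is written in this report."""
    lines = []
    for labels, value in series.items():
        body = ", ".join(f'{key}="{val}"' for key, val in labels)
        lines.append("{" + body + "} " + str(value))
    return lines


if __name__ == "__main__":
    expected = expected_condition_series(
        "HelmRelease", "fluentd-plat-usc1", "platform-logging", "Unknown"
    )
    for line in render(expected):
        print(line)
```

With the Ready status at Unknown, the sketch sets only the status="Unknown" series to 1, which is what we expected to see during reconciliation.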

Steps to reproduce

  1. Deploy a HelmRelease
  2. Modify the HelmRelease to trigger an update
  3. Observe the gotk_reconcile_condition time series for said HelmRelease

Expected behavior

The Ready status reported by gotk_reconcile_condition should match the Ready status of the HelmRelease resource.
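The check described here can be sketched in Python as follows. It parses the three time series quoted in this report, finds which status label carries the value 1, and compares it with the status shown by `flux get hr`. The helper names `parse_series` and `active_ready_status` are hypothetical, and the parser only handles the simple line format used in this issue, not the full Prometheus exposition format.

```python
import re

# Matches label pairs such as status="False" in the lines quoted in this issue.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_series(line):
    """Split one series line into (labels dict, float value)."""
    label_part, _, value_part = line.rpartition("}")
    labels = dict(LABEL_RE.findall(label_part))
    return labels, float(value_part)


def active_ready_status(lines):
    """Return the status label of the Ready series whose value is 1, or None."""
    for line in lines:
        labels, value = parse_series(line)
        if labels.get("type") == "Ready" and value == 1:
            return labels.get("status")
    return None


# The series as reported while the HelmRelease was reconciling.
OBSERVED = [
    '{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="False", type="Ready"} 1',
    '{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="True", type="Ready"} 0',
    '{kind="HelmRelease", name="fluentd-plat-usc1", namespace="platform-logging", status="Unknown", type="Ready"} 0',
]

if __name__ == "__main__":
    resource_status = "Unknown"  # READY column from `flux get hr`
    metric_status = active_ready_status(OBSERVED)
    print(f"resource={resource_status} metric={metric_status}")
    print("match" if resource_status == metric_status else "mismatch")
```

Run against the series above, the metric side resolves to False while the resource reports Unknown, which is the mismatch this issue describes.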

Screenshots and recordings

No response

OS / Distro

N/A

Flux version

flux: v2.1.2

Flux check

► checking prerequisites
✗ flux 2.1.2 <2.2.3 (new version is available, please upgrade)
✔ Kubernetes 1.26.12-gke.1111000 >=1.25.0-0
► checking controllers
✔ helm-controller: deployment ready
► /external/fluxcd/helm-controller:v0.36.2
✔ kustomize-controller: deployment ready
► /external/fluxcd/kustomize-controller:v1.1.1
✔ notification-controller: deployment ready
► /external/fluxcd/notification-controller:v1.1.0
✔ source-controller: deployment ready
► c/external/fluxcd/source-controller:v1.1.2
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

GitHub

Container Registry provider

Harbor

Additional context

No response

darkowlzz commented 9 months ago

Hi, this issue seems to be for an old version of helm-controller. helm-controller v0.37.0 (part of flux v2.2.x) was a major rewrite and dropped the deprecated metrics; see https://github.com/fluxcd/helm-controller/commit/0919fb4c2447afdc6012f4423df676e2e0c9ed9f. gotk_reconcile_condition is a deprecated metric; see the docs at https://fluxcd.io/flux/monitoring/metrics/#warning-deprecated-resource-metrics. Such condition metrics are now scraped using kube-state-metrics, as documented in https://fluxcd.io/flux/monitoring/metrics/. Those metrics will contain the correct value for the Ready status.

dkulchinsky commented 9 months ago

Thanks @darkowlzz, will close this and look into upgrading to flux v2.2

runningman84 commented 4 weeks ago

ghcr.io/fluxcd/helm-controller:v1.0.1 does not provide any metric named gotk_reconcile_condition. I only see metrics like gotk_reconcile_duration_seconds_bucket, gotk_reconcile_duration_seconds_count, and gotk_reconcile_duration_seconds_sum.