Open seaneagan opened 7 months ago
can you give more information on the issue?
The code is good for me. We expected a dependency to be ready only if the current version have been successfully reconciled.
The Ready
condition's ObservedGeneration
is reliably getting updated, but not .Status.ObservedGeneration
, which is resulting in "dependency not ready" even though it is ready. I think these two fields are mostly duplicative, but probably too late to remove .Status.ObservedGeneration
at this point anyway? But I wonder if they should simply mirror each other?
@seaneagan I don't see how .Status.ObservedGeneration
is not updated if the release upgrade succeeds given: https://github.com/fluxcd/helm-controller/blob/5e760db4a83ffd384012c1b47e4615cc07d4081d/internal/controller/helmrelease_controller.go#L180-L182
Can you please add a unit test to prove this can actually happen?
Removing .Status.ObservedGeneration
is not an option, this field is used by kstatus to determine readiness and almost all standard Kubernetes APIs have it. If .Status.ObservedGeneration
is not kept up to date, then Flux kustomize-controller would be stuck waiting for HelmReleases which no one reported so far, so I'm really puzzled by your report.
is it possible that an error which is not a TerminalError can happen while the Ready condition still gets set to True? What is the intention of when .status.observedGeneration should get updated? Would it make sense to simply always update it since in the most basic sense the generation has been observed? Or perhaps it can look at the conditions ObservedGenerations to determine whether it has been observed "enough" to ensure consistency vs looking at the transient result of the individual reconciliation?
separately, would it be ok to consider my proposed change independently of the above, as it feels more correct to rely on the Ready condition ObservedGeneration as it is intimately tied to the Ready condition status which is also being checked for dependencies?
trying to translate the intention of that if condition into status conditions, perhaps if Ready==true or Stalled==true at that generation, then that generation has been observed?
ok, so it looks like i was hitting #925 which was still allowing the Ready
condition to be set to true, even though Reconciling
was also true. Should Ready
take Reconciling
into account?
this seems pretty nice, wonder if it can be integrated? https://github.com/fluxcd/pkg/blob/60815565aaaa9bc92edc296a107288089764a384/runtime/reconcile/result.go#L89
I think we need a test for when drift correction is in loop that should used the kstatus verifier that catches any miss configuration https://github.com/fluxcd/kustomize-controller/blob/e9f5628eccbfbc722a7637ecbf7f66580e2e4416/internal/controller/kustomization_wait_test.go#L135
The checks and logic we use for the Ready/Stalled/Reconciling is here https://github.com/fluxcd/pkg/blob/60815565aaaa9bc92edc296a107288089764a384/runtime/conditions/check/fail.go
I think this will fix it https://github.com/fluxcd/helm-controller/pull/885
@seaneagan with #885 merged I think this will solve your issue. Ready will reflect now that drift detection is running without reaching correction.
It appears
.status.observedGeneration
does not reliably get updated after successful helm release upgrades. I can work on filing a separate bug for this, need to find time to come up with a minimal re-production, seems possible an issue could have been introduced in https://github.com/fluxcd/helm-controller/pull/825/commits/48cad6838633344b70e50670c791ea3399097dd5However, when checking dependencies it seems preferable to rely solely on the
Ready
condition, including both itsStatus
andObservedGeneration
and ignore.Status.ObservedGeneration
. Which then in turn has the side effect of avoiding the above mentioned bug.https://github.com/fluxcd/helm-controller/blob/5e760db4a83ffd384012c1b47e4615cc07d4081d/internal/controller/helmrelease_controller.go#L588