stefanprodan / timoni

Timoni is a package manager for Kubernetes, powered by CUE and inspired by Helm.
https://timoni.sh
Apache License 2.0

Waiting for CRD resource to become ready on timoni apply #391

Open tired-old-man opened 4 months ago

tired-old-man commented 4 months ago

timoni apply just passes through instead of waiting for a CRD resource to become ready.

The problem is not that Timoni doesn't wait, but that the CRD operator is not fast enough to set status conditions before the readiness check runs.

Details: The apply command (cmd/timoni/apply.go) uses ResourceManager to wait for Kubernetes objects to become ready. The ResourceManager wait logic is defined in fluxcd (ssa/manager_wait.go). It waits for the status "Current". While fluxcd has multiple status reader implementations (deployment, pod, ...), the generic status reader is used for CRD resources. It checks the conditions Reconciling and Ready (cli-utils/kstatus/status/status.go:Compute). If no conditions are set, the computed status is "Current". So if the CRD operator hasn't had a chance to set any condition yet, waiting for the CRD resource to become ready on apply just passes through.
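To make the pass-through concrete, here is a minimal Go sketch of the fallback described above. It is a simplified model and not the kstatus code: the condition type and the computeStatus function are hypothetical names, and it only looks at the Reconciling and Ready conditions mentioned in the report.

```go
package main

import "fmt"

// condition is a pared-down stand-in for an entry in status.conditions.
type condition struct {
	Type   string
	Status string
}

// computeStatus is a hypothetical, simplified model of the generic
// fallback described in the issue. It is NOT the kstatus implementation.
func computeStatus(conditions []condition) string {
	for _, c := range conditions {
		// A resource that reports Reconciling=True is still in progress.
		if c.Type == "Reconciling" && c.Status == "True" {
			return "InProgress"
		}
		// A Ready condition that is not True also means "not done yet".
		if c.Type == "Ready" && c.Status != "True" {
			return "InProgress"
		}
	}
	// Nothing objected, which includes the case where no condition is set.
	return "Current"
}

func main() {
	// Freshly applied object: the operator has not written any condition.
	fmt.Println(computeStatus(nil)) // Current
	// The operator has picked the object up and is working on it.
	fmt.Println(computeStatus([]condition{{Type: "Reconciling", Status: "True"}})) // InProgress
	// The operator has finished.
	fmt.Println(computeStatus([]condition{{Type: "Ready", Status: "True"}})) // Current
}
```

The first call is the race from the report: an object with no conditions is indistinguishable from one that is fully reconciled.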

How to replicate: Make a simple Timoni module that deploys a single CRD object (assuming the operator for the CRD sets a Reconciling or Ready condition). Execute: timoni apply foo . The command passes through without waiting. Then run the same command again: timoni apply foo . The output shows the object as "unchanged", but this time Timoni waits for the CRD resource to become ready, since by then the operator has set its status conditions.

The dirty and incorrect fix: Sleep for some time before waiting for Kubernetes objects to become ready, by adding time.Sleep(time.Duration(500) * time.Millisecond) after line 326 in cmd/timoni/apply.go, before the ResourceManager Wait call. Of course this is not a correct solution, but it gives the operator a chance to set status conditions.

Why not set conditions earlier: The only other place to set conditions would be a MutatingAdmissionWebhook. This is not allowed, and Kubernetes ignores it.

stefanprodan commented 4 months ago

Yes, this is a known issue with kstatus. In Flux CRDs, for example, we've set the observedGeneration default to -1, and this makes kstatus wait for the controller to reconcile the resources and bump the generation to 0.
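The effect of that default can be sketched in Go. This is a simplified model under one assumption: the generic check reports "InProgress" whenever status.observedGeneration is present and differs from metadata.generation. The object type and computeStatus function are hypothetical names, not kstatus or Flux APIs.

```go
package main

import "fmt"

// object holds only the fields this sketch needs.
type object struct {
	Generation         int64  // metadata.generation
	ObservedGeneration *int64 // status.observedGeneration, nil when the field is absent
}

// computeStatus is a hypothetical model of the generation check.
// Assumption: a present observedGeneration that differs from the
// object's generation means the controller has not caught up yet.
func computeStatus(o object) string {
	if o.ObservedGeneration != nil && *o.ObservedGeneration != o.Generation {
		return "InProgress"
	}
	// Field absent or generations match: nothing signals pending work.
	return "Current"
}

func int64Ptr(v int64) *int64 { return &v }

func main() {
	// CRD without a default: a fresh object has no status at all.
	fmt.Println(computeStatus(object{Generation: 1})) // Current
	// CRD that defaults observedGeneration to -1: a fresh object is pending.
	fmt.Println(computeStatus(object{Generation: 1, ObservedGeneration: int64Ptr(-1)})) // InProgress
	// The controller has reconciled and recorded the generation it saw.
	fmt.Println(computeStatus(object{Generation: 1, ObservedGeneration: int64Ptr(1)})) // Current
}
```

With the default in place, the API server itself writes the "not yet observed" marker at creation time, so the wait no longer depends on how quickly the controller reacts.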

We'll need to add custom health checks to Timoni's module API and wire them up to kstatus, so that module authors can tell Timoni that a specific Kubernetes Kind is expected to have some status condition.

I've already done this for Kubernetes Jobs, but for CRDs we need some CUE definition in the Timoni module struct, and we need to dynamically create a StatusReader from it.
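As a rough illustration of the idea, the Go sketch below evaluates an object against per-Kind rules that name the condition expected to be "True". Everything here is hypothetical: the healthCheck type, the statusFor function, and the Widget kind are invented for the example and are not part of Timoni's module API or of kstatus.

```go
package main

import "fmt"

// condition is a pared-down stand-in for an entry in status.conditions.
type condition struct{ Type, Status string }

// healthCheck is a hypothetical rule a module author might declare:
// objects of Kind are ready only once Condition is reported as "True".
type healthCheck struct {
	Kind      string
	Condition string
}

// statusFor applies the first rule matching the object's kind.
// Kinds without a rule fall back to "Current"; this sketch does not
// model the generic condition checks for them.
func statusFor(kind string, conds []condition, checks []healthCheck) string {
	for _, hc := range checks {
		if hc.Kind != kind {
			continue
		}
		for _, c := range conds {
			if c.Type == hc.Condition && c.Status == "True" {
				return "Current"
			}
		}
		// The expected condition is missing or not True yet: keep waiting.
		return "InProgress"
	}
	return "Current"
}

func main() {
	checks := []healthCheck{{Kind: "Widget", Condition: "Ready"}}

	// Fresh Widget with no conditions: the rule forces a wait.
	fmt.Println(statusFor("Widget", nil, checks)) // InProgress
	// The operator has reported Ready=True.
	fmt.Println(statusFor("Widget", []condition{{"Ready", "True"}}, checks)) // Current
	// A kind with no rule keeps the permissive default.
	fmt.Println(statusFor("ConfigMap", nil, checks)) // Current
}
```

The point of the design is that a missing condition counts as "not ready" only for kinds where the module author has said a condition is expected, so resources that never set conditions are unaffected.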