Exponentially backoff on reconciliation failure

flux-iac / tofu-controller

A GitOps OpenTofu and Terraform controller for Flux

Apache License 2.0

Currently, the only way to configure retries is spec.retryInterval, which takes a static amount of time.

However, when multiple Terraform resources fail at the same time, this creates a lot of noise, because each resource keeps retrying at the fixed retry interval. A more graceful approach would be to back off the retry interval exponentially.

A new field named spec.retryStrategy could be introduced. Its default value would be StaticInterval, to stay backward compatible, and the user could opt in to ExponentialBackoff instead. With ExponentialBackoff, the first retry would happen after 15 seconds, the next after 30 seconds, and so on, doubling each time up to a maximum requeue time.

Exponentially backoff on reconciliation failure #1335