Azure / orkestra

Orkestra is a cloud-native release orchestration and lifecycle management (LCM) platform for the fine-grained orchestration of inter-dependent helm charts and their dependencies
https://azure.github.io/orkestra

HelmRelease doesn't continually re-reconcile on Argo Workflow Step #213

Closed jonathan-innis closed 3 years ago

jonathan-innis commented 3 years ago

Describe the bug

When the interval is long and the HelmRelease doesn't reconcile immediately, it can take up to the full interval specified for the Helm chart before the HelmRelease re-reconciles. When that wait exceeds 5 minutes, the entire workflow can fail.

To Reproduce

Steps to reproduce the behavior:

  1. Install the Orkestra chart
  2. Install ./config/samples/bookinfo.yaml (this issue is intermittent, so you may not see it on the first attempt)

Expected behavior

So that we don't ignore the user-specified reconcile interval, we should install the HelmRelease and then force a re-reconcile on a polling interval, using the reconcile annotation provided through the HelmRelease API.

Additional context

More information on how to force a re-reconcile can be found in the reconcile command of the Flux CLI: https://github.com/fluxcd/flux2/blob/main/cmd/flux/reconcile.go

nitishm commented 3 years ago

So this is a delay caused by the helm-controller reconciliation interval? If so, can we always keep it short, since we don't necessarily need the end user to tweak it?

jonathan-innis commented 3 years ago

Yes, it is. I think we should re-reconcile very quickly while we are checking the status with the Argo workflow, but not during standard reconciliation. Otherwise, this is going to put a lot of load on the API server for no reason.

jonathan-innis commented 3 years ago

@nitishm I'm picking this up and prioritizing this above testing because otherwise our test cases are going to take forever

nitishm commented 3 years ago

Go for it!!

jonathan-innis commented 3 years ago

Looking at it, it seems we need to force a re-reconcile after updating the HelmRepository/pushing the Helm charts. I'm going to try this, and I suspect it will fix it 😄