rancher / turtles

Rancher CAPI extension
https://turtles.docs.rancher.com
Apache License 2.0

Check agent deployment based on `Ready` condition #591

Closed Danil-Grigorev closed 2 months ago

Danil-Grigorev commented 3 months ago

What this PR does / why we need it:

`status.agentDeployed` and the corresponding condition on the management v3 cluster do not always reflect whether the agent manifests are actually present. If the old manifests are being removed by a cluster re-import or migration, the newly applied manifests can be purged by that cleanup as well. The `Ready` condition should better reflect whether Turtles needs to stop applying manifests.

Rancher performs the cleanup job here: https://github.com/rancher/rancher/blob/1c2b815c8dd66e141b21630904127adc8e03c3de/pkg/controllers/management/usercontrollers/controller.go#L289

We need to check whether the job exists before performing the import operation and, if the job or the previous manifests are present, wait until they are gone.

Which issue(s) this PR fixes:

Fixes #587

Danil-Grigorev commented 3 months ago

It seems that `status.ready` does not reflect the state of the agent either. Only the `Ready` condition, on both the provisioningv1 and the managementv3 cluster, seems to be up to date:

  - lastUpdateTime: "2024-07-08T13:41:50Z"
    message: Cluster agent is not connected
    reason: Disconnected
    status: "False"
    type: Ready
  fleetWorkspaceName: creategitops-lfbrqx
  observedGeneration: 2
  ready: true
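
In the excerpt above, `ready: true` sits alongside a `Ready` condition with status `"False"`. A minimal sketch of deciding on the condition instead of the boolean might look like this. The `Condition` and `ClusterStatus` types are simplified, hypothetical stand-ins for the real Rancher API types, and `agentReady` is an illustrative helper, not the function added by this PR.

```go
package main

import "fmt"

// Condition mirrors only the fields visible in the status excerpt.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// ClusterStatus is a hypothetical, trimmed-down view of the cluster status.
type ClusterStatus struct {
	Ready      bool
	Conditions []Condition
}

// agentReady reports whether a Ready condition exists and has status "True".
// It deliberately ignores the status.ready boolean, which can be stale.
func agentReady(s ClusterStatus) bool {
	for _, c := range s.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	// No Ready condition reported yet: treat the agent as not ready.
	return false
}

func main() {
	s := ClusterStatus{
		Ready: true,
		Conditions: []Condition{{
			Type:    "Ready",
			Status:  "False",
			Reason:  "Disconnected",
			Message: "Cluster agent is not connected",
		}},
	}
	fmt.Println("status.ready:", s.Ready, "Ready condition:", agentReady(s))
}
```

Treating a missing condition as not ready is an assumption made here for safety; a real controller may prefer to distinguish "unknown" from "false".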