pulumi / pulumi-kubernetes-operator

A Kubernetes Operator that automates the deployment of Pulumi Stacks
Apache License 2.0

Improved Argo CD Integration #752

Open EronWright opened 2 days ago

EronWright commented 2 days ago

While Argo CD has no 'native' support for Pulumi (as it does for Helm), it should be possible to use Argo CD to manage Stack objects and thereby delegate deployment to PKO.

Some areas to investigate are:

  1. Stack lifecycle - Argo CD should be able to create and destroy Stack objects, with correct ordering of the related objects (ServiceAccount, ClusterRoleBinding). For example, deleting a Stack should work gracefully, even when destroyOnFinalize is true.
  2. Resource tracking - resources deployed by Pulumi should appear in the Argo CD resource view. There are two "topologies" to consider: the physical topology (the Workspace and Update objects) and the logical topology of the managed resources.
  3. Application Sets and multi-cluster deployments. Is it possible to use Pulumi to provision a cluster and register it, and then use an ApplicationSet to apply an application (also using Pulumi) to that cluster?
  4. Argo CD prunes objects that it doesn't recognize, and warns of drift. How can it be made to recognize the objects that PKO creates?
  5. When the Stack is selected in the UI, is it possible to enrich the UI, e.g. with a link to the Pulumi Console?
  6. The health and readiness of the Stack object should be known to Argo CD, e.g. so that the sync operation doesn't appear to end prematurely while the Stack is still syncing.
  7. When the user clicks "Sync", force a resync of the Pulumi stack (e.g. using the pulumi.com/reconciliation-request annotation).
  8. More Kubernetes "events" from PKO to show progress.
  9. Provide Lua-based "actions" such as forcing a re-sync.
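
As a rough illustration of item 6, the decision logic of a health check could look like the sketch below. Argo CD custom health checks are written in Lua; this is Python purely to show the mapping, and it would need porting. The condition types `Ready`, `Reconciling` and `Stalled` and the `status.observedGeneration` field are assumptions about the Stack status shape, not something confirmed in this issue. The result values (`Healthy`, `Progressing`, `Degraded`) are Argo CD's standard health statuses.

```python
from typing import Any, Dict, Mapping


def stack_health(stack: Mapping[str, Any]) -> Dict[str, str]:
    """Map a Stack object (as a plain dict) to an Argo CD style health result.

    Assumed (hypothetical) status shape: status.conditions is a list of
    {type, status, message} entries, plus status.observedGeneration.
    """
    status = stack.get("status") or {}
    conditions = {c.get("type"): c for c in (status.get("conditions") or [])}

    def is_true(kind: str) -> bool:
        return conditions.get(kind, {}).get("status") == "True"

    def message(kind: str, default: str) -> str:
        return conditions.get(kind, {}).get("message") or default

    # If the controller has not yet observed the latest spec, any conditions
    # present describe an older generation, so keep the sync "in progress".
    generation = (stack.get("metadata") or {}).get("generation")
    observed = status.get("observedGeneration")
    if generation is not None and observed is not None and generation != observed:
        return {"status": "Progressing", "message": "Waiting for the Stack to be observed"}

    if is_true("Stalled"):
        return {"status": "Degraded", "message": message("Stalled", "Stack is stalled")}
    if is_true("Reconciling"):
        return {"status": "Progressing", "message": message("Reconciling", "Stack is reconciling")}
    if is_true("Ready"):
        return {"status": "Healthy", "message": message("Ready", "Stack is ready")}

    # No usable conditions yet (e.g. a freshly created Stack).
    return {"status": "Progressing", "message": "Waiting for Stack status"}
```

Checking the observed generation first is what keeps the sync operation from appearing to finish early: a Stack that was `Ready` before the latest change keeps reporting `Progressing` until the controller has picked up the new spec.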

EronWright commented 1 day ago

Some findings:

  1. Argo CD uses foreground deletion by default, which causes the Workspace to be deleted eagerly during Stack deletion. When destroyOnFinalize is set, a replacement Workspace must then be provisioned to run the destroy operation. With background deletion, the Workspace is instead left to be garbage collected. A possible improvement would be to have the stack controller proactively delete the Workspace during finalization.
  2. To explicitly trigger a re-sync of the Pulumi program, consider contributing a Lua script with a "resync" action that sets the pulumi.com/reconciliation-request annotation. To implicitly trigger a re-sync, consider using a hook.
  3. It is possible to disable reconciliation of an Application using skip-reconcile (see ArgoCD Application Pull Controller that replicates status).
  4. It isn't clear whether automatic label/annotation propagation would work well, e.g. https://github.com/argoproj-labs/argocd-operator/pull/414
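
To make finding 2 concrete, here is a minimal sketch of the patch a "resync" action would effectively apply. It builds a JSON merge patch that sets the pulumi.com/reconciliation-request annotation to a fresh value. Using a UTC timestamp as the value is an assumption made for illustration (the point is only that the value changes on every request), and `resync_patch` is a hypothetical helper name. A real Argo CD action would be written in Lua rather than Python.

```python
import json
from datetime import datetime, timezone
from typing import Optional

RECONCILE_ANNOTATION = "pulumi.com/reconciliation-request"


def resync_patch(now: Optional[datetime] = None) -> str:
    """Return a JSON merge patch that sets the annotation to a new value.

    Assumption: a changed annotation value is what prompts a re-sync, so a
    timestamp is used to make every request distinct.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    patch = {"metadata": {"annotations": {RECONCILE_ANNOTATION: stamp}}}
    return json.dumps(patch, sort_keys=True)
```

The resulting string can be passed to `kubectl patch` with `--type merge` against a Stack object to try the idea out by hand before packaging it as an action.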