hashicorp / vault-secrets-operator

The Vault Secrets Operator (VSO) allows Pods to consume Vault secrets natively from Kubernetes Secrets.
https://hashicorp.com

add argo.Rollout support for RolloutRestartTarget #702

Closed thyton closed 2 months ago

thyton commented 2 months ago

add argo.Rollout support for RolloutRestartTarget

thyton commented 2 months ago

Thank you both for all the feedback!

Hige commented 3 days ago

Argo doesn't have a resource like "argo.Rollout" in "Kind," there is simply "Rollout." Example: https://github.com/argoproj/argocd-example-apps/blob/master/blue-green/templates/rollout.yaml#L2

Probably because of this, the pod restart does not work for me when secrets in the vault are changed, when I specify "argo.Rollout." Meanwhile, it works with "Deployment."


rolloutRestartTargets:
  - kind: argo.Rollout
    name: {{ .Release.Name }}-app

Do you really need to use argo.Rollout instead of Rollout?

thyton commented 3 days ago

@Hige We use argo.Rollout ([project/origin name].[kind]) to:

  1. Separate it from the K8s builtin rollout kinds like Deployment.
  2. Avoid future duplications if we support Rollout from a different project that is not argo. (foo.Rollout)
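To illustrate the [project/origin name].[kind] convention described above, here is a minimal Go sketch. The helper name splitQualifiedKind is hypothetical and is not taken from the VSO codebase; it only shows how a prefixed kind such as argo.Rollout can be told apart from a builtin kind such as Deployment.

```go
package main

import (
	"fmt"
	"strings"
)

// splitQualifiedKind splits a "[project].[kind]" string such as "argo.Rollout"
// into its project and kind parts. Builtin kinds like "Deployment" carry no
// prefix, so project comes back empty.
// Hypothetical helper for illustration only, not VSO code.
func splitQualifiedKind(s string) (project, kind string) {
	if before, after, found := strings.Cut(s, "."); found {
		return before, after
	}
	return "", s
}

func main() {
	for _, k := range []string{"Deployment", "argo.Rollout", "foo.Rollout"} {
		project, kind := splitQualifiedKind(k)
		fmt.Printf("%-14s project=%q kind=%q\n", k, project, kind)
	}
}
```

Under this scheme, argo.Rollout and a possible future foo.Rollout stay distinct even though both projects name their resource Rollout.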

In the implementation of argo.Rollout support, we patch the Rollout object's Spec.RestartAt to request the restart, and an integration test verifies that the object's generation reflects the patch.

Note: The Argo Rollouts controller (not VSO) reconciles argo.Rollout objects and is responsible for restarting the pods based on the Rollout object's Spec.RestartAt.

Hige commented 2 days ago

Thank you for the detailed response. Why might a pod restart not work? I am using the image quay.io/argoproj/argo-rollouts:1.6.6 and applying the following manifest:

---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
  name: {{ .Release.Name }}-env
  namespace: {{ .Release.Namespace }}
spec:
  vaultAuthRef: {{ .Release.Name }}-app
  type: kv-v2
  mount: secret
  path: {{ .Release.Name }}/config
  refreshAfter: 10s
  destination:
    name: vault-env
    create: true
    overwrite: true
  rolloutRestartTargets:
    - kind: argo.Rollout
      name: {{ .Release.Name }}-app

I change the env in the vault. I check that they are applied in the secret (this indeed works). I check if the pod restarted, but the pod did not restart and the "RestartAt" annotation was not added.

However, if I use the "Deployment" resource, everything works as expected.

Could you advise what can be done? What to check. There was no information about restarts in the "vault-secrets-operator" logs. Could the problem be on your side?

thyton commented 1 day ago

@Hige Sorry for the experience you're having.

> What to check.

Based on this code block, you could check the following:

  1. K8s events via kubectl events. I wonder if you see any K8s events with the reason "RolloutRestartFailed" or "RolloutRestartTriggered".
    kubectl events -A --watch -o json | jq 'select(.reason | startswith("RolloutRestart")) | .'
  2. VSO logs. Note: Debug mode is required to see "Rollout restart succeeded".
    if errs != nil {
      logger.Error(errs, "Rollout restart failed", "targets", targets)
    } else {
      logger.V(consts.LogLevelDebug).Info("Rollout restart succeeded", "total", len(targets))
    }

I see that you use Helm to deploy VSO, so for now we can rule out potential issues with the manager's RBAC permissions to update argo.Rollout resources.

If the outputs of 1 and 2 don't both indicate that the rollout restart was triggered, could you redact and share them? Otherwise, the Argo Rollouts controller's log is probably the next place to check why the pod did not restart.

> I check if the pod restarted, but the pod did not restart and the "RestartAt" annotation was not added.

We don't use the annotation update approach, which is used for Deployment, to trigger the restart for argo.Rollout. We patch the Rollout object's Spec.RestartAt. Here is the implementation for more details.
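Since the restart request lives in the spec rather than in an annotation, one way to verify it is to inspect spec.restartAt and metadata.generation on the Rollout itself. The Go sketch below does that for JSON such as the output of kubectl get on the Rollout with -o json. The type and function names are made up for this example, and the sample document is a trimmed, invented payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rollout models only the two fields inspected here; everything else in the
// object is ignored by the decoder.
type rollout struct {
	Metadata struct {
		Generation int64 `json:"generation"`
	} `json:"metadata"`
	Spec struct {
		RestartAt string `json:"restartAt"`
	} `json:"spec"`
}

// restartInfo returns spec.restartAt (empty if unset) and metadata.generation
// from a Rollout serialized as JSON. Hypothetical helper for illustration.
func restartInfo(raw []byte) (restartAt string, generation int64, err error) {
	var r rollout
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", 0, err
	}
	return r.Spec.RestartAt, r.Metadata.Generation, nil
}

func main() {
	// Trimmed, invented sample standing in for a real Rollout object.
	sample := []byte(`{"metadata":{"generation":3},"spec":{"restartAt":"2024-01-02T03:04:05Z"}}`)
	restartAt, gen, err := restartInfo(sample)
	if err != nil {
		panic(err)
	}
	if restartAt == "" {
		fmt.Println("spec.restartAt is not set; no restart has been requested")
		return
	}
	fmt.Printf("restart requested at %s (generation %d)\n", restartAt, gen)
}
```

If spec.restartAt is set and the generation has increased after a secret change, the request reached the Rollout, and the Argo Rollouts controller is the next place to look.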