vmware-tanzu / helm-charts

Contains Helm charts for Kubernetes related open source tools
https://vmware-tanzu.github.io/helm-charts/
Apache License 2.0
250 stars 362 forks source link

Gitops tools (argocd) removes created backup resources #503

Open luettmattenhas opened 1 year ago

luettmattenhas commented 1 year ago

What steps did you take and what happened:

In our Project we use a gitlab ci pipeline with helm to template helm charts and commit these templated manifests to the git repository. From there they are synced via argocd to the kubernetes. With this commit the helm hooks got removed which until now prevented the pruning of newly created backup resources.

What did you expect to happen:

Created backups won't be deleted via Gitops tools.

Anything else you would like to add:

I guess the deletion could be mitigated with the "useOwnerReferencesInBackup" option in schedules but they have their own pitfalls. E.g. the backups resources will be deleted if the schedule is changed.

I would like be able to add custom annotations to only the backup resources like "argocd.argoproj.io/sync-options: Prune=false".

Environment:

blackpiglet commented 1 year ago

Thanks for reporting, but I'm a little confused as to why the Helm Hook helping to make resources not deleted by ArgoCD.

Is it possible to set Velero resources as an exception to ArgoCD's orphan resources? https://argo-cd.readthedocs.io/en/stable/user-guide/orphaned-resources/

rgarrigue commented 1 year ago

Hi

Got the same issue here : deploying Velero using Helm chart via ArgoCD. When I ask for a new backup, I can see it briefly appearing in ArgoCD UI before it prunes it.

I think the cleanest solution here would be to be able to add the argocd.argoproj.io/sync-options: Prune=false annotation to the backup object. And any others Velero custom resources which are meant to stay around actually.

Another solution could be to remove ArgoCD label, argocd.argoproj.io/instance: my-argocd-app-for-velero

rgarrigue commented 1 year ago

So, found a half working solution for me, adding annotations to the schedule, which are c/c'ed on the backups resources

schedules:
  my-app-backup:
    annotations:
      argocd.argoproj.io/compare-options: IgnoreExtraneous
      argocd.argoproj.io/sync-options: Delete=false,Prune=false

2nd line protects backups from ArgoCD, all good 1st was meant to makes ArgoCD velero app in sync despite the extraneous resource, but seems it doesn't work. Someone on ArgoCD's Slack channel suggested using https://argo-cd.readthedocs.io/en/stable/user-guide/resource_tracking/#additional-tracking-methods-via-an-annotation, I'll try that somewhen

luettmattenhas commented 1 year ago

@blackpiglet I guess it is because how ArgoCD is converting the helm hooks and is using them. The PostSync is executed after all sync hooks completed and were successful, so the sync of our local manifests don't prune the backup resources as they are evaluated after the sync and then never get deleted. https://argo-cd.readthedocs.io/en/stable/user-guide/resource_hooks/

If i understood correctly the orphan resources can only be set on project level which is unfortunately not feasible.

@rgarrigue yes i though about this schedules sync-options solution, too. But it feels not like the cleanest solution, as this option also set the sync-options to the schedules itself, which may not be wanted.

blackpiglet commented 1 year ago

@luettmattenhas Thanks for the feedback.

I also found some information related to the IgnoreExtraneous annotation setting. It has some impact on the schedule resource. https://github.com/argoproj/argo-cd/issues/6324

A workaround is to set argocd.argoproj.io/compare-options: IgnoreExtraneous and possibly argocd.argoproj.io/sync-options: Prune=false on the original resource. However, this leads to the original resource not being updated when the manifest changes and not being pruned when it is deleted from the source manifests, so the original resource eventually becomes a stale and unmanaged one.

If ArgoCD is deployed by ArgoCD-Operator, there is another option: resourceExclusions.

https://argocd-operator.readthedocs.io/en/latest/reference/argocd/#resource-exclusions-example

This is an example.

apiVersion: argoproj.io/v1alpha1
kind: ArgoCD
metadata:
  name: velero-argocd
  namespace: velero
spec:
  resourceExclusions: |
    - apiGroups:
      - velero.io
      kinds:
      - Backup
      - Restore
      clusters:
      - "*"
blackpiglet commented 1 year ago

Just found this issue https://github.com/argoproj/argo-cd/issues/7536. I didn't know that there was a relation between the ArgoCD hook and the Helm hook.

luettmattenhas commented 1 year ago

@blackpiglet Thanks for the feedback.

I stumbled upon this possibility, too. And I think we will look into this and implement it.

But it still seems like a cumbersome approach to fix something we could possible fix with a simple annotation :). Is there any change you want to provide some feature like that in the future, or could we support there with a PR?

blackpiglet commented 1 year ago

@jenting @qiuming-best What do you think about this? As @luettmattenhas mentioned, if we add the removed helm hook annotation back, the problem should also be fixed.

blackpiglet commented 1 year ago

@jenting @luettmattenhas mentioned this commit https://github.com/vmware-tanzu/helm-charts/commit/73f33fe4a19c8a7c7648ba2807e832e05c386a0e, which removed the helm hook from the Velero CRs, such as Schedule. The modification makes GitOps tools, like ArgoCD, delete the Velero Backup generated from Schedule.

This is because the helm hooks are mapped to the ArgoCD hooks. The ArgoCD hooks prevent the Backup from ArgoCD pruning.

We discussed the possible solutions. I suggested configuration from the ArgoCD side but @luettmattenhas thought it's better to add the removed helm hook back.

jenting commented 1 year ago

But I recall even we add the helm annotation back. If the user is using the ArgoCD, the user have to disable the helm annotation hook by setting helmHookAnnotations=false. So, I didn't get why adding the annotation hook back is helpful for ArgoCD + Velero helm chart now 🤔

blackpiglet commented 1 year ago

Looks like different use cases have some conflict. Could you recall why the helm annotation hooks need to be removed for the ArgoCD scenario?

jenting commented 1 year ago

https://github.com/vmware-tanzu/helm-charts/pull/171

This is the 1st PR support disable helm hooks for ArgoCD.