maorfr / helm-backup

Helm plugin which performs backup/restore of releases in a namespace to/from a file

Couldn't restore due to config map conflict #9

Open EvgeniGordeev opened 5 years ago

EvgeniGordeev commented 5 years ago

Scenario:

  1. helm backup --file dev.tgz dev in the old cluster
  2. Move secrets from the old cluster to the new one (rough sketch below)
  3. helm backup --restore --file dev.tgz dev in the new cluster
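
Roughly, step 2 looks like this (a sketch only: the kubectl context names old-cluster and new-cluster are placeholders, and ServiceAccount token secrets should be skipped):

$ kubectl --context old-cluster --namespace dev get secrets -o yaml > dev-secrets.yaml
# strip cluster-specific metadata (resourceVersion, uid, creationTimestamp) from dev-secrets.yaml if needed
$ kubectl --context new-cluster --namespace dev apply -f dev-secrets.yaml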

Output:

2019/08/01 14:56:03 applying backup data to tiller (this command will fail if releases exist)
2019/08/01 15:28:47 Error: command execution failed: [kubectl --namespace kube-system apply -f restore/manifests.yaml]
2019/08/01 15:28:47 configmap/XXX.v1 created
configmap/XXX.v10 created
... many many lines with configmap created
configmap/XXX.v99 created
Error from server (Conflict): Operation cannot be fulfilled on configmaps "XXX.v1": the object has been modified; please apply your changes to the latest version and try again
... many conflict errors.

Although all releases were created and helm ls confirmed it, no pods started.

maorfr commented 5 years ago

Hey, why is step 2 needed?

EvgeniGordeev commented 5 years ago

Our helm releases depend on secret objects, so I copy them over first to make sure the services start correctly. I can remove this step to avoid confusion since it's not related to this project.

maorfr commented 5 years ago

Are there any configmaps in kube-system that this may have conflicted with?

EvgeniGordeev commented 5 years ago

It's a fresh EKS cluster with 2 releases in it, but they are in different namespaces (NB: I'm trying to restore the dev namespace):

$ helm ls
NAME                    REVISION    UPDATED                     STATUS      CHART                       APP VERSION NAMESPACE  
kubernetes-dashboard    1           Thu Aug  1 22:46:21 2019    DEPLOYED    kubernetes-dashboard-1.2.0  1.10.1      kube-system
kubeservis-core         1           Thu Aug  1 22:46:18 2019    DEPLOYED    kubeservis-core-0.1.0                   kubeservis 

all configmaps are outside the dev namespace:

$ kubectl get configmaps -A
NAMESPACE     NAME                                       DATA   AGE
kube-system   aws-auth                                   2      23h
kube-system   coredns                                    1      24h
kube-system   extension-apiserver-authentication         6      24h
kube-system   kube-proxy                                 1      24h
kube-system   kube-proxy-config                          1      24h
kube-system   kubernetes-dashboard.v1                    1      14h
kube-system   kubeservis-core.v1                         1      14h
kubeservis    cluster-autoscaler-status                  1      14h
kubeservis    ingress-controller-leader-kubeservis       0      23h
kubeservis    kubeservis-core-nginx-ingress-controller   4      14h
kubeservis    kubeservis-core-prometheus-adapter         1      14h
kubeservis    kubeservis-core-prometheus-server          3      14h

EvgeniGordeev commented 5 years ago

Based on the output from helm backup --restore --file dev.tgz dev, it was complaining about configmaps specifically in the dev ns.

BTW: is there a way to enable progress logging? When I ran the restore command, the same message hung there for 30+ minutes before anything came to stdout:

2019/08/01 14:56:03 applying backup data to tiller (this command will fail if releases exist)
2019/08/01 15:28:47 Error: command execution failed: [kubectl --namespace kube-system apply -f restore/manifests.yaml]

maorfr commented 5 years ago

progress logging sounds cool! is this something you want to try to tackle?

back to the problem-

helm backup --file dev.tgz dev

this will back up the ConfigMaps in kube-system that represent releases in the dev namespace. when restoring, it is expected that these ConfigMaps will be created in kube-system and that the releases will show up in dev.

so, again, i would make sure that there are no ConfigMaps in kube-system that this may conflict with.
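
for example, something like this should show whether release ConfigMaps already exist there (assuming the usual Tiller labels, OWNER=TILLER and NAME=<release name>):

$ kubectl --namespace kube-system get configmaps -l OWNER=TILLER
$ kubectl --namespace kube-system get configmaps -l NAME=XXX   # XXX = one of the release names from the error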

another thing that may be problematic: helm-backup does not clean the ConfigMaps it gets as a backup (they remain with all the data from the old cluster).

can you try to do some "cleanup" between the backup and the restore and see if that solves the problem?
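
a minimal sketch of such a cleanup, assuming the archive unpacks to a manifests.yaml (as the failing kubectl command suggests) and that the conflicts come from stale resourceVersion fields carried over from the old cluster:

$ mkdir work && tar -xzf dev.tgz -C work
# drop cluster-specific metadata copied from the old cluster
$ sed -i '/resourceVersion:/d; /creationTimestamp:/d; /uid:/d' work/manifests.yaml
$ tar -czf dev.tgz -C work .

if the earlier failed restore already created some of the XXX.vN ConfigMaps in kube-system of the new cluster, those probably need to be deleted there as well before retrying.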