Open · fsz285 opened 3 years ago
I got the same problem.
I have a similar problem: https://github.com/vmware-tanzu/velero/issues/4124
same issue
What steps did you take and what happened: I am trying to back up and restore a container that mounts an NFS PV. I am using the AWS plugin 1.1.0. I am also using restic, as I couldn't do the volume backups with just the provider. The backup goes through without exceptions, but the restore gets stuck when the NFS volume is restored.
These are the Kubernetes resources:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-pvc-to-distribute
  namespace: ubuntu
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ubuntu-nfs
spec:
  serviceName: ubuntu-nfs
  replicas: 1
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      role: ubuntu-nfs
  template:
    metadata:
      name: ubuntu-nfs
      labels:
        role: ubuntu-nfs
    spec:
      containers:
        - name: nfs
          image: 'gcr.io/google_containers/volume-nfs:0.8'
          lifecycle:
            postStart:
              exec:
                command: ["/bin/sh", "-c", "useradd -u 1000 u1000 && mkdir -p /exports/ubuntu && chown -R u1000:u1000 /exports"]
            preStop:
              exec:
                command:
                  - sh
                  - -c
                  - "sleep 15"
          imagePullPolicy: IfNotPresent
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /exports
              name: storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: ubuntu-pvc-to-distribute
---
apiVersion: v1
kind: Service
metadata:
  name: ubuntu-nfs
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    role: ubuntu-nfs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ubuntu-nfs
spec:
  capacity:
    storage: '10Gi'
  accessModes:
    - ReadWriteMany
  nfs:
    server: ubuntu-nfs.ubuntu.svc.cluster.local
    path: "/ubuntu/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-nfs-claim
  namespace: ubuntu
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: ubuntu-nfs
  resources:
    requests:
      storage: '10Gi'
---
kind: Pod
apiVersion: v1
metadata:
  name: ferdi-ubuntu
spec:
  containers:
    - name: ubuntu
      image: ubuntu
      command: ["/bin/bash", "-ec", "while :; do echo '.'; sleep 5 ; done"]
      volumeMounts:
        - mountPath: /data
          name: ubuntu-data
  volumes:
    - name: ubuntu-data
      persistentVolumeClaim:
        claimName: ubuntu-nfs-claim
```
What did you expect to happen:
The output of the following commands will help us better understand what's going on: (Pasting long output into a GitHub gist or other pastebin is fine.)
Here's an excerpt from the logs that might be related to the issue.
time="2021-02-11T12:04:23Z" level=debug msg="Looking for path matching glob" backup=velero/test12 controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:218" name=test12-2rkb6 namespace=velero pathGlob="/host_pods/463252ed-4a26-4053-becd-3ee157fc2103/volumes/*/ubuntu-nfs" time="2021-02-11T12:04:23Z" level=debug msg="Found path matching glob" backup=velero/test12 controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:225" name=test12-2rkb6 namespace=velero path="/host_pods/463252ed-4a26-4053-becd-3ee157fc2103/volumes/kubernetes.io~nfs/ubuntu-nfs" time="2021-02-11T12:26:34Z" level=debug msg="Restore's pod ubuntu/ubuntu-nfs-0 not found, not enqueueing." controller=pod-volume-restore error="pod \"ubuntu-nfs-0\" not found" logSource="pkg/controller/pod_volume_restore_controller.go:140" name=test12-20210211132626-pznzj namespace=velero restore=velero/test12-20210211132626
```
# velero restore describe test12-20210211132626 --details
Name:         test12-20210211132626
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  InProgress

Started:    2021-02-11 13:26:31 +0100 CET
Completed:  <n/a>

Backup:  test12

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto

Restic Restores:
  Completed:
    ubuntu/ubuntu-nfs-0: storage
  New:
    ubuntu/ferdi-ubuntu: ubuntu-data
```
```
# velero backup describe test12 --details
Name:         test12
Namespace:    velero
Labels:       app.kubernetes.io/instance=velero
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=velero
              helm.sh/chart=velero-2.14.8
              velero.io/schedule-name=velero-daily
              velero.io/storage-location=aws
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.17.12-1+36738515228c42
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=17+

Phase:  Completed

Errors:    0
Warnings:  0

Namespaces:
  Included:  ubuntu
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  aws

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2021-02-11 13:04:19 +0100 CET
Completed:  2021-02-11 13:04:31 +0100 CET

Expiration:  2021-03-13 13:04:19 +0100 CET

Total items to be backed up:  33
Items backed up:              33

Resource List:
  apps/v1/ControllerRevision:
    - ubuntu/ubuntu-nfs-57784f9f9d
  apps/v1/StatefulSet:
    - ubuntu/ubuntu-nfs
  v1/Endpoints:
    - ubuntu/ubuntu-nfs
  v1/Event:
    - ubuntu/ferdi-ubuntu.1662af50542825f8
    - ubuntu/ferdi-ubuntu.1662af764d43ac92
    - ubuntu/ferdi-ubuntu.1662af764ef4fc54
    - ubuntu/ferdi-ubuntu.1662af9589fa411b
    - ubuntu/ferdi-ubuntu.1662af958c885694
    - ubuntu/ferdi-ubuntu.1662af97a456cbc7
    - ubuntu/ferdi-ubuntu.1662af9847d967a8
    - ubuntu/ferdi-ubuntu.1662af98b17fc4c3
    - ubuntu/ferdi-ubuntu.1662af98bc41403e
    - ubuntu/ferdi-ubuntu.1662af98cb2faa05
    - ubuntu/ubuntu-nfs-0.1662af448b709d44
    - ubuntu/ubuntu-nfs-0.1662af48571b6e63
    - ubuntu/ubuntu-nfs-0.1662af4d7f7578b0
    - ubuntu/ubuntu-nfs-0.1662af4d8768186b
    - ubuntu/ubuntu-nfs-0.1662af4d947d2f41
    - ubuntu/ubuntu-nfs.1662af3fdabbc34c
    - ubuntu/ubuntu-pvc-to-distribute.1662af3fcb4a454c
    - ubuntu/ubuntu-pvc-to-distribute.1662af3fe3305295
    - ubuntu/ubuntu-pvc-to-distribute.1662af3feb1323b3
    - ubuntu/ubuntu-pvc-to-distribute.1662af4471a336ff
  v1/Namespace:
    - ubuntu
  v1/PersistentVolume:
    - pvc-de0f0f15-6818-481b-b317-faa4d9fb8a3d
    - ubuntu-nfs
  v1/PersistentVolumeClaim:
    - ubuntu/ubuntu-nfs-claim
    - ubuntu/ubuntu-pvc-to-distribute
  v1/Pod:
    - ubuntu/ferdi-ubuntu
    - ubuntu/ubuntu-nfs-0
  v1/Secret:
    - ubuntu/default-token-9kh2k
  v1/Service:
    - ubuntu/ubuntu-nfs
  v1/ServiceAccount:
    - ubuntu/default

Velero-Native Snapshots: <none included>

Restic Backups:
  Completed:
    ubuntu/ferdi-ubuntu: ubuntu-data
    ubuntu/ubuntu-nfs-0: storage
```
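For reference, the backup and restore were driven roughly as sketched below. The backup actually came from the velero-daily schedule, so the exact commands, and whether the pod volumes were opted in per pod (as shown here) or via --default-volumes-to-restic, are assumptions:

```sh
# Assumed opt-in restic setup: annotate each pod with the volumes restic should back up
kubectl -n ubuntu annotate pod/ubuntu-nfs-0 backup.velero.io/backup-volumes=storage
kubectl -n ubuntu annotate pod/ferdi-ubuntu backup.velero.io/backup-volumes=ubuntu-data

# Back up the namespace (in practice produced by the velero-daily schedule)
velero backup create test12 --include-namespaces ubuntu

# Restore it; this is the step that gets stuck on the NFS volume
velero restore create --from-backup test12
```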
Environment:
- Velero version (use `velero version`):
  Client: Version: v1.5.2, Git commit: e115e5a191b1fdb5d379b62a35916115e77124a4
  Server: Version: v1.5.3
  velero-plugin-for-aws:v1.1.0
  The volume backups are made with restic.
- Velero features (use `velero client config get features`): not set
- Kubernetes version (use `kubectl version`):
  Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
  Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.12-1+36738515228c42", GitCommit:"36738515228c4274bf0cc42b0b15d2bfedabd85f", GitTreeState:"clean", BuildDate:"2020-09-17T09:01:41Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: IONOS
- OS (e.g. from `/etc/os-release`): Windows 10 running Ubuntu 18.04 via WSL1

Vote on this issue!
This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here. Use the "reaction smiley face" at the upper right of this comment to vote.
- š for "I would like to see this bug fixed as soon as possible"
- š for "There are more important bugs to focus on right now"
Hi, can you tell me how to solve the issue?
This issue seems to be difficult to solve.
Is the ubuntu-nfs-0 pod running? From the log, it seems like Velero isn't finding that pod running.
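For example, one way to check the pod and the state of the individual restic restores (commands assume the default velero namespace; the restore-name label selector is an assumption about how the PodVolumeRestore CRs are labelled):

```sh
# Does the restored NFS server pod exist, and is it Running?
kubectl -n ubuntu get pod ubuntu-nfs-0 -o wide

# State of the per-volume restic restores for this restore
kubectl -n velero get podvolumerestores -l velero.io/restore-name=test12-20210211132626
```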
In the meantime, we solved the problem (worked around it, more like). I don't recall the state of the ubuntu pod, but if I remember correctly, the problem was the order in which restic tried to restore the volumes: the pod providing the NFS volume hadn't been restored yet, but Velero already tried to restore the pods that consume it, and that caused the whole restore to get stuck. The workaround was to explicitly exclude the mounted NFS volumes from the backup.
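For anyone hitting the same thing, a minimal sketch of that exclusion, assuming the opt-out mode (--default-volumes-to-restic) where the backup.velero.io/backup-volumes-excludes annotation applies; volume names are taken from the manifests above:

```yaml
# Consumer pod: tell restic to skip the mounted NFS volume, so only the backing
# PVC (the "storage" volume on the NFS server pod) gets backed up.
apiVersion: v1
kind: Pod
metadata:
  name: ferdi-ubuntu
  annotations:
    backup.velero.io/backup-volumes-excludes: ubuntu-data
# ... rest of the pod spec unchanged
```

With the opt-in approach the effect is the same as simply not listing ubuntu-data under backup.velero.io/backup-volumes on the consumer pod.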