openebs / velero-plugin

Velero plugin for backup/restore of OpenEBS cStor volumes
https://docs.openebs.io
Apache License 2.0
63 stars 32 forks source link

velero-plgun 3.5.0 failed to restore from minio: PVC{nginx-example/nginx-logs} is not bounded! #186

Open lizhifengones opened 1 year ago

lizhifengones commented 1 year ago

What steps did you take and what happened: [A clear and concise description of what the bug is, and what commands you ran.]

STATUS=PartiallyFailed

What did you expect to happen: STATUS=Completed

The output of the following commands will help us better understand what's going on: (Pasting long output into a GitHub gist or other Pastebin is fine.)

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]


---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    component: velero
  name: velero
  namespace: velero
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    component: velero
  name: velero
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: velero
    namespace: velero
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    component: velero
  name: velero
  namespace: velero
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: velero
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8085"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        component: velero
        deploy: velero
    spec:
      containers:
      - args:
        - server
        command:
        - /velero
        env:
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /credentials/cloud
        - name: VELERO_SCRATCH_DIR
          value: /scratch
        - name: VELERO_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: LD_LIBRARY_PATH
          value: /plugins
        image: velero/velero:v1.11.1
        imagePullPolicy: Always
        name: velero
        ports:
        - containerPort: 8085
          name: metrics
          protocol: TCP
        resources:
          limits:
            cpu: "1"
            memory: 256Mi
          requests:
            cpu: 500m
            memory: 128Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugins
          name: plugins
        - mountPath: /credentials
          name: cloud-credential
        - mountPath: /scratch
          name: scratch
      dnsPolicy: ClusterFirst
      initContainers:
      - image: velero/velero-plugin-for-aws:v1.7.1
        imagePullPolicy: Always
        name: velero-plugin-for-aws
        resources: {}
        volumeMounts:
        - mountPath: /target
          name: plugins
      - image: openebs/velero-plugin:3.5.0
        imagePullPolicy: IfNotPresent
        name: openebs-velero-plugin
        resources: {}
        volumeMounts:
        - mountPath: /target
          name: plugins
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      volumes:
      - emptyDir: {}
        name: plugins
      - name: cloud-credential
        secret:
          defaultMode: 420
          secretName: cloud-credential
      - emptyDir: {}
        name: scratch

disaster simulation

kubectl delete ns nginx-example

check backups

kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1)  -it -- /velero backup get

output

NAME                       STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION      SELECTOR
inc-nginx-backup-with-pv   Completed         0        0          2023-09-21 08:42:03 +0000 UTC   29d       default-local-minio   <none>
nginx-backup-with-pv       Completed         0        1          2023-09-21 08:30:45 +0000 UTC   29d       default-local-minio   <none>

restore

kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1)  -it -- /velero restore create --from-backup inc-nginx-backup-with-pv

check result

kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1)  -it -- /velero restore get  inc-nginx-backup-with-pv-20230921085147

output

NAME                                      BACKUP                     STATUS            STARTED                         COMPLETED                       ERRORS   WARNINGS   CREATED                         SELECTOR
inc-nginx-backup-with-pv-20230921085147   inc-nginx-backup-with-pv   PartiallyFailed   2023-09-21 08:51:47 +0000 UTC   2023-09-21 09:00:08 +0000 UTC   1        2          2023-09-21 08:51:47 +0000 UTC   <none>
kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1)  -it -- /velero restore logs inc-nginx-backup-with-pv-20230921085147

output

time="2023-09-21T08:51:47Z" level=info msg="starting restore" logSource="pkg/controller/restore_controller.go:458" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Starting restore of backup velero/inc-nginx-backup-with-pv" logSource="pkg/restore/restore.go:396" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'persistentvolumes' will be restored at cluster scope" logSource="pkg/restore/restore.go:2030" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'persistentvolumeclaims' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'serviceaccounts' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'configmaps' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'pods' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'replicasets.apps' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:1941" resource=clusterclasses.cluster.x-k8s.io restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'endpoints' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'services' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'deployments.apps' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Resource 'endpointslices.discovery.k8s.io' will be restored into namespace 'nginx-example'" logSource="pkg/restore/restore.go:2028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Skipping restore of resource because the restore spec excludes it" logSource="pkg/restore/restore.go:1958" resource=events restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:1941" resource=clusterbootstraps.run.tanzu.vmware.com restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:1941" resource=clusters.cluster.x-k8s.io restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:1941" resource=clusterresourcesets.addons.cluster.x-k8s.io restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Getting client for /v1, Kind=PersistentVolume" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Restoring persistent volume from snapshot." logSource="pkg/restore/restore.go:1104" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Initializing velero plugin for CStor" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/snapshot/snap.go:36" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Ip address of velero-plugin server: 10.42.0.51" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/cstor.go:208" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Setting restApiTimeout to 1m0s" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/cstor.go:286" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Restoring remote snapshot{inc-nginx-backup-with-pv} for volume:pvc-a117021e-6232-4e85-8e4f-133114466a24" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/cstor.go:538" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Reading from {ones-backup/backups/inc-nginx-backup-with-pv/ones-pvc-a117021e-6232-4e85-8e4f-133114466a24-inc-nginx-backup-with-pv.pvc} with provider{aws} to bucket{velero}" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/clouduploader/operation.go:138" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="successfully read object{ones-backup/backups/inc-nginx-backup-with-pv/ones-pvc-a117021e-6232-4e85-8e4f-133114466a24-inc-nginx-backup-with-pv.pvc} to {aws}" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/clouduploader/operation.go:146" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Creating namespace=nginx-example" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/pvc_operation.go:338" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T08:51:47Z" level=info msg="Creating PVC for volumeID:pvc-a117021e-6232-4e85-8e4f-133114466a24 snapshot:inc-nginx-backup-with-pv in namespace=nginx-example" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/pvc_operation.go:131" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=error msg="CreatePVC returned error=PVC{nginx-example/nginx-logs} is not bounded!" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/pv_operation.go:205" pluginName=velero-blockstore-openebs restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 1 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=pvc-a117021e-6232-4e85-8e4f-133114466a24 namespace= progress= resource=persistentvolumes restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=PersistentVolumeClaim" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for persistentvolumeclaims" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing AddPVFromPVCAction" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:44" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Adding PV pvc-a117021e-6232-4e85-8e4f-133114466a24 as an additional item to restore" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:66" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Skipping persistentvolumes/pvc-a117021e-6232-4e85-8e4f-133114466a24 because it's already been restored." logSource="pkg/restore/restore.go:1028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for persistentvolumeclaims" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ChangePVCNodeSelectorAction" cmd=/velero logSource="pkg/restore/change_pvc_node_selector.go:66" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ChangePVCNodeSelectorAction" cmd=/velero logSource="pkg/restore/change_pvc_node_selector.go:138" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for persistentvolumeclaims" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ChangeStorageClassAction" cmd=/velero logSource="pkg/restore/change_storageclass_action.go:68" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ChangeStorageClassAction" cmd=/velero logSource="pkg/restore/change_storageclass_action.go:79" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore PersistentVolumeClaim: nginx-logs" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 2 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=nginx-logs namespace=nginx-example progress= resource=persistentvolumeclaims restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=ServiceAccount" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for serviceaccounts" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ServiceAccountAction" cmd=/velero logSource="pkg/restore/service_account_action.go:47" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ServiceAccountAction" cmd=/velero logSource="pkg/restore/service_account_action.go:78" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore ServiceAccount: default" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 3 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=default namespace=nginx-example progress= resource=serviceaccounts restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=ConfigMap" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore ConfigMap: kube-root-ca.crt" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 4 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=kube-root-ca.crt namespace=nginx-example progress= resource=configmaps restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=Pod" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for pods" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing AddPVCFromPodAction" cmd=/velero logSource="pkg/restore/add_pvc_from_pod_action.go:44" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Adding PVC nginx-example/nginx-logs as an additional item to restore" cmd=/velero logSource="pkg/restore/add_pvc_from_pod_action.go:58" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Skipping persistentvolumeclaims/nginx-example/nginx-logs because it's already been restored." logSource="pkg/restore/restore.go:1028" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for pods" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:68" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:81" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for pods" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing InitRestoreHookPodAction" cmd=/velero logSource="pkg/restore/init_restorehook_pod_action.go:49" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Pod nginx-example/nginx-deployment-79bcd4b657-wq6t7 has no init.hook.restore.velero.io/container-image annotation, no initRestoreHook in annotation" cmd=/velero logSource="internal/hook/item_hook_handler.go:387" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Handling InitRestoreHooks from RestoreSpec" cmd=/velero logSource="internal/hook/item_hook_handler.go:139" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Returning pod nginx-example/nginx-deployment-79bcd4b657-wq6t7 with 0 init container(s)" cmd=/velero logSource="internal/hook/item_hook_handler.go:180" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Returning from InitRestoreHookPodAction" cmd=/velero logSource="pkg/restore/init_restorehook_pod_action.go:61" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for pods" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for pods" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing PodVolumeRestoreAction" cmd=/velero logSource="pkg/restore/pod_volume_restore_action.go:71" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing PodVolumeRestoreAction" cmd=/velero logSource="pkg/restore/pod_volume_restore_action.go:103" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore Pod: nginx-deployment-79bcd4b657-wq6t7" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/nginx-deployment-79bcd4b657-wq6t7 is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 5 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=nginx-deployment-79bcd4b657-wq6t7 namespace=nginx-example progress= resource=pods restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for apps/v1, Kind=ReplicaSet" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for replicasets.apps" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:68" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:81" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore ReplicaSet: nginx-deployment-79bcd4b657" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/nginx-deployment-79bcd4b657 is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 6 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=nginx-deployment-79bcd4b657 namespace=nginx-example progress= resource=replicasets.apps restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=Endpoints" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore Endpoints: my-nginx" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/my-nginx is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 7 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=my-nginx namespace=nginx-example progress= resource=endpoints restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for /v1, Kind=Service" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for services" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore Service: my-nginx" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/my-nginx is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 8 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=my-nginx namespace=nginx-example progress= resource=services restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for apps/v1, Kind=Deployment" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing item action for deployments.apps" logSource="pkg/restore/restore.go:1196" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:68" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done executing ChangeImageNameAction" cmd=/velero logSource="pkg/restore/change_image_name_action.go:81" pluginName=velero restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore Deployment: nginx-deployment" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/nginx-deployment is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 9 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=nginx-deployment namespace=nginx-example progress= resource=deployments.apps restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Getting client for discovery.k8s.io/v1, Kind=EndpointSlice" logSource="pkg/restore/restore.go:918" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Attempting to restore EndpointSlice: my-nginx-6tswg" logSource="pkg/restore/restore.go:1337" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="the managed fields for nginx-example/my-nginx-6tswg is patched" logSource="pkg/restore/restore.go:1522" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Restored 10 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=my-nginx-6tswg namespace=nginx-example progress= resource=endpointslices.discovery.k8s.io restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Waiting for all pod volume restores to complete" logSource="pkg/restore/restore.go:551" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done waiting for all pod volume restores to complete" logSource="pkg/restore/restore.go:567" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Waiting for all post-restore-exec hooks to complete" logSource="pkg/restore/restore.go:571" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="Done waiting for all post-restore exec hooks to complete" logSource="pkg/restore/restore.go:579" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=error msg="Cluster resource restore error: error executing PVAction for persistentvolumes/pvc-a117021e-6232-4e85-8e4f-133114466a24: rpc error: code = Unknown desc = Failed to read PVC for volumeID=pvc-a117021e-6232-4e85-8e4f-133114466a24 snap=inc-nginx-backup-with-pv: PVC{nginx-example/nginx-logs} is not bounded!" logSource="pkg/controller/restore_controller.go:494" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=warning msg="Namespace nginx-example, resource restore warning: could not restore, PersistentVolumeClaim \"nginx-logs\" already exists. Warning: the in-cluster version is different than the backed-up version." logSource="pkg/controller/restore_controller.go:509" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=warning msg="Namespace nginx-example, resource restore warning: could not restore, ConfigMap \"kube-root-ca.crt\" already exists. Warning: the in-cluster version is different than the backed-up version." logSource="pkg/controller/restore_controller.go:509" restore=velero/inc-nginx-backup-with-pv-20230921085147
time="2023-09-21T09:00:08Z" level=info msg="restore completed" logSource="pkg/controller/restore_controller.go:512" restore=velero/inc-nginx-backup-with-pv-20230921085147
# kubectl describe po -n nginx-example nginx-deployment-79bcd4b657-wq6t7
  Warning  FailedMount  49s                 kubelet, ubuntu.local  Unable to attach or mount volumes: unmounted volumes=[nginx-logs], unattached volumes=[kube-api-access-q8wjp nginx-logs]: timed out waiting for the condition
# kubectl get pvc -n nginx-example
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
nginx-logs   Bound    pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74   50Mi       RWO            cstor-csi-disk   33m
#  kubectl get pv pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS     REASON   AGE
pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74   50Mi       RWO            Delete           Bound    nginx-example/nginx-logs   cstor-csi-disk            26m
# kubectl get pv pvc-a117021e-6232-4e85-8e4f-133114466a24
Error from server (NotFound): persistentvolumes "pvc-a117021e-6232-4e85-8e4f-133114466a24" not found

The pv(pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74) is dynamically created by the volume provider, and its name is different from what velero requires (pvc-a117021e-6232-4e85-8e4f-133114466a24), causing the recovery to fail? But it looks like all the relationships are correct, why does the pod still fail to start due to mounted volumes?

# kubectl get VolumeSnapshotLocation -n velero
NAME    AGE
minio   179m
# kubectl get VolumeSnapshotLocation -n velero minio -o yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: minio
  namespace: velero
spec:
  config:
    autoSetTargetIP: "true"
    backupPathPrefix: ones-backup
    bucket: velero
    multiPartChunkSize: 64Mi
    namespace: openebs
    prefix: ones
    provider: aws
    region: minio
    restApiTimeout: 1m
    restoreAllIncrementalSnapshots: "true"
    s3ForcePathStyle: "true"
    s3Url: http://minio.velero.svc:9000
  credential:
    key: cloud
    name: cloud-credential
  provider: openebs.io/cstor-blockstore

autoSetTargetIP: "true"

Environment:

lizhifengones commented 1 year ago

How to continue troubleshooting?

lizhifengones commented 1 year ago

After configuring according to https://github.com/openebs/velero-plugin#setting-targetip-in-replica the service is running

This should be a bug? autoSetTargetIP: "true" does not take effect

I have set up all 3 pools

# kubectl get svc -n openebs pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 -ojsonpath='{.spec.clusterIP}'
10.43.48.33
# kubectl exec cstor-disk-pool-4zzx-64db8bcdc4-7b9l2 -n openebs -c cstor-pool -it bash
root@cstor-disk-pool-4zzx-64db8bcdc4-7b9l2:/# zfs get io.openebs:targetip
NAME PROPERTY VALUE SOURCE
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11 io.openebs:targetip - -
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-396b5958-1f96-4d73-a44f-aebb86f1fcf1 io.openebs:targetip 10.43.135.194 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-601cf7af-8945-4a4a-b68a-afcedfff1011 io.openebs:targetip 10.43.168.62 local
io.openebs:target ip - -
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 io.openebs:targetip default
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-e24f6f79-91be-45a0-826d-c737616ec4f2 io.openebs:targetip 10.43.134.198 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-f67805ed-313a-4bbe-91d4-b95399d514c6 io.openebs:targetip 10.43.148.10 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-f67805ed-313a-4bbe-91d4-b95399d514c6@snapshot-8729a537-0e4f-40b0-96c0-5320f77a668b io.open ebs:targetip - -
root@cstor-disk-pool-4zzx-64db8bcdc4-7b9l2:/# zfs set io.openebs:targetip=10.43.48.33 cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-b0e72129-dcc5-48ac-b89 1-b052bd38ad74
root@cstor-disk-pool-4zzx-64db8bcdc4-7b9l2:/# zfs get io.openebs:targetip
NAME PROPERTY VALUE SOURCE
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11 io.openebs:targetip - -
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-396b5958-1f96-4d73-a44f-aebb86f1fcf1 io.openebs:targetip 10.43.135.194 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-601cf7af-8945-4a4a-b68a-afcedfff1011 io.openebs:targetip 10.43.168.62 local
io.openebs:target ip - -
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 io.openebs:targetip 10.43.48.33 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74@rebuild_snap io.openebs:targetip - -
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74_rebuild_clone io.openebs:targetip default
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-e24f6f79-91be-45a0-826d-c737616ec4f2 io.openebs:targetip 10.43.134.198 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-f67805ed-313a-4bbe-91d4-b95399d514c6 io.openebs:targetip 10.43.148.10 local
cstor-88ce2e6e-326a-460f-bb9e-89d5d2fd0b11/pvc-f67805ed-313a-4bbe-91d4-b95399d514c6@snapshot-8729a537-0e4f-40b0-96c0-5320f77a668b io.open ebs:targetip-

but the data is lost (I made sure the data was created...)

# kubectl get po -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-79bcd4b657-lgrzp   2/2     Running   0          42m
# kubectl exec -n nginx-example $(kubectl get po -n nginx-example -l app=nginx -o name |head -n 1) -it -- cat /var/log/nginx/access.log
Defaulting container name to nginx.
Use 'kubectl describe pod/nginx-deployment-79bcd4b657-lgrzp -n nginx-example' to see all of the containers in this pod.
# kubectl exec -n nginx-example $(kubectl get po -n nginx-example -l app=nginx -o name |head -n 1) -it -- cat /var/log/nginx/error.log
Defaulting container name to nginx.
Use 'kubectl describe pod/nginx-deployment-79bcd4b657-lgrzp -n nginx-example' to see all of the containers in this pod.
lizhifengones commented 1 year ago

I have tried many methods but cannot recover the data of the cstor volume.

scene 1) restore after remove data

"Restored 1 items out of an estimated total of 10 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:669" name=pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 namespace= progress= resource=persistentvolumes restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="Getting client for /v1, Kind=PersistentVolumeClaim" logSource="pkg/restore/restore.go:918" restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="restore status includes excludes: <nil>" logSource="pkg/restore/restore.go:1189" restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="Executing item action for persistentvolumeclaims" logSource="pkg/restore/restore.go:1196" restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="Executing AddPVFromPVCAction" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:44" pluginName=velero restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="Adding PV pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 as an additional item to restore" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:66" pluginName=velero restore=velero/defaultbackup3-20230921110342
time="2023-09-21T11:03:43Z" level=info msg="Skipping persistentvolumes/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 because it's already been restored." logSource="pkg/restore/restore.go:1028" restore=velero/defaultbackup3-20230921110342

scene 2) restore after delete ns

time="2023-09-21T11:20:13Z" level=error msg="Cluster resource restore error: error executing PVAction for persistentvolumes/pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74: rpc error: code = Unknown desc = Failed to read PVC for volumeID=pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 snap=defaultbackup3: PVC{nginx-example/nginx-logs} is not bounded!" logSource="pkg/controller/restore_controller.go:494" restore=velero/defaultbackup3-20230921111152

I'll try to check the code to see if it's a problem. If possible, could you please tell me if it's my configuration problem?

lizhifengones commented 1 year ago

There is a very conspicuous error in the log:

time="2023-09-21T11:11:52Z" level=info msg="Creating PVC for volumeID:pvc-b0e72129-dcc5-48ac-b891-b052bd38ad74 snapshot:defaultbackup3 in namespace=nginx-example" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/pvc_operation.go:131" pluginName=velero-blockstore-openebs restore=velero/defaultbackup3-20230921111152
time="2023-09-21T11:20:13Z" level=error msg="CreatePVC returned error=PVC{nginx-example/nginx-logs} is not bounded!" cmd=/plugins/velero-blockstore-openebs logSource="/go/src/github.com/openebs/velero-plugin/pkg/cstor/pv_operation.go:205" pluginName=velero-blockstore-openebs restore=velero/defaultbackup3-20230921111152
t

When deleting ns or restoring the cluster from scratch, https://github.com/openebs/velero-plugin/blob/cea57783e3ed887d2b7b0e7bafc436ff26bd9a7b/pkg/cstor/pvc_operation.go#L171 since resource recovery is serial, restore pvc first. Since no instance triggers the creation of pv, there will always be a 500s timeout here.