vmware-tanzu / velero-plugin-for-vsphere

Plugin to support Velero on vSphere

Unable to backup PVs with plugin #290

Closed Elegant996 closed 3 years ago

Elegant996 commented 3 years ago

Currently using Velero 1.5.2 with AWS plugin 1.1.0 and vSphere plugin 1.1.0. The backup starts off okay and can be seen in MinIO, but when it reaches a PV the error below is encountered:

time="2021-01-15T21:34:15Z" level=info msg="Processing item" backup=velero/default-snap-backup2 logSource="pkg/backup/backup.go:378" name=harbor-postgresql-0 namespace=harbor-system progress= resource=pods
time="2021-01-15T21:34:15Z" level=info msg="Backing up item" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:121" name=harbor-postgresql-0 namespace=harbor-system resource=pods
time="2021-01-15T21:34:15Z" level=info msg="Executing custom action" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:327" name=harbor-postgresql-0 namespace=harbor-system resource=pods
time="2021-01-15T21:34:15Z" level=info msg="Executing podAction" backup=velero/default-snap-backup2 cmd=/velero logSource="pkg/backup/pod_action.go:51" pluginName=velero
time="2021-01-15T21:34:15Z" level=info msg="Adding pvc data-harbor-postgresql-0 to additionalItems" backup=velero/default-snap-backup2 cmd=/velero logSource="pkg/backup/pod_action.go:67" pluginName=velero
time="2021-01-15T21:34:15Z" level=info msg="Done executing podAction" backup=velero/default-snap-backup2 cmd=/velero logSource="pkg/backup/pod_action.go:77" pluginName=velero
time="2021-01-15T21:34:15Z" level=info msg="Backing up item" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:121" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T21:34:15Z" level=info msg="Executing custom action" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:327" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T21:34:15Z" level=info msg="Executing PVCAction" backup=velero/default-snap-backup2 cmd=/velero logSource="pkg/backup/backup_pv_action.go:49" pluginName=velero
time="2021-01-15T21:34:15Z" level=info msg="Backing up item" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:121" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= resource=persistentvolumes
time="2021-01-15T21:34:15Z" level=info msg="Executing takePVSnapshot" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:405" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= resource=persistentvolumes
time="2021-01-15T21:34:15Z" level=info msg="label \"topology.kubernetes.io/zone\" is not present on PersistentVolume, checking deprecated label..." backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:432" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T21:34:15Z" level=info msg="label \"failure-domain.beta.kubernetes.io/zone\" is not present on PersistentVolume" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:435" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T21:34:15Z" level=info msg="GetVolumeID called with unstructuredPV &{map[apiVersion:v1 kind:PersistentVolume metadata:map[annotations:map[pv.kubernetes.io/provisioned-by:csi.vsphere.vmware.com] creationTimestamp:2020-09-04T20:02:22Z finalizers:[kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com] name:pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resourceVersion:45231468 uid:aa3988a6-8594-4570-9fce-ae9fda460c0f] spec:map[accessModes:[ReadWriteOnce] capacity:map[storage:8Gi] claimRef:map[apiVersion:v1 kind:PersistentVolumeClaim name:data-harbor-postgresql-0 namespace:harbor-system resourceVersion:45231314 uid:b5e5c618-0a01-4f03-bfcc-2dacb9d0860b] csi:map[driver:csi.vsphere.vmware.com fsType:ext4 volumeAttributes:map[storage.kubernetes.io/csiProvisionerIdentity:1597498711248-8081-csi.vsphere.vmware.com type:vSphere CNS Block Volume] volumeHandle:c30194fb-0edf-4307-8d0a-785dfe123609] persistentVolumeReclaimPolicy:Delete storageClassName:vsphere-dsc-nvme volumeMode:Filesystem] status:map[phase:Bound]]}" backup=velero/default-snap-backup2 cmd=/plugins/velero-plugin-for-vsphere logSource="/go/src/github.com/vmware-tanzu/velero-plugin-for-vsphere/pkg/plugin/volume_snapshotter_plugin.go:161" pluginName=velero-plugin-for-vsphere
time="2021-01-15T21:34:15Z" level=warning msg="Explicitly setting empty volume-id to prevent snapshot operations." backup=velero/default-snap-backup2 cmd=/plugins/velero-plugin-for-vsphere logSource="/go/src/github.com/vmware-tanzu/velero-plugin-for-vsphere/pkg/plugin/volume_snapshotter_plugin.go:162" pluginName=velero-plugin-for-vsphere
time="2021-01-15T21:34:15Z" level=info msg="No volume ID returned by volume snapshotter for persistent volume" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:458" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes volumeSnapshotLocation=vsl-vsphere
time="2021-01-15T21:34:15Z" level=info msg="Persistent volume is not a supported volume type for snapshots, skipping." backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:469" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T21:34:15Z" level=info msg="Executing custom action" backup=velero/default-snap-backup2 logSource="pkg/backup/item_backupper.go:327" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T21:34:15Z" level=info msg="1 errors encountered backup up item" backup=velero/default-snap-backup2 logSource="pkg/backup/backup.go:451" name=harbor-postgresql-0
time="2021-01-15T21:34:15Z" level=error msg="Error backing up item" backup=velero/default-snap-backup2 error="error executing custom action (groupResource=persistentvolumeclaims, namespace=harbor-system, name=data-harbor-postgresql-0): rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink  to CRD name" logSource="pkg/backup/backup.go:455" name=harbor-postgresql-0

To be clear, the plugin is able to load the CSI vSphere secret to authenticate with an admin account to vCenter but nothing ever happens.

Command executed is: velero backup create default-snap-backup2 --include-namespaces=harbor-system --snapshot-volumes --volume-snapshot-locations vsl-vsphere

Not sure what else to change at this point, MinIO is receiving data and the plugin can authenticate...

Thanks!

lintongj commented 3 years ago

Would you please try velero backup create default-snap-backup2 --include-namespaces=harbor-system --snapshot-volumes?

Reference: https://github.com/vmware-tanzu/velero-plugin-for-vsphere/blob/main/docs/vanilla.md#backup-vsphere-cns-block-volumes

Elegant996 commented 3 years ago

No dice, same error as before when executing velero backup create default-snap-backup3 --include-namespaces=harbor-system --snapshot-volumes:

time="2021-01-15T23:17:47Z" level=info msg="Processing item" backup=velero/default-snap-backup3 logSource="pkg/backup/backup.go:378" name=harbor-postgresql-0 namespace=harbor-system progress= resource=pods
time="2021-01-15T23:17:47Z" level=info msg="Backing up item" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:121" name=harbor-postgresql-0 namespace=harbor-system resource=pods
time="2021-01-15T23:17:47Z" level=info msg="Executing custom action" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:327" name=harbor-postgresql-0 namespace=harbor-system resource=pods
time="2021-01-15T23:17:47Z" level=info msg="Executing podAction" backup=velero/default-snap-backup3 cmd=/velero logSource="pkg/backup/pod_action.go:51" pluginName=velero
time="2021-01-15T23:17:47Z" level=info msg="Adding pvc data-harbor-postgresql-0 to additionalItems" backup=velero/default-snap-backup3 cmd=/velero logSource="pkg/backup/pod_action.go:67" pluginName=velero
time="2021-01-15T23:17:47Z" level=info msg="Done executing podAction" backup=velero/default-snap-backup3 cmd=/velero logSource="pkg/backup/pod_action.go:77" pluginName=velero
time="2021-01-15T23:17:47Z" level=info msg="Backing up item" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:121" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T23:17:47Z" level=info msg="Executing custom action" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:327" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T23:17:47Z" level=info msg="Executing PVCAction" backup=velero/default-snap-backup3 cmd=/velero logSource="pkg/backup/backup_pv_action.go:49" pluginName=velero
time="2021-01-15T23:17:47Z" level=info msg="Backing up item" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:121" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= resource=persistentvolumes
time="2021-01-15T23:17:47Z" level=info msg="Executing takePVSnapshot" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:405" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= resource=persistentvolumes
time="2021-01-15T23:17:47Z" level=info msg="label \"topology.kubernetes.io/zone\" is not present on PersistentVolume, checking deprecated label..." backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:432" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T23:17:47Z" level=info msg="label \"failure-domain.beta.kubernetes.io/zone\" is not present on PersistentVolume" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:435" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T23:17:47Z" level=info msg="GetVolumeID called with unstructuredPV &{map[apiVersion:v1 kind:PersistentVolume metadata:map[annotations:map[pv.kubernetes.io/provisioned-by:csi.vsphere.vmware.com] creationTimestamp:2020-09-04T20:02:22Z finalizers:[kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com] name:pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resourceVersion:45231468 uid:aa3988a6-8594-4570-9fce-ae9fda460c0f] spec:map[accessModes:[ReadWriteOnce] capacity:map[storage:8Gi] claimRef:map[apiVersion:v1 kind:PersistentVolumeClaim name:data-harbor-postgresql-0 namespace:harbor-system resourceVersion:45231314 uid:b5e5c618-0a01-4f03-bfcc-2dacb9d0860b] csi:map[driver:csi.vsphere.vmware.com fsType:ext4 volumeAttributes:map[storage.kubernetes.io/csiProvisionerIdentity:1597498711248-8081-csi.vsphere.vmware.com type:vSphere CNS Block Volume] volumeHandle:c30194fb-0edf-4307-8d0a-785dfe123609] persistentVolumeReclaimPolicy:Delete storageClassName:vsphere-dsc-nvme volumeMode:Filesystem] status:map[phase:Bound]]}" backup=velero/default-snap-backup3 cmd=/plugins/velero-plugin-for-vsphere logSource="/go/src/github.com/vmware-tanzu/velero-plugin-for-vsphere/pkg/plugin/volume_snapshotter_plugin.go:161" pluginName=velero-plugin-for-vsphere
time="2021-01-15T23:17:47Z" level=warning msg="Explicitly setting empty volume-id to prevent snapshot operations." backup=velero/default-snap-backup3 cmd=/plugins/velero-plugin-for-vsphere logSource="/go/src/github.com/vmware-tanzu/velero-plugin-for-vsphere/pkg/plugin/volume_snapshotter_plugin.go:162" pluginName=velero-plugin-for-vsphere
time="2021-01-15T23:17:47Z" level=info msg="No volume ID returned by volume snapshotter for persistent volume" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:458" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes volumeSnapshotLocation=vsl-vsphere
time="2021-01-15T23:17:47Z" level=info msg="Persistent volume is not a supported volume type for snapshots, skipping." backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:469" name=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b namespace= persistentVolume=pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b resource=persistentvolumes
time="2021-01-15T23:17:47Z" level=info msg="Executing custom action" backup=velero/default-snap-backup3 logSource="pkg/backup/item_backupper.go:327" name=data-harbor-postgresql-0 namespace=harbor-system resource=persistentvolumeclaims
time="2021-01-15T23:17:47Z" level=info msg="1 errors encountered backup up item" backup=velero/default-snap-backup3 logSource="pkg/backup/backup.go:451" name=harbor-postgresql-0
time="2021-01-15T23:17:47Z" level=error msg="Error backing up item" backup=velero/default-snap-backup3 error="error executing custom action (groupResource=persistentvolumeclaims, namespace=harbor-system, name=data-harbor-postgresql-0): rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink  to CRD name" logSource="pkg/backup/backup.go:455" name=harbor-postgresql-0

Please note that I am currently using Kubernetes 1.20.1. Not sure if that has any impact on the current version of Velero or the vSphere plugin.

Thanks!

deepakkinni commented 3 years ago

Delete the vsl-vsphere VolumeSnapshotLocation, then retry the command without specifying it.

Elegant996 commented 3 years ago

Please note that I did, this was the command executed: velero backup create default-snap-backup3 --include-namespaces=harbor-system --snapshot-volumes

deepakkinni commented 3 years ago

kubectl delete volumesnapshotlocation.velero.io -n <velero-namespace> vsl-vsphere

Did you delete the vsl-vsphere?

Elegant996 commented 3 years ago

Yes, we just uninstalled the vSphere plugin with its associated CRDs and deleted the entire velero namespace to be sure. The setup was redeployed using the below Helm chart values with no improvement (same error as above):

##
## Configuration settings that directly affect the Velero deployment YAML.
##

# Details of the container image to use in the Velero deployment & daemonset (if
# enabling restic). Required.
image:
  repository: velero/velero
  tag: v1.5.2
  # Digest value example: sha256:d238835e151cec91c6a811fe3a89a66d3231d9f64d09e5f3c49552672d271f38. If used, it will
  # take precedence over the image.tag.
  # digest:
  pullPolicy: IfNotPresent
  # One or more secrets to be used when pulling images
  imagePullSecrets: []
  # - registrySecretName

# Annotations to add to the Velero deployment's pod template. Optional.
#
# If using kube2iam or kiam, use the following annotation with your AWS_ACCOUNT_ID
# and VELERO_ROLE_NAME filled in:
podAnnotations: {}
  #  iam.amazonaws.com/role: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/<VELERO_ROLE_NAME>"

# Additional pod labels for Velero deployment's template. Optional
# ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
podLabels: {}

# Resource requests/limits to specify for the Velero deployment. Optional.
resources: {}

# Configure the dnsPolicy of the Velero deployment
# See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy: ClusterFirst

# Init containers to add to the Velero deployment's pod spec. At least one plugin provider image is required.
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.1.0 
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
  - name: velero-plugin-for-vsphere
    image: vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins

# SecurityContext to use for the Velero deployment. Optional.
# Set fsGroup for `AWS IAM Roles for Service Accounts`
# see more information at: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
securityContext: {}
  # fsGroup: 1337

# Pod priority class name to use for the Velero deployment. Optional.
priorityClassName: ""

# Tolerations to use for the Velero deployment. Optional.
tolerations: []

# Affinity to use for the Velero deployment. Optional.
affinity: {}

# Node selector to use for the Velero deployment. Optional.
nodeSelector: {}

# Extra volumes for the Velero deployment. Optional.
extraVolumes: []

# Extra volumeMounts for the Velero deployment. Optional.
extraVolumeMounts: []

# Settings for Velero's prometheus metrics. Enabled by default.
metrics:
  enabled: true
  scrapeInterval: 30s

  # Pod annotations for Prometheus
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8085"
    prometheus.io/path: "/metrics"

  serviceMonitor:
    enabled: false
    additionalLabels: {}

# Install CRDs as templates. Enabled by default.
installCRDs: true

# Enable/disable all helm hooks annotations
# You should disable this if using a deploy tool that doesn't support helm hooks,
# such as ArgoCD
enableHelmHooks: true

##
## End of deployment-related settings.
##

##
## Parameters for the `default` BackupStorageLocation and VolumeSnapshotLocation,
## and additional server settings.
##
configuration:
  # Cloud provider being used (e.g. aws, azure, gcp).
  provider: aws

  # Parameters for the `default` BackupStorageLocation. See
  # https://velero.io/docs/v1.5/api-types/backupstoragelocation/
  backupStorageLocation:
    # name is the name of the backup storage location where backups should be stored. If a name is not provided,
    # a backup storage location will be created with the name "default". Optional.
    # name:
    # provider is the name for the backup storage location provider. If omitted
    # `configuration.provider` will be used instead.
    provider: velero.io/aws
    # bucket is the name of the bucket to store backups in. Required.
    bucket: velero
    # caCert defines a base64 encoded CA bundle to use when verifying TLS connections to the provider.
    caCert: *omitted*
    # prefix is the directory under which all Velero data should be stored within the bucket. Optional.
    prefix:
    # Additional provider-specific configuration. See link above
    # for details of required/optional fields for your provider.
    config:
      region: minio
      s3ForcePathStyle: "true"
      s3Url: https://10.0.20.20:9000
    #  kmsKeyId:
    #  resourceGroup:
    #  The ID of the subscription containing the storage account, if different from the cluster’s subscription. (Azure only)
    #  subscriptionId:
    #  storageAccount:
    #  publicUrl:
    #  Name of the GCP service account to use for this backup storage location. Specify the
    #  service account here if you want to use workload identity instead of providing the key file.(GCP only)
    #  serviceAccount:

  # Parameters for the `default` VolumeSnapshotLocation. See
  # https://velero.io/docs/v1.5/api-types/volumesnapshotlocation/
  volumeSnapshotLocation:
    # name is the name of the volume snapshot location where snapshots are being taken. Required.
    # name:
    # provider is the name for the volume snapshot provider. If omitted
    # `configuration.provider` will be used instead.
    # provider:
    # Additional provider-specific configuration. See link above
    # for details of required/optional fields for your provider.
    config: {}
  #    region:
  #    apitimeout:
  #    resourceGroup:
  #    The ID of the subscription where volume snapshots should be stored, if different from the cluster’s subscription. If specified, also requires `configuration.volumeSnapshotLocation.config.resourceGroup`to be set. (Azure only)
  #    subscriptionId:
  #    snapshotLocation:
  #    project:

  # These are server-level settings passed as CLI flags to the `velero server` command. Velero
  # uses default values if they're not passed in, so they only need to be explicitly specified
  # here if using a non-default value. The `velero server` default values are shown in the
  # comments below.
  # --------------------
  # `velero server` default: 1m
  backupSyncPeriod:
  # `velero server` default: 1h
  resticTimeout:
  # `velero server` default: namespaces,persistentvolumes,persistentvolumeclaims,secrets,configmaps,serviceaccounts,limitranges,pods
  restoreResourcePriorities:
  # `velero server` default: false
  restoreOnlyMode:

  # additional key/value pairs to be used as environment variables such as "AWS_CLUSTER_NAME: 'yourcluster.domain.tld'"
  extraEnvVars: {}

  # Comma separated list of velero feature flags. default: empty
  features:

  # Set log-level for Velero pod. Default: info. Other options: debug, warning, error, fatal, panic.
  logLevel:

  # Set log-format for Velero pod. Default: text. Other option: json.
  logFormat:

  # Set to true to back up all pod volumes with restic without having to annotate the pod. Default: false. Other option: true.
  defaultVolumesToRestic:

##
## End of backup/snapshot location settings.
##

##
## Settings for additional Velero resources.
##

rbac:
  # Whether to create the Velero role and role binding to give all permissions to the namespace to Velero.
  create: true
  # Whether to create the cluster role binding to give administrator permissions to Velero
  clusterAdministrator: true

# Information about the Kubernetes service account Velero uses.
serviceAccount:
  server:
    create: true
    name: velero
    annotations:

# Info about the secret to be used by the Velero deployment, which
# should contain credentials for the cloud provider IAM account you've
# set up for Velero.
credentials:
  # Whether a secret should be used as the source of IAM account
  # credentials. Set to false if, for example, using kube2iam or
  # kiam to provide IAM credentials for the Velero pod.
  useSecret: true
  # Name of the secret to create if `useSecret` is true and `existingSecret` is empty
  name: cloud-credentials
  # Name of a pre-existing secret (if any) in the Velero namespace
  # that should be used to get IAM account credentials. Optional.
  existingSecret:
  # Data to be stored in the Velero secret, if `useSecret` is true and `existingSecret` is empty.
  # As of the current Velero release, Velero only uses one secret key/value at a time.
  # The key must be named `cloud`, and the value corresponds to the entire content of your IAM credentials file.
  # Note that the format will be different for different providers, please check their documentation.
  # Here is a list of documentation for plugins maintained by the Velero team:
  # [AWS] https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/main/README.md
  # [GCP] https://github.com/vmware-tanzu/velero-plugin-for-gcp/blob/main/README.md
  # [Azure] https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure/blob/main/README.md
  secretContents:
    cloud: |
      [default]
      aws_access_key_id=*omitted*
      aws_secret_access_key=*omitted*
  # additional key/value pairs to be used as environment variables such as "DIGITALOCEAN_TOKEN: <your-key>". Values will be stored in the secret.
  extraEnvVars: {}
  # Name of a pre-existing secret (if any) in the Velero namespace
  # that will be used to load environment variables into velero and restic.
  # Secret should be in format - https://kubernetes.io/docs/concepts/configuration/secret/#use-case-as-container-environment-variables
  extraSecretRef: ""

# Whether to create backupstoragelocation crd, if false => do not create a default backup location
backupsEnabled: true
# Whether to create volumesnapshotlocation crd, if false => disable snapshot feature
snapshotsEnabled: true

# Whether to deploy the restic daemonset.
deployRestic: false

restic:
  podVolumePath: /var/lib/kubelet/pods
  privileged: false
  # Pod priority class name to use for the Restic daemonset. Optional.
  priorityClassName: ""
  # Resource requests/limits to specify for the Restic daemonset deployment. Optional.
  resources: {}
  # Tolerations to use for the Restic daemonset. Optional.
  tolerations: []

  # Extra volumes for the Restic daemonset. Optional.
  extraVolumes: []

  # Extra volumeMounts for the Restic daemonset. Optional.
  extraVolumeMounts: []

  # Configure the dnsPolicy of the Restic daemonset
  # See: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
  dnsPolicy: ClusterFirst

  # SecurityContext to use for the Velero deployment. Optional.
  # Set fsGroup for `AWS IAM Roles for Service Accounts`
  # see more information at: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
  securityContext: {}
    # fsGroup: 1337

# Backup schedules to create.
# Eg:
# schedules:
#   mybackup:
#     labels:
#       myenv: foo
#     annotations:
#       myenv: foo
#     schedule: "0 0 * * *"
#     template:
#       ttl: "240h"
#       includedNamespaces:
#       - foo
schedules: {}

# Velero ConfigMaps.
# Eg:
# configMaps:
#   restic-restore-action-config:
#     labels:
#       velero.io/plugin-config: ""
#       velero.io/restic: RestoreItemAction
#     data:
#       image: velero/velero-restic-restore-helper:v1.3.1
configMaps: {}

##
## End of additional Velero resource settings.
##

Elegant996 commented 3 years ago

Same issue occurs when installing through velero itself:

velero install `
 --provider aws `
 --plugins velero/velero-plugin-for-aws:v1.1.0 `
 --bucket velero `
 --cacert ca.crt `
 --secret-file minio.secret  `
 --use-volume-snapshots=true `
 --backup-location-config region=default,s3ForcePathStyle="true",s3Url=https://minio.example.com:9000 `
 --snapshot-location-config region=default

Then adding in the plugin:

velero plugin add vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0

Seems like something isn't quite right...

lintongj commented 3 years ago

@Elegant996

Alright, the root cause seems to be different.

time="2021-01-15T21:34:15Z" level=error msg="Error backing up item" backup=velero/default-snap-backup2 error="error executing custom action (groupResource=persistentvolumeclaims, namespace=harbor-system, name=data-harbor-postgresql-0): rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink to CRD name" logSource="pkg/backup/backup.go:455" name=harbor-postgresql-0

Would you please share the SelfLink of the PVC object, harbor-system/data-harbor-postgresql-0? It should be available in the backup tarball in your object storage.

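Basically, velero-plugin-for-vsphere deliberately blocks backup/restore of certain types of k8s resources, and it identifies those types from the selfLink of the API object. However, the selfLink field seems to be unreliable starting from k8s 1.20. In your log the selfLink is empty, which is why the IsObjectBlocked check fails with "Could not translate selfLink  to CRD name".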

    // DEPRECATED
    // Kubernetes will stop propagating this field in 1.20 release and the field is planned
    // to be removed in 1.21 release.
    // +optional
    SelfLink string `json:"selfLink,omitempty" protobuf:"bytes,4,opt,name=selfLink"`

As a workaround, would you please use k8s 1.17 ~ 1.19 instead? In the meantime, we will be working on fixing this issue to support k8s 1.20 and above better.

Elegant996 commented 3 years ago

We'd have to redeploy on a lower version which is a bit problematic as we can't easily downgrade the cluster (no backups). We were going to switch to a blue/green approach once we had backups working. We'll delay that approach until this is fixed.

Unfortunately, the selfLink field appears to be absent from the PV:

{
  "apiVersion": "v1",
  "kind": "PersistentVolume",
  "metadata": {
    "annotations": {
      "pv.kubernetes.io/provisioned-by": "csi.vsphere.vmware.com"
    },
    "creationTimestamp": "2020-09-04T20:02:22Z",
    "finalizers": [
      "kubernetes.io/pv-protection",
      "external-attacher/csi-vsphere-vmware-com"
    ],
    "name": "pvc-b5e5c618-0a01-4f03-bfcc-2dacb9d0860b",
    "resourceVersion": "45231468",
    "uid": "aa3988a6-8594-4570-9fce-ae9fda460c0f"
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "capacity": {
      "storage": "8Gi"
    },
    "claimRef": {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "name": "data-harbor-postgresql-0",
      "namespace": "harbor-system",
      "resourceVersion": "45231314",
      "uid": "b5e5c618-0a01-4f03-bfcc-2dacb9d0860b"
    },
    "csi": {
      "driver": "csi.vsphere.vmware.com",
      "fsType": "ext4",
      "volumeAttributes": {
        "storage.kubernetes.io/csiProvisionerIdentity": "1597498711248-8081-csi.vsphere.vmware.com",
        "type": "vSphere CNS Block Volume"
      },
      "volumeHandle": "c30194fb-0edf-4307-8d0a-785dfe123609"
    },
    "persistentVolumeReclaimPolicy": "Delete",
    "storageClassName": "vsphere-dsc-nvme",
    "volumeMode": "Filesystem"
  },
  "status": {
    "phase": "Bound"
  }
}

It appears as though k8s 1.20 actually removed the selfLink field as part of the upgrade from 1.19.

Elegant996 commented 3 years ago

Quick update: I was able to work around this issue. Kubernetes 1.20 stops populating selfLink by default, but you can re-enable it by turning off the RemoveSelfLink feature gate on the API server with --feature-gates=RemoveSelfLink=false.

Will leave this issue open as this feature gate will be removed in Kubernetes 1.21 and is only a workaround for 1.20.