Lirt / velero-plugin-for-openstack

Openstack Cinder, Manila and Swift plugin for Velero backups
MIT License

Deleting a "failed" Velero backup doesn't delete the associated volume backup #116

Open duy1600 opened 3 weeks ago

duy1600 commented 3 weeks ago

Describe the bug

Steps to reproduce the behavior

  1. Create a BackupStorageLocation with fake S3 credentials
  2. Create a Velero backup of everything, including PVs/PVCs, using "backup" as the method for the volumeSnapshotLocation
  3. Delete the failed backup with Velero
  4. Check OpenStack: the volume backup has not been deleted

Expected behavior

Deleting a "failed" backup should also remove the associated volume backup.

Used versions

Lirt commented 2 weeks ago

Hello @duy1600,

I don't understand this expectation, given the first reproduction step:

Create BackupStorageLocation with fake s3 credentials.

For OpenStack with Swift you have to configure the authentication file or environment variables with OpenStack RC or S3 credentials (S3 only if your Swift has the S3 API enabled). The credentials cannot be fake. If you cannot connect to Swift, Init will fail, the backup contents cannot be saved, and the failed backup cannot be deleted either (the backup object is created, but deletion will fail). If the volume backup somehow started anyway, you can only force-delete the backup, and that leaves an orphaned volume backup behind.

Out of curiosity I tried creating a BSL and a VSL, where the BSL had wrong credentials and the VSL had correct ones. The Velero log reported "plugin process exited", "backup failed" and "Authentication failed". No volume backup was created, because the code never got that far: it did not pass authentication. The main backup phase could not start either, and that is the phase that finds PVCs and triggers volume backups.

velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:51Z" level=info msg="ObjectStore.Init called" backup=kube-system/lirt-test-1 cmd=/plugins/velero-plugin-for-openstack config="map[bucket:velero-backup-redacted-location cloud:redacted-location prefix:list-cluster]" logSource="/go/src/github.com/Lirt/velero-plugin-for-openstack/src/swift/object_store.go:38" pluginName=velero-plugin-for-openstack
velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:51Z" level=info msg="Authentication will be done for cloud redacted-location" backup=kube-system/lirt-test-1 cmd=/plugins/velero-plugin-for-openstack logSource="/go/src/github.com/Lirt/velero-plugin-for-openstack/src/utils/auth.go:33" pluginName=velero-plugin-for-openstack
velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:51Z" level=info msg="Trying to authenticate against OpenStack using environment variables (including application credentials) or using files ~/.config/openstack/clouds.yaml, /etc/openstack/clouds.yaml and ./clouds.yaml" backup=kube-system/lirt-test-1 cmd=/plugins/velero-plugin-for-openstack logSource="/go/src/github.com/Lirt/velero-plugin-for-openstack/src/utils/auth.go:68" pluginName=velero-plugin-for-openstack
velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:52Z" level=info msg="plugin process exited" backup=kube-system/lirt-test-1 cmd=/velero id=266 logSource="pkg/plugin/clientmgmt/process/logrus_adapter.go:80" plugin=/velero
velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:52Z" level=error msg="backup failed" backuprequest=kube-system/lirt-test controller=backup error="rpc error: code = Unknown desc = failed to authenticate against OpenStack in object storage plugin: failed to authenticate: Authentication failed" logSource="pkg/controller/backup_controller.go:288"
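
For anyone comparing their own logs against the ones above, here is a minimal sketch that pulls the error-level entries out of such output. It assumes each line is logrus-style `key=value` text with double-quoted values, prefixed by the pod and container name as shown above. The helper names `parse_logrus_line` and `error_entries` are made up for this example and are not part of Velero or the plugin.

```python
import shlex


def parse_logrus_line(line):
    """Split one logrus text-format line into a dict of its key=value fields."""
    fields = {}
    # shlex keeps a quoted value (msg="backup failed") together as one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # tokens without "=" (pod and container prefix) are skipped
            fields[key] = value
    return fields


def error_entries(lines):
    """Return the parsed fields of every line logged at level=error."""
    parsed = (parse_logrus_line(line) for line in lines)
    return [fields for fields in parsed if fields.get("level") == "error"]


# The last two log lines from the output above, copied verbatim.
sample = [
    'velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:52Z" level=info msg="plugin process exited" backup=kube-system/lirt-test-1 cmd=/velero id=266 logSource="pkg/plugin/clientmgmt/process/logrus_adapter.go:80" plugin=/velero',
    'velero-544b5979f6-vpf9x velero time="2024-08-29T16:08:52Z" level=error msg="backup failed" backuprequest=kube-system/lirt-test controller=backup error="rpc error: code = Unknown desc = failed to authenticate against OpenStack in object storage plugin: failed to authenticate: Authentication failed" logSource="pkg/controller/backup_controller.go:288"',
]

errors = error_entries(sample)
for entry in errors:
    print(entry["time"], entry["msg"], "->", entry["error"])
```

In this sample the only error entry is the "backup failed" message from the backup controller, and its `error` field carries the authentication failure raised by the object storage plugin.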

From docs:

Can you explain how you got to this point and how exactly I can reproduce it?

duy1600 commented 2 weeks ago

Sorry, that was my fault; my description wasn't clear. Let me explain the first step again.

  1. Create a BSL with an S3 (MinIO) bucket that has object lock and SSE-C enabled, following velero-plugin-for-aws
  2. Create a Velero backup of everything, including PVs/PVCs, using "backup" as the method for the volumeSnapshotLocation. The backup ran, but when velero-plugin-for-aws uploaded the backup manifest to S3, the upload failed because of an issue with Content-MD5 and object lock. Velero marked the backup as failed. However, velero-plugin-for-openstack had already started the volume backup, which left an orphaned volume backup.
  3. Delete the failed backup with Velero: I created a DeleteBackupRequest and applied it with kubectl, and the failed backup was deleted.
  4. Check OpenStack: the volume backup has not been deleted.

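To make step 3 easier to reproduce, here is a rough sketch of the kind of DeleteBackupRequest involved, generated as JSON so it can be piped to `kubectl apply -f -`. The backup name `my-failed-backup`, the `velero` namespace, the `-delete` name suffix and the helper `delete_backup_request` are placeholders for illustration, not values from this report. Only `spec.backupName` is set here.

```python
import json


def delete_backup_request(backup_name, namespace="velero"):
    """Build a minimal velero.io/v1 DeleteBackupRequest manifest as a dict."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "DeleteBackupRequest",
        "metadata": {
            # Hypothetical naming scheme; any unique object name works.
            "name": f"{backup_name}-delete",
            "namespace": namespace,
        },
        # backupName points at the Backup object that should be deleted.
        "spec": {"backupName": backup_name},
    }


# Placeholder backup name; substitute the name of the failed backup.
manifest = delete_backup_request("my-failed-backup")
print(json.dumps(manifest, indent=2))
```

Saved as a script, the output could be applied with something like `python make_dbr.py | kubectl apply -f -` (the script name is also just an example).
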
Lirt commented 2 weeks ago

Thank you for the explanation. Yes, it makes sense.

I will check if it's possible to overcome this.