Lirt / velero-plugin-for-openstack

Openstack Cinder, Manila and Swift plugin for Velero backups
MIT License

Issues accessing the Swift object store #26

Closed albebeto closed 3 years ago

albebeto commented 3 years ago

Hi,

we would like to use Velero on Kubernetes clusters built on our OpenStack cloud platform, where the object storage is based on Ceph radosgw.

We have done the following:

$ velero install --provider "community.openstack.org/openstack" --plugins lirt/velero-plugin-for-openstack:v0.2.1 --bucket velero --no-secret
$ kubectl -n velero create secret generic openstack-cloud-credentials \
    --from-literal OS_REGION_NAME=$OS_REGION_NAME \
    --from-literal OS_USER_DOMAIN_NAME=$OS_USER_DOMAIN_NAME \
    --from-literal OS_PASSWORD=$OS_PASSWORD \
    --from-literal OS_AUTH_URL=$OS_AUTH_URL \
    --from-literal OS_USERNAME=$OS_USERNAME \
    --from-literal OS_INTERFACE=$OS_INTERFACE \
    --from-literal OS_PROJECT_NAME=$OS_PROJECT_NAME \
    --from-literal OS_PROJECT_ID=$OS_PROJECT_ID \
    --from-literal OS_DOMAIN_NAME=$OS_DOMAIN_NAME \
    -o yaml
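
Since the secret is built from shell variables, an unset variable would silently end up as an empty value. A minimal sketch of a pre-flight check, assuming the same nine OS_* names as in the command above; `missing_os_vars` is a hypothetical helper name, not part of Velero or the plugin:

```shell
# Hypothetical helper: print the name of every OS_* variable used for the
# secret that is unset or empty in the current shell.
missing_os_vars() {
    for name in OS_REGION_NAME OS_USER_DOMAIN_NAME OS_PASSWORD OS_AUTH_URL \
                OS_USERNAME OS_INTERFACE OS_PROJECT_NAME OS_PROJECT_ID \
                OS_DOMAIN_NAME; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}
```

Running `missing_os_vars` before creating the secret prints nothing when every variable is populated.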

We have checked that this set of credentials lets us authenticate to the project with both the openstack and swift clients.

kubectl edit deployment velero -n velero
...
    env:
        - name: OS_AUTH_URL
          valueFrom:
            secretKeyRef:
              key: OS_AUTH_URL
              name: openstack-cloud-credentials
       ...

The setup looks OK; however, velero backups fail:

$ velero backup create wordpress-backup --include-namespaces wordpress
Backup request "wordpress-backup" submitted successfully.
Run `velero backup describe wordpress-backup` or `velero backup logs wordpress-backup` for more details.

$ velero get backup
NAME               STATUS   ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
wordpress-backup   Failed   0        0          2021-07-28 14:48:26 +0000 UTC   29d       default            <none>

$ velero describe backup wordpress-backup
Name:         wordpress-backup
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.16.15
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=16
Phase:  Failed (run `velero backup logs wordpress-backup` for more information)
Errors:    0
Warnings:  0
Namespaces:
  Included:  wordpress
  Excluded:  <none>
Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto
Label selector:  <none>
Storage Location:  default
Velero-Native Snapshot PVs:  auto
TTL:  720h0m0s
Hooks:  <none>
Backup Format Version:  1.1.0
Started:    2021-07-28 14:48:26 +0000 UTC
Completed:  2021-07-28 14:48:30 +0000 UTC
Expiration:  2021-08-27 14:48:26 +0000 UTC
Total items to be backed up:  19
Items backed up:              19
Velero-Native Snapshots:  2 of 2 snapshots completed successfully (specify --details for more information)

In particular, velero manages to make snapshots of the persistent volumes, but cannot write into the object store.

Among the velero logs we found these messages:

time="2021-07-28T14:48:30Z" level=error msg="Error uploading log file" backup=wordpress-backup bucket=velero error="rpc error: code = Unknown desc = failed to create new object in bucket velero with key backups/wordpress-backup/wordpress-backup-logs.gz: Resource not found" logSource="pkg/persistence/object_store.go:231" prefix=

time="2021-07-28T14:48:30Z" level=error msg="backup failed" controller=backup error="rpc error: code = Unknown desc = failed to create new object in bucket velero with key backups/wordpress-backup/velero-backup.json: Resource not found" key=velero/wordpress-backup logSource="pkg/controller/backup_controller.go:281"

It looks like velero can read the container: we uploaded a test object to the container and found that velero complains about it:

time="2021-07-28T12:35:06Z" level=error msg="Current backup storage locations available/unavailable/unknown: 0/1/0, Backup storage location \"default\" is unavailable: Backup store contains invalid top-level directories: [test.png])" controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:164"
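
The message above is about an object sitting at the top level of the container. A rough sketch for listing such objects from a plain object listing; the set of allowed directory names below is an assumption about Velero's bucket layout and should be checked against the Velero version in use, and `invalid_top_level` is a hypothetical helper name:

```shell
# ASSUMPTION: top-level directory names Velero accepts in its bucket.
# Verify this list against the Velero version you run.
ALLOWED_TOP_LEVEL="backups restores restic metadata plugins"

# Hypothetical helper: read object names on stdin (one per line) and print
# each distinct top-level name that is not in the allowed list.
invalid_top_level() {
    while IFS= read -r object; do
        [ -n "$object" ] || continue
        top=${object%%/*}            # text before the first "/"
        case " $ALLOWED_TOP_LEVEL " in
            *" $top "*) ;;           # allowed, ignore
            *) echo "$top" ;;
        esac
    done | sort -u
}
```

It could be fed from the container listing, for example `swift list velero | invalid_top_level`.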

Can you help us understand why velero cannot use the object store to complete the backups?

Thank you in advance,

Alberto

Lirt commented 3 years ago

Hello @albebeto,

First of all, I'm sorry I missed that you created this issue in July.

Let me try to help. Let's go step by step:

First, use the same OpenStack credentials that you defined in the Kubernetes secret to run the commands below, and check that there is a bucket named velero:

$ swift list
velero
$ swift list velero
...

Second, in the velero log you should see the message Authentication successful. This tells us that authentication also works from the plugin side.

Note: you can simplify the deployment YAML to load all environment variables from the secret like this:

        envFrom:
        - secretRef:
            name: openstack-cloud-credentials

One reason could be that the Member role is not enough for velero to create and upload objects in a specific bucket. Can you try to create an object in the bucket with the swift CLI? Something like this checks that you have the correct rights:

$ echo "hello" > velero-test
$ swift upload velero velero-test       
$ rm velero-test
$ swift download velero velero-test
$ cat velero-test
hello

If nothing helps, can you please recreate the backup, wait until it fails again, and then post either the whole log as a file or just the errors grepped from the log?

$ kubectl -n velero logs velero-<POD_ID> | grep -i -e "error" -e "fail"
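
If the grep output is too noisy, a small filter can keep only the error="..." field of lines logged at level=error. This is a sketch based on the log format of the lines quoted earlier in this issue; `velero_error_fields` is a hypothetical helper name, and it does not handle values that contain escaped quotes:

```shell
# Hypothetical helper: read Velero log lines on stdin and print only the
# value of the error="..." field for lines logged at level=error.
# Lines without such a field are dropped.
velero_error_fields() {
    sed -n 's/.*level=error.* error="\([^"]*\)".*/\1/p'
}

# Example use (assumes the pod runs in the velero namespace):
#   kubectl -n velero logs velero-<POD_ID> | velero_error_fields
```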

Lirt commented 3 years ago

Closed because of no response from issue reporter.