vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

velero BackupStorageLocation is unavailable due to: BucketRegionError: incorrect region, the bucket is not in 'us-west-2' region #5511

Closed jhuisss closed 1 year ago

jhuisss commented 1 year ago

Describe the bug

I successfully configured a MinIO object storage server with Velero, but when I try to configure AWS S3, the Velero BackupStorageLocation CR becomes unavailable due to: BucketRegionError: incorrect region, the bucket is not in 'us-west-2' region

To Reproduce: with the AWS URL configured as s3Url, the Velero BackupStorageLocation CR becomes unavailable:

# kubectl get bsl -n velero -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: BackupStorageLocation
  metadata:
    creationTimestamp: "2022-10-28T03:58:38Z"
    generation: 60
    labels:
      component: velero
      kapp.k14s.io/app: "1666929515165133677"
      kapp.k14s.io/association: v1.f9fe7a851b6026fb4d2c9c83393cffbd
    name: default
    namespace: velero
    resourceVersion: "1110231"
    uid: fb815119-4420-41fe-ab4b-a60a070523dc
  spec:
    accessMode: ReadWrite
    backupSyncPeriod: 1m
    config:
      region: us-west-2
      s3ForcePathStyle: "true"
      s3Url: https://s3.amazonaws.com/
    default: true
    objectStorage:
      bucket: xkou-br-tca
    provider: aws
    validationFrequency: 1m
  status:
    lastValidationTime: "2022-10-28T04:59:11Z"
    message: "BackupStorageLocation \"default\" is unavailable: rpc error: code =
      Unknown desc = BucketRegionError: incorrect region, the bucket is not in 'us-west-2'
      region at endpoint ''\n\tstatus code: 301, request id: , host id: "
    phase: Unavailable
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

But we are quite sure that the bucket "xkou-br-tca" is in the 'us-west-2' region.
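One way to double-check which region a bucket actually lives in is the AWS CLI (a sketch, assuming the CLI is configured with credentials that can read the bucket):

```shell
# Returns the bucket's region; a "null" LocationConstraint means us-east-1
aws s3api get-bucket-location --bucket xkou-br-tca
```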

The Velero pod logs also show:

time="2022-10-28T05:09:41Z" level=error msg="Error listing backups in backup store" backupLocation=default controller=backup-sync error="rpc error: code = Unknown desc = BucketRegionError: incorrect region, the bucket is not in 'us-west-2' region at endpoint ''\n\tstatus code: 301, request id: , host id: " error.file="/go/src/velero-plugin-for-aws/velero-plugin-for-aws/object_store.go:426" error.function="main.(*ObjectStore).ListCommonPrefixes" logSource="pkg/controller/backup_sync_controller.go:189"
time="2022-10-28T05:09:41Z" level=error msg="fail to validate backup store" backup-storage-location=velero/default controller=backup-storage-location error="rpc error: code = Unknown desc = BucketRegionError: incorrect region, the bucket is not in 'us-west-2' region at endpoint ''\n\tstatus code: 301, request id: , host id: " error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:191" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).IsValid" logSource="pkg/controller/backup_storage_location_controller.go:134"

Expected behavior

BackupStorageLocation is available.

Troubleshooting Information

Velero server version: 1.9.2
AWS plugin version: v1.5.1
vSphere plugin version: v1.4.0
Kubernetes: Vanilla (TKGm workload cluster)
Kubernetes version:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.8+vmware.2", GitCommit:"d625665700d901401d7504761cf473dd8adfc7b3", GitTreeState:"clean", BuildDate:"2022-07-18T23:04:08Z", GoVersion:"go1.17.11", Compiler:"gc", Platform:"linux/amd64"}
Lyndon-Li commented 1 year ago

The s3Url you specified is not correct. Actually, for AWS S3 you don't need to specify s3Url at all. Remove it and let Velero determine the endpoint.
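For reference, a minimal BSL spec for native AWS S3 (a sketch based on the config above; the AWS plugin derives the endpoint from the region, so neither s3Url nor s3ForcePathStyle is needed):

```yaml
spec:
  provider: aws
  default: true
  objectStorage:
    bucket: xkou-br-tca
  config:
    region: us-west-2   # endpoint is derived from this; no s3Url needed
```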

jhuisss commented 1 year ago

Hi, thanks for your reply! But even if I don't specify the s3Url, the Velero server pod logs show:

time="2022-11-01T01:57:49Z" level=info msg="Validating BackupStorageLocation" backup-storage-location=velero/default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:131"
time="2022-11-01T01:57:50Z" level=error msg="fail to validate backup store" backup-storage-location=velero/default controller=backup-storage-location error="rpc error: code = Unknown desc = InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records.\n\tstatus code: 403, request id: ZAFN71NKVD9FKCWF, host id: gI/zuN71ux+fjW7PM+gvVLdecT+gj3Rd7zlrpC2G90PbOXhmMSl6KVj13wx8k/tc423bkx/5Ppk=" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:191" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).IsValid" logSource="pkg/controller/backup_storage_location_controller.go:134"
time="2022-11-01T01:57:50Z" level=info msg="BackupStorageLocation is invalid, marking as unavailable" backup-storage-location=velero/default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:110"

$ kubectl get bsl -n velero
NAME      PHASE       LAST VALIDATED   AGE   DEFAULT
default   Available   13m              12d   true
$ kubectl get bsl -n velero -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: BackupStorageLocation
  metadata:
    annotations:
      kapp.k14s.io/identity: v1;velero/velero.io/BackupStorageLocation/default;velero.io/v1
      kapp.k14s.io/original: '{"apiVersion":"velero.io/v1","kind":"BackupStorageLocation","metadata":{"creationTimestamp":null,"labels":{"component":"velero","kapp.k14s.io/app":"1666156983602966143","kapp.k14s.io/association":"v1.f9fe7a851b6026fb4d2c9c83393cffbd"},"name":"default","namespace":"velero"},"spec":{"accessMode":"ReadWrite","backupSyncPeriod":"1m","config":{"region":"us-west-2","s3ForcePathStyle":"true"},"default":true,"objectStorage":{"bucket":"xkou-br-tca"},"provider":"aws","validationFrequency":"1m"}}'
      kapp.k14s.io/original-diff-md5: c6e94dc94aed3401b5d0f26ed6c0bff3
    creationTimestamp: "2022-10-19T05:23:07Z"
    generation: 3
    labels:
      component: velero
      kapp.k14s.io/app: "1666156983602966143"
      kapp.k14s.io/association: v1.f9fe7a851b6026fb4d2c9c83393cffbd
    name: default
    namespace: velero
    resourceVersion: "5195262"
    uid: 69be4d84-cccd-4860-9ead-cb0e35b56e98
  spec:
    accessMode: ReadWrite
    backupSyncPeriod: 1m
    config:
      region: us-west-2
      s3ForcePathStyle: "true"
    default: true
    objectStorage:
      bucket: xkou-br-tca
    provider: aws
    validationFrequency: 1m
  status:
    lastSyncedTime: "2022-11-01T01:48:51Z"
    lastValidationTime: "2022-11-01T01:48:51Z"
    phase: Available
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Lyndon-Li commented 1 year ago

Are you using temporary AWS credentials? If so, you need to specify aws_session_token in the Velero credentials file.
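For reference, the AWS shared-credentials-file format with a session token looks like this (a sketch with placeholder values):

```ini
[default]
aws_access_key_id = <ACCESS_KEY_ID>
aws_secret_access_key = <SECRET_ACCESS_KEY>
aws_session_token = <SESSION_TOKEN>
```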

jhuisss commented 1 year ago

Yes, these are temporary AWS credentials. Let me try adding aws_session_token to the Velero credentials file first.

jhuisss commented 1 year ago

Thanks Li for your quick reply! The bsl and vsl status look normal after specifying aws_session_token in the Velero credentials file.

$ kubectl get bsl -n velero -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: BackupStorageLocation
  metadata:
    generation: 113
    labels:
      component: velero
      kapp.k14s.io/app: "1667269853229087310"
      kapp.k14s.io/association: v1.f9fe7a851b6026fb4d2c9c83393cffbd
    name: default
    namespace: velero
    resourceVersion: "5225643"
    uid: 013af71e-28a4-45b9-9000-91996ba260e8
  spec:
    accessMode: ReadWrite
    backupSyncPeriod: 1m
    config:
      region: us-west-2
      s3ForcePathStyle: "true"
    default: true
    objectStorage:
      bucket: xkou-br-tca
    provider: aws
    validationFrequency: 1m
  status:
    lastSyncedTime: "2022-11-01T03:34:47Z"
    lastValidationTime: "2022-11-01T03:34:50Z"
    phase: Available
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
$ kubectl get volumesnapshotlocation -n velero -o yaml
apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: VolumeSnapshotLocation
  metadata:
    generation: 2
    labels:
      component: velero
      kapp.k14s.io/app: "1667269853229087310"
      kapp.k14s.io/association: v1.ad22d5b4912d8c41372647d68d08bbea
    name: default
    namespace: velero
    resourceVersion: "5208192"
    uid: dedb0619-6617-4384-ac62-d57ac005e25f
  spec:
    config:
      bucket: xkou-br-tca
      region: us-west-2
    provider: vsphere
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

But when I test the Velero backup and restore functions, I find the PV data is lost after a Velero restore. Should a temporary AWS account work just as well as a long-term one?

Lyndon-Li commented 1 year ago

Looks like the provider is incorrect, provider: vsphere

jhuisss commented 1 year ago

Looks like the provider is incorrect, provider: vsphere

We installed Velero with velero-plugin-for-vsphere to handle PV backup and restore. So to my understanding, the volume snapshot provider should be set to vsphere, not aws, right?

Lyndon-Li commented 1 year ago

If you are a Tanzu user, please open a ticket directly. Here we can only handle Velero problems; your current problem involves the vSphere plugin, so it needs a joint effort across teams. Thanks for your understanding.

jhuisss commented 1 year ago

Thanks, I'd like to provide more info here: when the bsl and vsl status were normal, I triggered a backup with velero backup create b-nginx-pv --include-namespaces nginx-example, and it looks like the backup succeeded:

$ velero backup get
NAME            STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
b-nginx-pv      Completed         0        1          2022-11-01 07:13:29 +0000 UTC   29d       default            <none>

But when I check the snapshots CR and the uploads CR, there are errors:

$ kubectl get snapshots snap-5d1db395-4ad6-45c2-9b54-9e8363bdf6dc -n nginx-example -oyaml
apiVersion: backupdriver.cnsdp.vmware.com/v1alpha1
kind: Snapshot
metadata:
  creationTimestamp: "2022-11-01T07:13:49Z"
  generation: 1
  labels:
    velero.io/backup-name: b-nginx-pv
    velero.io/exclude-from-backup: "true"
  name: snap-5d1db395-4ad6-45c2-9b54-9e8363bdf6dc
  namespace: nginx-example
  resourceVersion: "5288901"
  uid: 8e7d699c-7e4c-4f34-80d2-178e37ad00b4
spec:
  backupRepository: br-870c699f-4604-43ca-8066-f146063c79d1
  resourceHandle:
    apiGroup: ""
    kind: PersistentVolumeClaim
    name: nginx-logs
status:
  metadata: Cs4LCgpuZ2lueC1sb2dzEgAaDW5naW54LWV4YW1wbGUiACokMzUzYzNkYjktZTJiOC00MzRlLTg5MTktNmQ5NDlkYTUzOGJjMgc1Mjg3NzE3OABCCAjIjIObBhAAWgwKA2FwcBIFbmdpbnhixQIKMGt1YmVjdGwua3ViZXJuZXRlcy5pby9sYXN0LWFwcGxpZWQtY29uZmlndXJhdGlvbhKQAnsiYXBpVmVyc2lvbiI6InYxIiwia2luZCI6IlBlcnNpc3RlbnRWb2x1bWVDbGFpbSIsIm1ldGFkYXRhIjp7ImFubm90YXRpb25zIjp7fSwibGFiZWxzIjp7ImFwcCI6Im5naW54In0sIm5hbWUiOiJuZ2lueC1sb2dzIiwibmFtZXNwYWNlIjoibmdpbngtZXhhbXBsZSJ9LCJzcGVjIjp7ImFjY2Vzc01vZGVzIjpbIlJlYWRXcml0ZU9uY2UiXSwicmVzb3VyY2VzIjp7InJlcXVlc3RzIjp7InN0b3JhZ2UiOiI1ME1pIn19LCJzdG9yYWdlQ2xhc3NOYW1lIjoidnNwaGVyZS1jc2kifX0KYiYKH3B2Lmt1YmVybmV0ZXMuaW8vYmluZC1jb21wbGV0ZWQSA3llc2IrCiRwdi5rdWJlcm5ldGVzLmlvL2JvdW5kLWJ5LWNvbnRyb2xsZXISA3llc2JHCi12b2x1bWUuYmV0YS5rdWJlcm5ldGVzLmlvL3N0b3JhZ2UtcHJvdmlzaW9uZXISFmNzaS52c3BoZXJlLnZtd2FyZS5jb21iQgoodm9sdW1lLmt1YmVybmV0ZXMuaW8vc3RvcmFnZS1wcm92aXNpb25lchIWY3NpLnZzcGhlcmUudm13YXJlLmNvbXIca3ViZXJuZXRlcy5pby9wdmMtcHJvdGVjdGlvbnoAigG4AgoXa3ViZS1jb250cm9sbGVyLW1hbmFnZXISBlVwZGF0ZRoCdjEiCAjIjIObBhAAMghGaWVsZHNWMTr6AQr3AXsiZjptZXRhZGF0YSI6eyJmOmFubm90YXRpb25zIjp7ImY6cHYua3ViZXJuZXRlcy5pby9iaW5kLWNvbXBsZXRlZCI6e30sImY6cHYua3ViZXJuZXRlcy5pby9ib3VuZC1ieS1jb250cm9sbGVyIjp7fSwiZjp2b2x1bWUuYmV0YS5rdWJlcm5ldGVzLmlvL3N0b3JhZ2UtcHJvdmlzaW9uZXIiOnt9LCJmOnZvbHVtZS5rdWJlcm5ldGVzLmlvL3N0b3JhZ2UtcHJvdmlzaW9uZXIiOnt9fX0sImY6c3BlYyI6eyJmOnZvbHVtZU5hbWUiOnt9fX1CAIoBmAEKF2t1YmUtY29udHJvbGxlci1tYW5hZ2VyEgZVcGRhdGUaAnYxIggIyIyDmwYQADIIRmllbGRzVjE6VQpTeyJmOnN0YXR1cyI6eyJmOmFjY2Vzc01vZGVzIjp7fSwiZjpjYXBhY2l0eSI6eyIuIjp7fSwiZjpzdG9yYWdlIjp7fX0sImY6cGhhc2UiOnt9fX1CBnN0YXR1c4oBwAIKGWt1YmVjdGwtY2xpZW50LXNpZGUtYXBwbHkSBlVwZGF0ZRoCdjEiCAjIjIObBhAAMghGaWVsZHNWMTqAAgr9AXsiZjptZXRhZGF0YSI6eyJmOmFubm90YXRpb25zIjp7Ii4iOnt9LCJmOmt1YmVjdGwua3ViZXJuZXRlcy5pby9sYXN0LWFwcGxpZWQtY29uZmlndXJhdGlvbiI6e319LCJmOmxhYmVscyI6eyIuIjp7fSwiZjphcHAiOnt9fX0sImY6c3BlYyI6eyJmOmFjY2Vzc01vZGVzIjp7fSwiZjpyZXNvdXJjZXMiOnsiZjpyZXF1ZXN0cyI6eyIuIjp7fSwiZjpzdG9yYWdlIjp7fX19LCJmOnN0b3JhZ2VDbGFzc05hbWUiOnt9LCJmOnZvbHVtZU1vZGUiOnt9fX1CABJn
Cg1SZWFkV3JpdGVPbmNlEhMSEQoHc3RvcmFnZRIGCgQ1ME1pGihwdmMtMzUzYzNkYjktZTJiOC00MzRlLTg5MTktNmQ5NDlkYTUzOGJjKgt2c3BoZXJlLWNzaTIKRmlsZXN5c3RlbRopCgVCb3VuZBINUmVhZFdyaXRlT25jZRoRCgdzdG9yYWdlEgYKBDUwTWk=
  phase: UploadFailed
  progress: {}
  snapshotID: pvc:nginx-example/nginx-logs:aXZkOmU2NDg3MTZjLTA0ZTYtNDE0Mi1iY2E4LWM4MWY3NTY4ZWUxNTpjMDI4YTljMy1jYzRhLTRkOTItOGVmYi1kY2M0ZTYxN2Y3ODM
  svcSnapshotName: ""
$ kubectl get uploads -n velero upload-c028a9c3-cc4a-4d92-8efb-dcc4e617f783 -oyaml
apiVersion: datamover.cnsdp.vmware.com/v1alpha1
kind: Upload
metadata:
  creationTimestamp: "2022-11-01T07:13:55Z"
  generation: 7
  labels:
    velero.io/exclude-from-backup: "true"
  name: upload-c028a9c3-cc4a-4d92-8efb-dcc4e617f783
  namespace: velero
  resourceVersion: "5289184"
  uid: 31abd8fa-903a-443e-bd4c-f4e65f073044
spec:
  backupRepository: br-870c699f-4604-43ca-8066-f146063c79d1
  backupTimestamp: "2022-11-01T07:13:55Z"
  snapshotID: ivd:e648716c-04e6-4142-bca8-c81f7568ee15:c028a9c3-cc4a-4d92-8efb-dcc4e617f783
  snapshotReference: nginx-example/snap-5d1db395-4ad6-45c2-9b54-9e8363bdf6dc
status:
  completionTimestamp: "2022-11-01T07:14:59Z"
  currentBackOff: 4
  message: "Failed to upload snapshot, ivd:e648716c-04e6-4142-bca8-c81f7568ee15:c028a9c3-cc4a-4d92-8efb-dcc4e617f783,
    to durable object storage. Failed to delete peinfo from bucket \"xkou-br-tca\":
    Unable to delete object \"plugins/vsphere-astrolabe-repo/ivd/peinfo/ivd:e648716c-04e6-4142-bca8-c81f7568ee15:c028a9c3-cc4a-4d92-8efb-dcc4e617f783\"
    from bucket \"xkou-br-tca\": InvalidAccessKeyId: The AWS Access Key Id you provided
    does not exist in our records.\n\tstatus code: 403, request id: JWP67JZCSBR8APQ2,
    host id: 7nznIttamWEZqG/M6FyUMLm6eETqBnQuaqIqRRnXh832emRxe1vcDkR8ahmjNfbpQD+9XxP94uE="
  nextRetryTimestamp: "2022-11-01T07:18:59Z"
  phase: UploadError
  processingNode: np-wc2source-6868bf59c4-nh5sd
  progress: {}
  retryCount: 3
  startTimestamp: "2022-11-01T07:13:55Z"

Actually, I'm using temporary AWS credentials, but I have specified aws_session_token in the Velero credentials file, and the BSL is available.

Lyndon-Li commented 1 year ago

As far as we know, the vSphere plugin doesn't support aws_session_token, so temporary credentials will not work with the vSphere plugin. Please refer to this issue .
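If long-term credentials are an option, switching the Velero credentials file to static keys without a session token is the usual workaround (a sketch with placeholder values, based on the limitation described above):

```ini
[default]
aws_access_key_id = <LONG_TERM_ACCESS_KEY_ID>
aws_secret_access_key = <LONG_TERM_SECRET_ACCESS_KEY>
```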

jhuisss commented 1 year ago

Thanks, I got it. I'll close the issue; thanks for your great support!