Open mpsOxygen opened 1 year ago
@mpsOxygen
Regarding s3Url: if it is specified, Velero expects it to include everything needed to access the object store, for example s3-<region>.amazonaws.com, which means the region is embedded in the s3Url.
In other words, if s3Url is not empty, the region specified separately in the BSL is not honored.
This is the current behavior of the code. Could you check whether you can modify the s3Url for your environment to include the region info? Also, the Restic path for file system backup will be phased out in upcoming Velero releases, so I suggest trying the Kopia path, which will become the default in upcoming releases.
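To make the behavior described above concrete, here is a minimal shell sketch of how the restic repository identifier appears to be assembled. It is inferred only from the log lines in this thread (s3:s3-eu-west-1.amazonaws.com/<bucket>/restic/... when no s3Url is set, and s3:http://ecs.metaminds.com:9021/velero/restic/... when one is set). The function name restic_repo_prefix is hypothetical and this is not Velero's actual code.

```shell
#!/bin/sh
# Hypothetical helper, inferred from the log output in this thread.
# It is NOT Velero's implementation.
# Usage: restic_repo_prefix <bucket> <region> [s3Url]
restic_repo_prefix() {
  bucket="$1"
  region="$2"
  s3url="${3:-}"
  if [ -n "$s3url" ]; then
    # s3Url is set: it is used as-is and the region is not consulted.
    printf 's3:%s/%s/restic\n' "${s3url%/}" "$bucket"
  else
    # No s3Url: the region is embedded in the default AWS endpoint.
    printf 's3:s3-%s.amazonaws.com/%s/restic\n' "$region" "$bucket"
  fi
}

restic_repo_prefix velero emcreg http://ecs.metaminds.com:9021
restic_repo_prefix my-bucket eu-west-1
```

In the first call the region argument has no effect on the result, which matches the observation that a separately configured region is ignored once s3Url is present.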
If I understand correctly you are saying I should use s3-emcreg.ecs.metaminds.com? I will test that out with the kind cluster.
We made a workaround for the problem using our F5 to make the ECS respond to region requests with emcreg. It was a pretty simple iRule, but I feel the region option should be honored. I haven't tested Kopia, but we did test Kasten (which uses Kopia) and had the exact same problem, which we solved with the F5 iRule.
@mpsOxygen Could you confirm whether setting the s3Url to s3-emcreg.ecs.metaminds.com works?
I've tried it like this and it says no such bucket exists (the bucket named velero does exist on the ECS):
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--backup-location-config region=emcreg,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
@mpsOxygen From the error, it looks like this time it connected to the object store service, but the existing bucket doesn't match the given region.
Could you try the Kopia path by reinstalling Velero with the command below?
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--uploader-type kopia \
--backup-location-config region=emcreg,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
Then run a file system backup; it will use the Kopia path.
@mpsOxygen Please also add s3ForcePathStyle=true to the BSL config and try both the restic and Kopia paths.
For the restic path, the installation is as below:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--backup-location-config region=emcreg,s3ForcePathStyle=true,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
For the Kopia path, the installation is as below:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--uploader-type kopia \
--backup-location-config region=emcreg,s3ForcePathStyle=true,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--backup-location-config region=emcreg,s3ForcePathStyle=true,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
This one fails with no such bucket:
time="2023-07-05T09:20:52Z" level=error msg="Error listing backups in backup store" backupLocation=velero/default controller=backup-sync error="rpc error: code = Unknown desc = NoSuchBucket: The specified bucket does not exist\n\tstatus code: 404, request id: ac1e420b:188b3cb4945:98d0:1, host id: " error.file="/go/src/velero-plugin-for-aws/velero-plugin-for-aws/object_store.go:426" error.function="main.(*ObjectStore).ListCommonPrefixes" logSource="pkg/controller/backup_sync_controller.go:107"
I've checked the credentials and the bucket with CyberDuck and it's all there.
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--uploader-type kopia \
--backup-location-config region=emcreg,s3ForcePathStyle=true,s3Url=http://s3-emcreg.ecs.metaminds.com:9021,insecureSkipTLSVerify=true
This one fails with no such bucket as well:
time="2023-07-05T09:24:22Z" level=error msg="Error listing backups in backup store" backupLocation=velero/default controller=backup-sync error="rpc error: code = Unknown desc = NoSuchBucket: The specified bucket does not exist\n\tstatus code: 404, request id: ac1e420e:188b3cb4ecf:97cc:c8e, host id: " error.file="/go/src/velero-plugin-for-aws/velero-plugin-for-aws/object_store.go:426" error.function="main.(*ObjectStore).ListCommonPrefixes" logSource="pkg/controller/backup_sync_controller.go:107"
I've also captured the traffic with Wireshark, and the requests do seem to include the region correctly, but I can't figure out why it says no such bucket.
Later edit: I dug a bit more with Wireshark and it looks like it's searching for a bucket named emcreg/velero instead of just velero.
@mpsOxygen
For the Kopia path, could you try the normal endpoint, http://ecs.metaminds.com:9021, as the s3Url? On the Kopia path the region can be set separately, so it doesn't need to be embedded in the s3Url. The installation command is below:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--uploader-type kopia \
--backup-location-config region=emcreg,s3ForcePathStyle=true,s3Url=http://ecs.metaminds.com:9021,insecureSkipTLSVerify=true
Still fails without the hack for the region on the F5:
kubectl logs deployment/velero -n velero | grep error
Defaulted container "velero" out of: velero, velero-velero-plugin-for-aws (init)
time="2023-08-23T08:00:56Z" level=error msg="Error listing backups in backup store" backupLocation=velero/default controller=backup-sync error="rpc error: code = Unknown desc = RequestError: send request failed\ncaused by: Get \"http://ecs.metaminds.com:9021/velero?delimiter=%2F&list-type=2&prefix=backups%2F\": EOF" error.file="/go/src/velero-plugin-for-aws/velero-plugin-for-aws/object_store.go:426" error.function="main.(*ObjectStore).ListCommonPrefixes" logSource="pkg/controller/backup_sync_controller.go:107"
time="2023-08-23T08:00:57Z" level=error msg="fail to validate backup store" backup-storage-location=velero/default controller=backup-storage-location error="rpc error: code = Unknown desc = RequestError: send request failed\ncaused by: Get \"http://ecs.metaminds.com:9021/velero?delimiter=%2F&list-type=2&prefix=\": EOF" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:198" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).IsValid" logSource="pkg/controller/backup_storage_location_controller.go:155"
time="2023-08-23T08:00:57Z" level=error msg="Current BackupStorageLocations available/unavailable/unknown: 0/0/1, BackupStorageLocation \"default\" is unavailable: rpc error: code = Unknown desc = RequestError: send request failed\ncaused by: Get \"http://ecs.metaminds.com:9021/velero?delimiter=%2F&list-type=2&prefix=\": EOF)" controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:192"
We have given up on the DellEMC ECS and are going to use MinIO instead. Thanks for all the help.
I've hit a similar issue with cross-region AWS S3 (I'm trying to restore a backup from S3 in eu-west-1 to EKS in eu-central-1):
time="2023-08-30T20:13:43Z" level=error msg="unable to successfully complete pod volume restores of pod's volumes" error="backup repository is not ready: error running command=restic init --repo=s3:s3-eu-west-1.amazonaws.com/{{ BUCKET_NAME }}/restic/my-test-backups --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic, stdout=, stderr=Fatal: create repository at s3:s3-eu-west-1.amazonaws.com/{{ BUCKET_NAME }}/restic/my-test-backups failed: client.BucketExists: 301 Moved Permanently\n\n: exit status 1" logSource="pkg/restore/restore.go:1699" restore=velero/test-20230830221335
With the following backupstoragelocations.velero.io:
➜ k -n velero get backupstoragelocations.velero.io default -o yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  annotations:
    meta.helm.sh/release-name: velero
    meta.helm.sh/release-namespace: velero
  creationTimestamp: "2023-08-30T18:33:01Z"
  generation: 250
  labels:
    app.kubernetes.io/instance: velero
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: velero
    helm.sh/chart: velero-5.0.2
  name: default
  namespace: velero
  resourceVersion: "14228053"
  uid: 1483aea5-0a73-4edf-a58e-791e8eba6083
spec:
  accessMode: ReadWrite
  config:
    region: eu-west-1
  default: true
  objectStorage:
    bucket: {{ BUCKET_NAME }}
  provider: aws
status:
  lastSyncedTime: "2023-08-30T20:54:53Z"
  lastValidationTime: "2023-08-30T20:55:13Z"
  phase: Available
But this command works:
➜ k -n velero exec -it node-agent-n6gtk -- restic init --repo=s3:s3-eu-west-1.amazonaws.com/{{ BUCKET_NAME }}/restic/my-test-backups -o s3.region=eu-west-1 --cache-dir=/scratch/.cache/restic
enter password for new repository:
enter password again:
created restic repository 292b3b2667 at s3:s3-eu-west-1.amazonaws.com/{{ BUCKET_NAME }}/restic/my-test-backups
So the issue definitely exists. Should I create a separate issue for it? Or maybe you can recommend a workaround?
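The only visible difference between the failing command in the log and the manual command that works is the -o s3.region=... option. A small shell sketch of that difference follows. build_restic_init is a hypothetical helper that only prints a command line and does not run restic, and BUCKET is a placeholder for the real bucket name.

```shell
#!/bin/sh
# Hypothetical helper: prints a restic init command line, optionally with
# the region passed through restic's -o s3.region=<region> option.
# It only prints the command; it does not execute restic.
# Usage: build_restic_init <repo> [region]
build_restic_init() {
  repo="$1"
  region="${2:-}"
  cmd="restic init --repo=$repo"
  if [ -n "$region" ]; then
    cmd="$cmd -o s3.region=$region"
  fi
  printf '%s\n' "$cmd"
}

# Placeholder repository; substitute the real bucket and path.
build_restic_init "s3:s3-eu-west-1.amazonaws.com/BUCKET/restic/my-test-backups"
build_restic_init "s3:s3-eu-west-1.amazonaws.com/BUCKET/restic/my-test-backups" eu-west-1
```

The first line printed corresponds to the form seen in the error log (no region option); the second corresponds to the manual invocation that succeeded.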
@natkondrashova This doesn't look like the same problem as the original one, so please open a new issue. Also, I'm not sure which version you are using; if it isn't the latest, please try the latest version first.
What steps did you take and what happened:
Created a kind cluster and installed velero with the following command:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.6.1 \
--bucket velero \
--use-node-agent \
--default-volumes-to-fs-backup \
--secret-file ./velero-creds \
--backup-location-config region=emcreg,s3ForcePathStyle="true",s3Url=http://ecs.metaminds.com:9021,insecureSkipTLSVerify=true \
--snapshot-location-config region=emcreg,s3ForcePathStyle="true",s3Url=http://ecs.metaminds.com:9021,insecureSkipTLSVerify=true
It installs fine, but when I run velero backup create, restic init fails with errors.
What did you expect to happen: The velero backup create command to work without errors.
The following information will help us better understand what's going on:
If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue. For more options, please refer to velero debug --help.
If you are using earlier versions:
Please provide the output of the following commands (pasting long output into a GitHub gist or other pastebin is fine):
kubectl logs deployment/velero -n velero
https://pastebin.com/Sct1ugrX
velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
Name: bibi
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.25.3
  velero.io/source-cluster-k8s-major-version=1
  velero.io/source-cluster-k8s-minor-version=25
Phase: PartiallyFailed (run `velero backup logs bibi` for more information)
Errors:
  Velero:
    name: /envoy-9kbh4 error: /failed to wait BackupRepository: backup repository is not ready: error running command=restic init --repo=s3:http://ecs.metaminds.com:9021/velero/restic/projectcontour --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic --insecure-tls=true, stdout=, stderr=Fatal: create key in repository at s3:http://ecs.metaminds.com:9021/velero/restic/projectcontour failed: Stat: Access Denied.
    : exit status 1
    name: /node-agent-dgfr4 error: /failed to wait BackupRepository: backup repository is not ready: error running command=restic init --repo=s3:http://ecs.metaminds.com:9021/velero/restic/velero --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic --insecure-tls=true, stdout=, stderr=Fatal: create key in repository at s3:http://ecs.metaminds.com:9021/velero/restic/velero failed: Stat: Access Denied.
    : exit status 1
    name: /velero-7db7f89669-9h7kv error: /backup repository is not ready: error running command=restic init --repo=s3:http://ecs.metaminds.com:9021/velero/restic/velero --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic --insecure-tls=true, stdout=, stderr=Fatal: create key in repository at s3:http://ecs.metaminds.com:9021/velero/restic/velero failed: Stat: Access Denied.
    : exit status 1
  Cluster:
  Namespaces:
Namespaces:
  Included: *
  Excluded:
Resources:
  Included: *
  Excluded:
  Cluster-scoped: auto
Label selector:
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 1h0m0s
Hooks:
Backup Format Version: 1.1.0
Started: 2023-05-09 13:10:19 +0300 EEST
Completed: 2023-05-09 13:10:27 +0300 EEST
Expiration: 2023-06-08 13:10:19 +0300 EEST
Velero-Native Snapshots:
velero backup logs <backupname>
https://pastebin.com/iLAtgCU0
Anything else you would like to add:
The problem is that Velero does not pass the region information to restic when it calls restic init. Because the DellEMC ECS does not have a region, you need to pass one to restic using the switch "-o s3.region=emcreg". The value of s3.region can be anything and restic init succeeds. Without it, the region is left blank and the Authorization header contains white space where the region should be, which leads to a failed authorization.
This is how the header looks after a restic init without -o s3.region= :
Authorization: AWS4-HMAC-SHA256 Credential=AKIA204A86AD42312497/20230320/ /s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ec35cdcab04357e3c86ba480cf6235b841a82fe6dd150007957dd49347ded518\r\n
Notice the white space before /s3/aws4_request.
And this is how the header looks after a restic init with -o s3.region=emcreg:
Authorization: AWS4-HMAC-SHA256 Credential=AKIA204A86AD42312497/20230320/emcreg/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=4c035d6719d6f8405af4b58106ea5b2b3951cba0841913065150942642586fbe\r\n
Velero was installed with the region=emcreg set so it should pass it on to restic using -o s3.region=emcreg.
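The two headers above differ only in the credential scope, which in AWS Signature Version 4 has the form <date>/<region>/<service>/aws4_request. The sketch below simply assembles that string to show why a blank region yields the malformed scope in the first header. credential_scope is a hypothetical helper for illustration; it is not part of restic or Velero, and it does not compute a signature.

```shell
#!/bin/sh
# Hypothetical helper: assembles a SigV4 credential scope string,
# <date>/<region>/<service>/aws4_request. Illustration only.
# Usage: credential_scope <yyyymmdd> <region> [service]
credential_scope() {
  date="$1"
  region="$2"
  service="${3:-s3}"
  printf '%s/%s/%s/aws4_request\n' "$date" "$region" "$service"
}

credential_scope 20230320 emcreg   # region supplied, as with -o s3.region=emcreg
credential_scope 20230320 " "      # blank region, as in the failing header
credential_scope 20230320 ""       # empty region: the segment disappears entirely
```

Since the region is part of the signed scope, the server cannot validate the request when that segment is blank, which is consistent with the Access Denied errors reported by restic init.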
Environment:
Velero version (use velero version): Client: Version: v1.11.0 Git commit: - Server: Version: v1.11.0
Velero features (use velero client config get features): features:
Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:32:21Z", GoVersion:"go1.20.1", Compiler:"gc", Platform:"linux/amd64"} Kustomize Version: v4.5.7 Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-25T19:35:11Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes installer & version: kind v0.17.0 go1.19.2 linux/amd64
Cloud provider or hardware configuration: none, local cluster for testing, but this will eventually end up in a Tanzu cluster.
OS (e.g. from /etc/os-release): Fedora release 38 (Thirty Eight)
Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.