vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

Kopia backup failed on 5k pods in a single namespace, exits on "PodVolumeBackups already exists, the server was not able to generate a unique name for the object" #6807

Closed: TzahiAshkenazi closed this issue 9 months ago

TzahiAshkenazi commented 10 months ago

What steps did you take and what happened: Ran a Kopia backup of a single namespace (SN) with 5000 pods, each with a 32MB PVC (storage class = cephfs). The backup CR failed with status "PartiallyFailed".

Steps & Commands:

  1. Create a single namespace with 5000 pods and PVs (each PV is 32MB with no data, created using BusyBox; storage class = cephfs).

  2. Create the backup CR:
     `./velero backup create backup-kopia-busybox-perf-single-5000-pods-cephfs --include-namespaces busybox-perf-single-ns-5000-pods --default-volumes-to-fs-backup=true --snapshot-volumes=false -n velero-1-12`

  3. The backup CR failed and exited with "PartiallyFailed" after 2:09:33 (HH:MM:SS).

Errors:

time="2023-09-11T12:46:42Z" level=error msg="Error backing up item" backup=velero-1-12/backup-kopia-busybox-perf-single-5000-pods-cephfs error="podvolumebackups.velero.io \"backup-kopia-busybox-perf-single-5000-pods-cephfs-9nctb\" already exists, the server was not able to generate a unique name for the object" logSource="pkg/backup/backup.go:448" name=busybox-perf-single-ns-5000-pods-3126
time="2023-09-11T13:44:22Z" level=info msg="1 errors encountered backup up item" backup=velero-1-12/backup-kopia-busybox-perf-single-5000-pods-cephfs logSource="pkg/backup/backup.go:444" name=busybox-perf-single-ns-5000-pods-674
time="2023-09-11T13:44:22Z" level=error msg="Error backing up item" backup=velero-1-12/backup-kopia-busybox-perf-single-5000-pods-cephfs error="podvolumebackups.velero.io \"backup-kopia-busybox-perf-single-5000-pods-cephfs-nt9nb\" already exists, the server was not able to generate a unique name for the object" logSource="pkg/backup/backup.go:448" name=busybox-perf-single-ns-5000-pods-674
time="2023-09-11T13:52:31Z" level=info msg="1 errors encountered backup up item" backup=velero-1-12/backup-kopia-busybox-perf-single-5000-pods-cephfs logSource="pkg/backup/backup.go:444" name=busybox-perf-single-ns-5000-pods-962
time="2023-09-11T13:52:31Z" level=error msg="Error backing up item" backup=velero-1-12/backup-kopia-busybox-perf-single-5000-pods-cephfs error="podvolumebackups.velero.io \"backup-kopia-busybox-perf-single-5000-pods-cephfs-6rhqd\" already exists, the server was not able to generate a unique name for the object" logSource="pkg/backup/backup.go:448" name=busybox-perf-single-ns-5000-pods-962

What did you expect to happen: The backup CR completes successfully.

Anything else you would like to add: Attached is a link to the likely root cause: https://github.com/kubernetes/apiserver/blame/master/pkg/storage/names/generate.go#L45

Environment:

P.S. I couldn't run the `velero debug` command since I redeployed Velero for new test cases. If the attached logs are missing any needed info, I can reproduce the issue again.

kopia_logs.tar.gz


weshayutin commented 10 months ago

@sseago @shubham-pampattiwar @shawn-hurley FYI.. thank you @TzahiAshkenazi!

sseago commented 10 months ago

I'm going to take this one on. From initial analysis, the root cause looks like the use of generateName in cases where a large number of resources are created with the same root string. The GenerateName func doesn't check uniqueness; it just fills the name field with the root plus a random 5-character suffix. If the resulting name already exists, the API server returns an error, and the caller needs to check IsAlreadyExists and try again. The fix should be to replace our Create calls that use GenerateName with a CreateWithRetries func which will retry a small number of times (probably 3 to 5) in the GenerateName case before returning an error. I'll submit a PR for this.
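A minimal sketch of the intended shape of the fix (my illustration, not the actual PR code), using `retry.OnError` from client-go together with a controller-runtime client:

```go
package podvolume // hypothetical package name for illustration

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/client-go/util/retry"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// createWithRetries retries the Create call only when the server reports
// AlreadyExists, i.e. when the generateName suffix collided. The object
// must leave Name empty and set GenerateName, so the apiserver picks a
// fresh random suffix on each attempt.
func createWithRetries(ctx context.Context, c client.Client, obj client.Object) error {
	// retry.DefaultRetry performs up to 5 attempts with a short backoff.
	return retry.OnError(retry.DefaultRetry, apierrors.IsAlreadyExists, func() error {
		return c.Create(ctx, obj)
	})
}
```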

sseago commented 10 months ago

I've seen this same error in the past when generating test data from a bash script that created objects from YAML using generateName via the CLI.

kaovilai commented 10 months ago

Related to https://github.com/kubernetes/kubernetes/issues/115489

reasonerjt commented 9 months ago

I don't think we need to retry whenever a resource is created.

Could you clarify whether this is highly reproducible?

sseago commented 9 months ago

@reasonerjt The only time we need to retry is when we set generateName and don't set name. This is only done in a limited number of places in velero which should be handled in the associated PR. The underlying issue is that the code which processes generateName appends a random string to the name but does not check to see whether a resource already exists with the combined name. If there's a clash, then it returns an already exists error. This is documented behavior and retry is recommended as a response to this error. Beyond a certain number of resources, clashes like this become quite likely. The client-go retry package makes this relatively straightforward, providing a util func to allow a certain number of automatic retries in response to a particular error type.
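For reference, the generateName handling in the linked apiserver file amounts to something like this (a paraphrased sketch, not the verbatim upstream source; `utilrand` is `k8s.io/apimachinery/pkg/util/rand`):

```go
package names

import (
	utilrand "k8s.io/apimachinery/pkg/util/rand"
)

const (
	maxNameLength          = 63 // Kubernetes object name length limit
	randomLength           = 5  // length of the random suffix
	maxGeneratedNameLength = maxNameLength - randomLength
)

// generateName appends a random 5-char suffix to the base. Note that it
// never checks whether the combined name already exists; on a clash the
// apiserver returns AlreadyExists and the caller is expected to retry.
func generateName(base string) string {
	if len(base) > maxGeneratedNameLength {
		base = base[:maxGeneratedNameLength]
	}
	return base + utilrand.String(randomLength) // e.g. "backup-...-9nctb"
}
```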

qiuming-best commented 9 months ago

I've done a test of the `String(n int) string` function; below are the statistics from one run:

- 5 chars, 1k calls: 0 repeated, repeat rate 0.00%
- 5 chars, 5k calls: 1 repeated, repeat rate 0.02%
- 5 chars, 10k calls: 5 repeated, repeat rate 0.05%
- 5 chars, 100k calls: 315 repeated, repeat rate 0.32%
- 5 chars, 1000k calls: 34012 repeated, repeat rate 3.40%

Increasing the length of the random alphanumeric string to 6:

- 6 chars, 1k calls: 0 repeated, repeat rate 0.00%
- 6 chars, 5k calls: 0 repeated, repeat rate 0.00%
- 6 chars, 10k calls: 0 repeated, repeat rate 0.00%
- 6 chars, 100k calls: 6 repeated, repeat rate 0.01%
- 6 chars, 1000k calls: 1330 repeated, repeat rate 0.13%

Increasing the length of the random string has the most obvious effect, and retrying when a collision happens also works.

In our real-world scenarios, the number of resources created with the same prefix within a namespace is on the order of 100k at most (0.32% repeat rate in the run above), well short of 1 million. If a create collides and we retry once, the probability of colliding both times is about 0.32% * 0.32% = 0.001%, i.e. creating 100,000 resources in one namespace might produce one repeated name. This probability is extremely low.
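For anyone who wants to reproduce the numbers, here is a standalone sketch of this kind of test (my reconstruction, not the original script), using the same `rand.String` helper the apiserver uses for suffixes:

```go
package main

import (
	"fmt"

	utilrand "k8s.io/apimachinery/pkg/util/rand"
)

// countRepeats draws `calls` random suffixes of length n and counts how
// many collide with a suffix generated earlier in the same run.
func countRepeats(n, calls int) int {
	seen := make(map[string]struct{}, calls)
	repeats := 0
	for i := 0; i < calls; i++ {
		s := utilrand.String(n)
		if _, ok := seen[s]; ok {
			repeats++
		} else {
			seen[s] = struct{}{}
		}
	}
	return repeats
}

func main() {
	for _, n := range []int{5, 6} {
		for _, calls := range []int{1000, 5000, 10000, 100000, 1000000} {
			r := countRepeats(n, calls)
			fmt.Printf("%d char for %dk calls: repeated %d, repeat rate: %.2f%%\n",
				n, calls/1000, r, 100*float64(r)/float64(calls))
		}
	}
}
```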

TzahiAshkenazi commented 9 months ago

Update: please note that this bug also occurs on the Kopia restore operation.


[root@f07-h27-000-r640 velero]# ./velero  restore describe restore-kopia-busybox-perf-single-5000-pods-rbd  -nvelero-1-12
Name:         restore-kopia-busybox-perf-single-5000-pods-rbd
Namespace:    velero-1-12
Labels:       <none>
Annotations:  <none>

Phase:                       PartiallyFailed (run 'velero restore logs restore-kopia-busybox-perf-single-5000-pods-rbd' for more information)
Total items to be restored:  40022
Items restored:              40022

Started:    2023-09-19 08:50:47 +0000 UTC
Completed:  2023-09-19 19:00:28 +0000 UTC
Errors:
  Velero:   podvolumerestores.velero.io "restore-kopia-busybox-perf-single-5000-pods-rbd-dxx7r" already exists, the server was not able to generate a unique name for the object
            podvolumerestores.velero.io "restore-kopia-busybox-perf-single-5000-pods-rbd-5xmxg" already exists, the server was not able to generate a unique name for the object
  Cluster:    <none>
  Namespaces: <none>

Backup:  backup-kopia-busybox-perf-single-5000-pods-rbd

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto

Existing Resource Policy:   <none>
ItemOperationTimeout:       4h0m0s

Preserve Service NodePorts:  auto

qiuming-best commented 9 months ago

5 chars at 5k calls gave 1 repeat (0.02% repeat rate), so basically it would occur for the randomly generated PVR names with 5k pods.
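That matches the birthday-problem estimate. Assuming the 27-character alphabet that `rand.String` draws from (vowels and easily confused characters are excluded), there are $N = 27^5 \approx 1.43 \times 10^7$ possible 5-char suffixes, and the expected number of collisions among $k = 5000$ draws is roughly

$$
\mathbb{E}[\text{collisions}] \approx \binom{k}{2}\frac{1}{N} = \frac{5000 \cdot 4999}{2 \cdot 27^5} \approx 0.87,
$$

so about one repeat per 5k-pod backup is exactly what the model predicts.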

sseago commented 9 months ago

So in the above restore example, we're seeing that this is common enough to happen twice in a single restore.

@TzahiAshkenazi note that my PR should handle the restore case too. I've basically taken every instance where Velero relies on GenerateName and used the client-go/util/retry func to retry up to 5 times when the generated name repeats.

TzahiAshkenazi commented 9 months ago

thanks for the update @sseago

TzahiAshkenazi commented 9 months ago

@sseago when can we expect this fix upstream (in which Velero version)?
Thanks

sseago commented 9 months ago

@TzahiAshkenazi I just updated the fix PR to resolve a conflict with recent commits on main. At this point, the PR has one approval, so as soon as a second maintainer approves it, it will be merged (and thus on-target for Velero 1.13 inclusion)

@shubham-pampattiwar @reasonerjt Can we also consider this for 1.12.1? The bug is currently impacting our performance testing against 1.12. I can submit a cherry-pick PR against release-1.12 once the PR against main merges.

shubham-pampattiwar commented 9 months ago

+1 for velero 1.12.1