Closed by TzahiAshkenazi, 9 months ago
@sseago @shubham-pampattiwar @shawn-hurley FYI. Thank you, @TzahiAshkenazi!
I'm going to take this one on. From initial analysis, it looks like the root cause is the use of generateName in cases where a large number of resources are created with the same root string. The GenerateName func doesn't check uniqueness; it just populates the name field with the root plus a random 5-char suffix (note the 5-char suffixes like "dxx7r" in the error output below). If the resulting name already exists, the Create call returns an error, and the caller needs to check IsAlreadyExistsError and try again. The fix should be to replace our Create calls that use GenerateName with a CreateWithRetries func which retries a small number of times (probably 3 to 5) on the GenerateName clash before returning an error. I'll submit a PR for this.
I've seen this same error in the past when generating test data from a bash script, creating objects from YAML using generateName via the CLI.
I don't think we need to retry whenever a resource is created.
Could you clarify whether this is highly reproducible?
@reasonerjt The only time we need to retry is when we set generateName and don't set name. This is only done in a limited number of places in Velero, all of which should be handled in the associated PR. The underlying issue is that the code which processes generateName appends a random string to the name but does not check whether a resource already exists with the combined name. If there's a clash, it returns an "already exists" error. This is documented behavior, and retry is the recommended response to this error. Beyond a certain number of resources, clashes like this become quite likely. The client-go retry package makes this relatively straightforward, providing a util func that allows a certain number of automatic retries in response to a particular error type.
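A minimal sketch of that retry shape in plain Go (createWithRetries and errAlreadyExists are hypothetical stand-ins; in Velero the same pattern is expressed with client-go's retry.OnError using apierrors.IsAlreadyExists as the retriable-error check):

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the Kubernetes AlreadyExists API error
// (in real code: apierrors.IsAlreadyExists(err)).
var errAlreadyExists = errors.New("already exists")

// createWithRetries calls create up to maxAttempts times, retrying only
// when the generated name clashed with an existing resource. Any other
// error (or success) is returned immediately.
func createWithRetries(maxAttempts int, create func() error) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		err = create()
		if err == nil || !errors.Is(err, errAlreadyExists) {
			return err // success, or a non-retriable error
		}
	}
	return fmt.Errorf("after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated Create: the first two generated names clash, the third is unique.
	err := createWithRetries(5, func() error {
		calls++
		if calls < 3 {
			return errAlreadyExists
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```

Because a clash is independent per attempt, even a small retry budget drives the failure probability down multiplicatively.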
I've done one test of the String(n int) string function; below are the statistics from a run:

5 chars for 1k calls: repeated: 0, repeat rate: 0.00%
5 chars for 5k calls: repeated: 1, repeat rate: 0.02%
5 chars for 10k calls: repeated: 5, repeat rate: 0.05%
5 chars for 100k calls: repeated: 315, repeat rate: 0.32%
5 chars for 1000k calls: repeated: 34012, repeat rate: 3.40%

Increasing the length of the random alphanumeric string:

6 chars for 1k calls: repeated: 0, repeat rate: 0.00%
6 chars for 5k calls: repeated: 0, repeat rate: 0.00%
6 chars for 10k calls: repeated: 0, repeat rate: 0.00%
6 chars for 100k calls: repeated: 6, repeat rate: 0.01%
6 chars for 1000k calls: repeated: 1330, repeat rate: 0.13%
The effect is most obvious when increasing the length of the random alphanumeric string; retrying when a collision happens also works.

In our real-world scenario, the number of resources created with the same prefix within a namespace is well below 1 million (at 100k calls, the repeat rate in the run above is 0.32%). If we retry once on collision, the probability of two consecutive clashes is 0.32% * 0.32%, which is about 0.001%, meaning roughly one failure per 100,000 resources created in one namespace. This probability is extremely low.
Update: please note that this bug also occurs during kopia restore operations.
[root@f07-h27-000-r640 velero]# ./velero restore describe restore-kopia-busybox-perf-single-5000-pods-rbd -nvelero-1-12
Name: restore-kopia-busybox-perf-single-5000-pods-rbd
Namespace: velero-1-12
Labels: <none>
Annotations: <none>
Phase: PartiallyFailed (run 'velero restore logs restore-kopia-busybox-perf-single-5000-pods-rbd' for more information)
Total items to be restored: 40022
Items restored: 40022
Started: 2023-09-19 08:50:47 +0000 UTC
Completed: 2023-09-19 19:00:28 +0000 UTC
Errors:
Velero: podvolumerestores.velero.io "restore-kopia-busybox-perf-single-5000-pods-rbd-dxx7r" already exists, the server was not able to generate a unique name for the object
podvolumerestores.velero.io "restore-kopia-busybox-perf-single-5000-pods-rbd-5xmxg" already exists, the server was not able to generate a unique name for the object
Cluster: <none>
Namespaces: <none>
Backup: backup-kopia-busybox-perf-single-5000-pods-rbd
Namespaces:
Included: all namespaces found in the backup
Excluded: <none>
Resources:
Included: *
Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Restore PVs: auto
Existing Resource Policy: <none>
ItemOperationTimeout: 4h0m0s
Preserve Service NodePorts: auto
Given the earlier result ("5 char for 5k calls, repeated: 1, repeat rate: 0.02%"), a collision would basically occur for the randomly generated PVR names with 5k pods.
So in the above restore example, we're seeing that this is common enough to happen twice in a single restore.
@TzahiAshkenazi note that my PR should handle the restore case too. I've basically taken every instance where Velero relies on GenerateName and used the client-go/util/retry func in all cases to retry up to 5 times when generated name is repeated.
thanks for the update @sseago
@sseago when can we expect this fix upstream (in which Velero version)? Thanks.
@TzahiAshkenazi I just updated the fix PR to resolve a conflict with recent commits on main. At this point, the PR has one approval, so as soon as a second maintainer approves it, it will be merged (and thus on-target for Velero 1.13 inclusion)
@shubham-pampattiwar @reasonerjt Can we also consider this for 1.12.1? The bug is currently impacting our performance testing against 1.12. I can submit a cherry-pick PR against release-1.12 once the PR against main merges.
+1 for velero 1.12.1
What steps did you take and what happened: Running a kopia backup. Created a single namespace with 5000 pods, PVC size 32MB, storage class = cephfs. The backup CR failed and exited with "PartiallyFailed".
Steps & Commands:
Create a single namespace with 5000 Pods & PVs (each PV is 32MB without data, created using BusyBox); sc=cephfs.
Create the Backup CR (cr_name: backup-kopia-busybox-perf-single-5000-pods-cephfs):
./velero backup create backup-kopia-busybox-perf-single-5000-pods-cephfs --include-namespaces busybox-perf-single-ns-5000-pods --default-volumes-to-fs-backup=true --snapshot-volumes=false -n velero-1-12
The backup CR failed and exited with "PartiallyFailed" after 2:09:33 (HH:MM:SS).
Errors :
What did you expect to happen: backup CR will complete successfully
Anything else you would like to add: attached is a link to the likely root cause: https://github.com/kubernetes/apiserver/blame/master/pkg/storage/names/generate.go#L45
Environment:
OS (e.g. from /etc/os-release): NAME="Red Hat Enterprise Linux CoreOS" ID="rhcos" ID_LIKE="rhel fedora" VERSION="411.86.202306021408-0" VERSION_ID="4.11" PLATFORM_ID="platform:el8" PRETTY_NAME="Red Hat Enterprise Linux CoreOS 411.86.202306021408-0 (Ootpa)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos" HOME_URL="https://www.redhat.com/" DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.11/" BUG_REPORT_URL="https://bugzilla.redhat.com/" REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform" REDHAT_BUGZILLA_PRODUCT_VERSION="4.11" REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform" REDHAT_SUPPORT_PRODUCT_VERSION="4.11" OPENSHIFT_VERSION="4.11" RHEL_VERSION="8.6" OSTREE_VERSION="411.86.202306021408-0"
P.S. I couldn't run the "velero debug" command since I re-deployed Velero for new test cases; if the attached logs are missing any needed info, I can reproduce the issue again.
kopia_logs.tar.gz