coreos / etcd-operator

etcd operator creates/configures/manages etcd clusters atop Kubernetes
https://coreos.com/blog/introducing-the-etcd-operator.html
Apache License 2.0
1.75k stars 741 forks source link

Fix max backup deletion #2116

Open mdkrajewski opened 5 years ago

mdkrajewski commented 5 years ago

etcd-backup-operator: fixed sorting of etcd backups to ensure deletion of only the oldest backups when rotating

Issue: When rotating backups based on a maximum number allowed, newer backups were being deleted when the number of digits in an etcd store revision increased. E.g., when "v99" became "v100", the newer "v100" backup was being deleted, because the "1" in "100" was sorted to come before the first "9" in "99".

Fix: This new sorting method relies on the timestamp in each backup path, rather than the revision number. The reason the timestamp is given precedence over the revision when sorting is because the revision could potentially go backwards when an older backup is restored. See my comment in pkg/backup/util/util.go's SortableBackupPaths type for more details.

Fixes #2113

alaypatel07 commented 5 years ago

Thanks, @mdkrajewski for the PR. I just glanced through the PR a quick question: since the backup file names are stored in the format of s3Path+"_v%d_%s", would exploding the file path string with _ and sorting based on the last string in the list (time) work? Would that be a more maintainable way of achieving the same thing?

Sorry if I missed anything, will dig into the details soon

mdkrajewski commented 5 years ago

...would exploding the file path string with _ and sorting based on the last string in the list (time) work? Would that be a more maintainable way of achieving the same thing?

@alaypatel07 Thanks for your comment! Your idea should work, and I think produce a little less code, and should be a little faster. However, one of my goals was to apply some semantics to the paths, so the utility function knows that it's comparing timestamps - just in case something happens to that filename after the fact, and to be somewhat resilient in case someone changes the format used by BackupManager. So in that respect, which approach is more maintainable might be a matter of perspective. I could go either way, though. If people such as yourself feel that my approach is too fancy, I could do what you suggest.