scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/

Reformat snapshot tag #3873

Open Michal-Leszczynski opened 1 month ago

Michal-Leszczynski commented 1 month ago

Right now, the SM snapshot tag is created like this:

// SnapshotTagAt creates new snapshot tag for specified time.
func SnapshotTagAt(t time.Time) string {
    return "sm_" + t.UTC().Format(tagDateFormat) + "UTC"
}

The snapshot tag is "random" only with respect to its creation time, and only with whole-second precision! In theory, this allows two different backups to have the same snapshot tag. There are several scenarios in which this could happen:

So the uniqueness of the snapshot tag is really weak in terms of any guarantees. The problem starts when two backups with the same snapshot tag end up in the same backup location. In terms of stored files, they are still differentiated by the cluster ID and task ID in their manifest paths. Unfortunately, since SM restore does not take any params like '--backup-cluster-id' or '--backup-task-id', it cannot decide which backup should be chosen, and right now SM would restore BOTH backups.

Adding the mentioned params to restore seems like a bad idea, since the snapshot tag should be enough to specify which backup should be restored. The best approach would be to simply change the snapshot tag format to some time-based UUID and stop worrying about such collisions.

Even though snapshot tag collisions might not be common in real-life scenarios (although they are still possible), they are really annoying in tests with a single backup location and many small backups running all the time. Also, this change could simplify the backup bucket format and improve the speed of finding the correct files.