Closed masta79 closed 1 year ago
A single tarsnap command deleting multiple archives can be much faster (and use less bandwidth) than separate commands, since it keeps some metadata cached. For optimal performance (aka to make the cache as efficient as possible), sort the list of archives so that archives which share a lot of their contents are deleted consecutively.
In the case of hourly backups, "sort the list of archives so that archives which share a lot of their contents are deleted consecutively" almost always means "sort them by date and time". And unless you have a very weird naming scheme, that means "sort them alphabetically".
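As a rough sketch of that ordering: the helper name and the archive names below are made up for illustration, and it assumes the names embed a zero-padded timestamp so that an alphabetical sort is also a chronological one. It only prints the resulting command; nothing is deleted.

```shell
# Hypothetical helper: reads archive names on stdin (one per line), sorts
# them, and prints a single "tarsnap -d" command with one -f per archive.
# It only prints the command; it does not run tarsnap.
# Assumption: names contain no whitespace (the printed command is unquoted).
build_delete_cmd() {
    # LC_ALL=C gives a plain byte-order sort, independent of the locale.
    LC_ALL=C sort | {
        cmd="tarsnap -d"
        while IFS= read -r name; do
            [ -n "$name" ] || continue   # skip blank lines
            cmd="$cmd -f $name"
        done
        printf '%s\n' "$cmd"
    }
}

# Made-up archive names, deliberately out of order.
printf '%s\n' hourly-2023-01-01_02 hourly-2023-01-01_00 hourly-2023-01-01_01 |
    build_delete_cmd
```

The printed command lists the archives from oldest to newest, so consecutive deletions touch archives that probably share most of their data.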
Our official page about deleting multiple archives faster is https://www.tarsnap.com/improve-speed.html#faster-delete but that doesn't contain anything that @cperciva didn't mention. (In fact, it doesn't include the tip about sorting them, so I'll add that.)
Thank you, I completely missed that part of the documentation, as in my mind "improve speed" was only associated with doing actual backups. I'll run the deletion again and will report back.
Sidenote: It might be beneficial to have a textfile input for the list of archives.
Hi @masta79,
> Sidenote: It might be beneficial to have a textfile input for the list of archives.
That's the --archive-names option, added in 1.0.38 (2017).
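A minimal sketch of how that could look in a cleanup script. The "hourly-" prefix and the archive names are assumptions for illustration, and the final command is only echoed as a dry run:

```shell
# Write the sorted archive names to a temporary file, one per line.
list=$(mktemp)

# Real use would be something like (the "hourly-" prefix is an assumption):
#   tarsnap --list-archives | grep '^hourly-' | LC_ALL=C sort > "$list"
# Here, made-up names stand in for the real archive list.
printf '%s\n' hourly-2023-01-01_02 hourly-2023-01-01_00 hourly-2023-01-01_01 |
    LC_ALL=C sort > "$list"

# Dry run: print the command instead of executing it.
# Remove the "echo" only after checking the contents of "$list".
echo tarsnap -d --archive-names "$list"
```

Keeping the list in a file also makes it easy to review exactly which archives will be deleted before running the real command.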
Sorry for wasting your time, and sorry for not reading the documentation properly. Deletion now went through a lot faster; I'll update my scripts to use these options.
/me sees himself out
Hi @masta79, no problem! Tarsnap has a lot of nice options, but I'm still working on "discoverability", in terms of trying to make sure that people know how to find the info they need, without being overwhelmed by info that's not relevant to them at the moment.
Interactions such as this help to guide me towards writing better docs. :)
I'm cleaning up a large batch of old hourly backups and have run into a problem: deleting each archive takes 10-15 minutes, and while the operation reduced the required storage by only 1.5GB, the total server -> client bandwidth for it was about 15GB. Is this expected?
Is there a more optimized way to delete backups in bulk? I could not see a difference between calling tarsnap -d once per archive and listing them all at once with multiple -f arguments.