Easy part
Add a parameter for X days of retention
Delete snapshots older than X days every 24 hours or so
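A minimal sketch of that pruning pass in Python with boto3, assuming snapshots live under a snapshots/ prefix and the object's LastModified timestamp is a good enough proxy for snapshot age (the bucket name, prefix, and retention_days parameter are illustrative, not the real layout):

```python
import boto3
from datetime import datetime, timedelta, timezone

def prune_snapshots(bucket: str, retention_days: int) -> None:
    """Delete snapshot objects older than the retention window."""
    s3 = boto3.client("s3")
    cutoff = datetime.now(timezone.utc) - timedelta(days=retention_days)
    expired = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix="snapshots/"):
        for obj in page.get("Contents", []):
            # LastModified is timezone-aware UTC, so it compares cleanly with cutoff
            if obj["LastModified"] < cutoff:
                expired.append({"Key": obj["Key"]})
    # delete_objects accepts at most 1000 keys per request
    for i in range(0, len(expired), 1000):
        s3.delete_objects(Bucket=bucket, Delete={"Objects": expired[i:i + 1000]})
```

Run from whatever daily scheduler the server already has.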
Tricky part - Garbage Collection + Checking
Implement a parameter to choose when to do GC
While GC is running, no backup writes must be possible; return 503 Service Unavailable (allowing concurrent writes may be doable in the future, but keep it simple for now)
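A minimal sketch of that write lock, assuming a Flask-style HTTP layer (the framework, route, and gc_running flag are illustrative, not the actual server code):

```python
import threading
from flask import Flask, abort

app = Flask(__name__)
gc_running = threading.Event()  # set by the GC job for the duration of a run

@app.route("/backup/upload", methods=["POST"])
def upload():
    # Hard-refuse writes during GC so a half-finished backup cannot
    # reference chunks the collector is about to delete
    if gc_running.is_set():
        abort(503, description="garbage collection in progress, retry later")
    # ... normal chunk/index upload path goes here ...
    return "OK"
```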
GC will do the following (see the sketch after this list): download all FIDX and DIDX files one by one and build a hashmap of every referenced chunk
List all S3 objects under the chunks/ prefix
Delete the unreferenced ones
Send an email or log a critical warning about referenced-but-missing chunks, and ideally move the affected backup into a folder like "corrupted", so that the next incremental backups do not assume those chunks are intact
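A minimal sketch of that GC pass, assuming index files live under an indexes/ prefix and chunks are stored as chunks/&lt;hex digest&gt;; parse_index() is a hypothetical placeholder for the real FIDX/DIDX parser:

```python
import boto3

def parse_index(raw: bytes) -> set[str]:
    """Placeholder for the real FIDX/DIDX parser; must return the hex
    digests of every chunk the index references."""
    raise NotImplementedError

def garbage_collect(bucket: str) -> None:
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    # Pass 1: download every index file and collect referenced chunk digests
    referenced: set[str] = set()
    for page in paginator.paginate(Bucket=bucket, Prefix="indexes/"):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            referenced.update(parse_index(body))

    # Pass 2: list everything under chunks/ and diff against the set
    present: set[str] = set()
    for page in paginator.paginate(Bucket=bucket, Prefix="chunks/"):
        for obj in page.get("Contents", []):
            present.add(obj["Key"].removeprefix("chunks/"))

    unreferenced = present - referenced
    missing = referenced - present  # referenced by an index but gone from S3

    # Delete unreferenced chunks in batches of 1000 (delete_objects limit)
    keys = [{"Key": f"chunks/{d}"} for d in unreferenced]
    for i in range(0, len(keys), 1000):
        s3.delete_objects(Bucket=bucket, Delete={"Objects": keys[i:i + 1000]})

    if missing:
        # Alerting and moving the affected backups to a "corrupted"
        # prefix would hook in here
        raise RuntimeError(f"{len(missing)} referenced chunks are missing from S3")
```

The GC job would set the gc_running flag from the sketch above before pass 1 and clear it after the deletes, so no new index can reference a chunk between the two passes.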
Checksumming chunks like PBS does is IMHO useless here, because all S3 services checksum internally. When uploads happen over SSL, corruption will in 99.99% of cases just drop the connection; without SSL, I still have to check, but I think S3 also verifies a checksum.
Truncated uploads cannot happen because of Content-Length.
If the data itself is damaged before the upload, well, you have a bigger problem at hand :)
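For what it's worth, S3 does verify an end-to-end checksum when you send one: if a PUT carries a Content-MD5 header, S3 recomputes the MD5 server-side and rejects the request on mismatch, so even a non-SSL upload cannot silently store corrupted or truncated bytes. A small sketch (bucket and key names are illustrative):

```python
import base64
import hashlib
import boto3

def put_chunk_verified(bucket: str, key: str, data: bytes) -> None:
    """Upload a chunk with a Content-MD5 header; S3 fails the PUT
    if the received bytes do not hash to the declared digest."""
    md5_b64 = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ContentMD5=md5_b64,
    )
```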
Any suggestions before I start working on this are welcome.