tbarbugli / cassandra_snapshotter

A tool to backup cassandra nodes using snapshots and incremental backups on S3

Make upload_file robust against failures of all kinds #75

Closed CunningBaldrick closed 8 years ago

CunningBaldrick commented 8 years ago

In particular, it now handles failures of complete_upload (I've seen these in the wild). Also, put_from_manifest no longer deletes backup sstables that failed to upload, because, unlike snapshots, you can't get them back after deleting them.
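To illustrate the idea, here is a minimal sketch of a multipart upload that treats a failed completion like any other failure and retries the whole upload. It is not the code from this PR: `upload_with_retries` and the `start`, `upload_parts`, `complete` and `abort` callables are hypothetical stand-ins for the S3 client calls, and the retry count and delay are arbitrary.

```python
import time


def upload_with_retries(start, upload_parts, complete, abort, attempts=3, delay=0.0):
    """Return True if the whole multipart upload, including completion, succeeded.

    start, upload_parts, complete and abort are hypothetical callables that
    stand in for the S3 client calls; they are not the tool's real API.
    """
    for attempt in range(1, attempts + 1):
        upload = None
        try:
            upload = start()
            upload_parts(upload)
            # Completion can fail too, so it sits inside the same try block.
            complete(upload)
            return True
        except Exception:
            if upload is not None:
                try:
                    # Best effort: avoid leaving a half-finished upload behind.
                    abort(upload)
                except Exception:
                    pass
            if attempt < attempts:
                time.sleep(delay)
    return False
```

The caller only sees a boolean, so it can decide per file whether deleting the local copy is safe.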

CunningBaldrick commented 8 years ago

Hi Tommaso, about continuing to upload if one file fails: if one file fails to upload it doesn't necessarily mean that all files will fail to upload. Since this is a backup tool I think it's best to get files into S3 if it's at all possible. That way, when your data centre burns down 5 minutes later you will have less work to do recovering from backups, because more stuff will be in S3! At worst, every file will fail to upload, but who cares?

As for deleting files that successfully uploaded even if other files failed to upload: I guess this might be problematic if the manifest failed to upload but other files successfully uploaded (and were thus deleted). What do you think? The advantage of deleting files even if other files failed to upload is that you reduce the number of files you need to upload later when you run the tool again. E.g., if you successfully uploaded 4999 files out of 5000, with one file failing, it's not nice that the one failed file means you have to re-upload the other 4999 files next time you run the tool. However, now that upload is more robust, this is less of a consideration.
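A rough sketch of the behaviour described above, assuming a hypothetical `upload_file` callable that returns True on success: every file is attempted even if an earlier one failed, and a local file is removed only once its own upload succeeded. The name `upload_all` and its parameters are illustrative, not the tool's actual interface.

```python
import os


def upload_all(paths, upload_file, delete_on_success=False):
    """Try every file; return (uploaded, failed) lists of paths.

    upload_file is a hypothetical callable that returns True on success.
    """
    uploaded, failed = [], []
    for path in paths:
        try:
            ok = upload_file(path)
        except Exception:
            ok = False
        if ok:
            uploaded.append(path)
            if delete_on_success:
                # Only files that reached S3 are removed from disk.
                os.remove(path)
        else:
            # Keep the file on disk so a later run can retry it.
            failed.append(path)
    return uploaded, failed
```

With this shape, a run that uploads 4999 of 5000 files leaves only the one failed file to retry next time. It does not address the case where the manifest itself fails to upload.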

tbarbugli commented 8 years ago

@CunningBaldrick makes sense :)