tbarbugli / cassandra_snapshotter

A tool to backup cassandra nodes using snapshots and incremental backups on S3

Incremental removes files for upload even if the upload fails #68

Open CunningBaldrick opened 8 years ago

CunningBaldrick commented 8 years ago

I've been reading the code before trying this out, and I noticed that when incremental_backups is True, put_from_manifest will delete files even if they failed to upload:

```python
...
for ret in pool.imap(upload_file, ((bucket, f, destination_path(s3_base_path, f), s3_ssenc, buffer_size) for f in files)):
    if not ret:
        exit_code = 1
        break
pool.terminate()
if incremental_backups:
    for f in files:
        os.remove(f)  # DELETING FILES HERE EVEN IF exit_code == 1
exit(exit_code)
```

This doesn't seem wise. The easy solution is to make deletion conditional on the value of exit_code. It would be neater to delete those files that uploaded correctly and leave the files that didn't alone, but maybe it's overkill.
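A minimal sketch of the second option (delete only what actually uploaded) might look like the following. It is not the project's code: `upload_and_prune` and its `upload` callback are hypothetical names standing in for the real `upload_file` call, and the multiprocessing pool is left out to keep the example self-contained.

```python
import os


def upload_and_prune(files, upload, incremental_backups=True):
    """Upload each file; in incremental mode, delete only the ones that uploaded.

    `upload` is a hypothetical stand-in for the real upload call and is
    assumed to return a truthy value on success, falsy on failure.
    Returns an exit code: 0 if every upload succeeded, 1 otherwise.
    """
    exit_code = 0
    for f in files:
        if upload(f):
            if incremental_backups:
                # Safe to drop the local hard link: a copy now exists remotely.
                os.remove(f)
        else:
            # Leave the file in place so a later run can retry it.
            exit_code = 1
    return exit_code
```

With the pool in place, the same pairing could be done with `zip(files, pool.imap(...))`, since `imap` yields results in input order. The simpler alternative is to wrap the existing deletion loop in `if exit_code == 0:`.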

tbarbugli commented 8 years ago

What would be the benefit of keeping those files on disk?

CunningBaldrick commented 8 years ago

Hi Tommaso,

On 03/12/15 14:32, Tommaso Barbugli wrote:

What would be the benefit of keeping those files on disk?

My understanding of how this is supposed to work is: (1) every time Cassandra makes a new sstable, it drops a hard link to it in a "backups" directory; (2) when you run cassandra_snapshotter in incremental mode, it uploads every file in the backups directories, then deletes those files; (3) the upload and deletion are both done by put_from_manifest. If that's not how it works (particularly point 3), then this bug report probably doesn't make any sense :)

Now consider the case where you have "backups/X" and are trying to upload X to AWS, but something goes wrong with the upload in put_from_manifest. Then there is no copy of X on AWS. There is also no longer an X hard link in "backups", because put_from_manifest deleted it. Cassandra isn't going to create a new hard link for it in "backups", since it only adds them when the sstable is first created (AFAIK). Therefore no future run of cassandra_snapshotter in incremental mode will ever upload it (because X isn't in "backups").

Weeks go by. You are happy that everything was backed up, as you are doing an incremental backup every night. Then your cluster blows up. You are so happy to have backups! But then you discover that some data is missing. It's the data from the sstable X, which was never uploaded. What a sad story! I'm hoping you are going to tell me that it can never happen and we can all sleep soundly at night :)
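The scenario can be reproduced in miniature on a local filesystem. The sketch below is only an illustration under stated assumptions: `simulate_failed_incremental_run` is a hypothetical helper, the "upload" is hard-coded to fail, and no Cassandra or S3 is involved. It shows what is left in "backups" for the next run, depending on whether files are deleted after a failed upload.

```python
import os
import tempfile


def simulate_failed_incremental_run(delete_on_failure):
    """Return the contents of backups/ after one run whose upload failed."""
    with tempfile.TemporaryDirectory() as root:
        backups = os.path.join(root, "backups")
        os.mkdir(backups)

        # The sstable itself, plus the hard link that is created only once,
        # when the sstable is first written.
        sstable = os.path.join(root, "X")
        with open(sstable, "w") as fh:
            fh.write("sstable data")
        link = os.path.join(backups, "X")
        os.link(sstable, link)

        uploaded = False  # assumption: the upload failed
        if uploaded or delete_on_failure:
            os.remove(link)

        # This is all a later incremental run would find to upload.
        return sorted(os.listdir(backups))
```

Calling it with `delete_on_failure=True` leaves "backups" empty, so X is never retried. With `delete_on_failure=False` the link survives and the next run can pick it up.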

Jaikant commented 8 years ago

@CunningBaldrick Did you create any kind of fix for this? If so, could you share it?