digint / btrbk

Tool for creating snapshots and remote backups of btrfs subvolumes
https://digint.ch/btrbk/
GNU General Public License v3.0

Help understanding snapshot vs backup #168

Open Steve8291 opened 7 years ago

Steve8291 commented 7 years ago

I'm pretty new to using btrfs and just installed btrbk. I was wondering if someone could help explain some of the dynamics to me. I have two btrfs arrays: data_array and backup_array. I have set up btrbk to make snapshots of the subvolume mnt/data_array/mydata. These are stored on the data_array. Is it correct that they take up very little space but simply reference the blocks of data on the data_array? So if data becomes corrupted on data_array it's similarly corrupted on those snapshots?

Then I also have btrbk sending backups to a separate set of drives in the backup_array, onto the subvolume mnt/backup_array/backups. The first one took 1.5 hours to transfer but each one after that has only taken seconds. I'm trying to understand how it's all tied together. Is the first transfer a complete copy of the original data files? And then are the subsequent snapshots referencing that first transfer that is on the backup subvolume? I'm guessing that is the case but wasn't sure. In this respect if I experience complete loss of my data_array then I can still restore off the backup_array? However, if that first large snapshot that was transferred over to the backup_array gets corrupted then all my other backups are gone as well? Are those the incremental backups? And do I need to somehow be setting up full backups?

Thanks for any help in explaining this to me.

digint commented 7 years ago

Seems you got it right.

Is it correct that they take up very little space but simply reference the blocks of data on the data_array? So if data becomes corrupted on data_array it's similarly corrupted on those snapshots?

Yes

Then I also have btrbk sending backups to a separate set of drives [...] Is the first transfer a complete copy of the original data files? And then are the subsequent snapshots referencing that first transfer that is on the backup subvolume? In this respect if I experience complete loss of my data_array then I can still restore off the backup_array?

Yes

However, if that first large snapshot that was transferred over to the backup_array gets corrupted then all my other backups are gone as well?

Yes, as the unchanged regions of incremental backups share the same data as the initial transfer.
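To make the sharing concrete, here is a toy Python model of a backup disk. It is only an illustration: real btrfs tracks extents and checksums on disk, and the class, method, and snapshot names below (`ToyBackupDisk`, `receive_full`, `receive_incremental`, `mydata.1`, `mydata.2`) are made up for this sketch and are not part of btrfs or btrbk.

```python
# Toy model only: real btrfs works on extents with checksums, not Python dicts.
class ToyBackupDisk:
    def __init__(self):
        self.blocks = {}      # block id -> bytes, each stored once on the disk
        self.snapshots = {}   # snapshot name -> {file path: block id}
        self._next_id = 0

    def _store(self, data):
        """Write data as a new block and return its id."""
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def receive_full(self, name, files):
        """Initial transfer: every file is written as a new block."""
        self.snapshots[name] = {path: self._store(data) for path, data in files.items()}

    def receive_incremental(self, name, parent, changed):
        """Later transfer: only changed files get new blocks, the rest reuse the parent's."""
        refs = dict(self.snapshots[parent])
        for path, data in changed.items():
            refs[path] = self._store(data)
        self.snapshots[name] = refs

    def read(self, name, path):
        """Return the file content as seen through the given snapshot."""
        return self.blocks[self.snapshots[name][path]]


disk = ToyBackupDisk()
disk.receive_full("mydata.1", {"a.txt": b"alpha", "b.txt": b"beta"})
disk.receive_incremental("mydata.2", "mydata.1", {"b.txt": b"beta v2"})

# Corrupt the block that the initial transfer wrote for a.txt.
disk.blocks[disk.snapshots["mydata.1"]["a.txt"]] = b"garbage"
```

In this model the unchanged file `a.txt` reads back corrupted through both snapshots, while the changed `b.txt` in `mydata.2` is unaffected because it lives in its own block.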

And do I need to somehow be setting up full backups?

Well, if you fear that your data on the backup disk gets corrupted in specific regions, then probably yes. A new full backup will duplicate the data on the backup disk. On the other hand, if you are considering doing this, you might be better off backing up to multiple disks from the beginning (several btrbk targets), or backing up to some RAID array.

ghost commented 7 years ago

@Steve8291 I had similar questions about BTRFS backup strategy. I have a partial solution here, but something about snapshots is still not clear. As far as I understand, if a file is changed and I have snapshots of both the previous and the current versions, then the old version is kept somewhere hidden and is visible only if I restore the old snapshot. What happens if this old file is corrupted and the scrub can't fix it? Will it be checked, and can I find it with dmesg? If so, should I load the snapshot on both the data and the backup drives, replace the file from backup, and load the current snapshot after that? Be aware that it is much faster to fix only the corrupted files instead of doing a full backup/restore. I am not sure how these are or aren't solved by btrbk, but of course I am curious.

(Just an idea, it might be good if the readme file would contain a recommended restoring strategy instead of just claiming that it should be solved manually.)

digint commented 7 years ago

Well, there are many possible errors that can be caused by corruption. Some can be recovered automatically (by duplicate metadata, RAID, etc.), some cannot. I cannot give a solution which covers everything here. If an unrecoverable error occurs, I would suggest looking for something similar on the mailing list, then asking on the mailing list or IRC (#btrfs).

Just an idea, it might be good if the readme file would contain a recommended restoring strategy instead of just claiming that it should be solved manually.

I deliberately do not suggest any restoring strategy, as there are so many. Everybody has his own requirements, and the restore strategy to choose depends on the error type encountered, and probably also on the skills of the user.

The obvious simplest strategy would be: "buy a new disk, and send/receive from backup".
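As a rough sketch of what "send/receive from backup" could look like, the Python helper below only builds the shell pipeline as a string and does not run anything. `btrfs send` and `btrfs receive` are the standard btrfs-progs subcommands. The helper name `restore_pipeline`, the snapshot name `mydata.20180101`, and the mount point `/mnt/new_disk` are assumptions made up for this example, so substitute whatever your backup directory actually contains.

```python
import shlex


def restore_pipeline(backup_snapshot, restore_dir):
    """Build (but do not execute) a send/receive pipeline as a shell string.

    backup_snapshot: path of a read-only snapshot on the backup disk
    restore_dir:     directory on the new btrfs filesystem to receive into
    """
    send = ["btrfs", "send", backup_snapshot]
    receive = ["btrfs", "receive", restore_dir]
    # shlex.join quotes arguments so paths with spaces stay intact.
    return f"{shlex.join(send)} | {shlex.join(receive)}"


# Hypothetical paths, for illustration only.
print(restore_pipeline("/mnt/backup_array/backups/mydata.20180101", "/mnt/new_disk"))
```

The received subvolume is read-only, so a typical follow-up is to create a writable snapshot of it with `btrfs subvolume snapshot` before using it as the new `mydata`.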