rclone / rclone

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
https://rclone.org
MIT License

Ability to restore a snapshot of the filesystem at a specific timestamp #2126

Open cowwoc opened 6 years ago

cowwoc commented 6 years ago

What is the problem you are having with rclone?

When using versioned remotes, like B2, I would like to be able to restore the contents of my system as of a given date. Although the information is technically available, it is extremely difficult from a usability perspective to figure out which files need to be downloaded and to rename them so they match the original filenames (e.g. B2 appends the store timestamp to the filename, and we'd need to reverse this process).

What is your rclone version (eg output from rclone -V)

rclone v1.39

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Windows 10, 64bit

Which cloud storage system are you using? (eg Google Drive)

B2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy remote:blah blah

Recapping what I wrote on https://forum.rclone.org/t/how-to-restore-view-of-system-at-a-specific-timestamp/5004/5:

https://rclone.org/b2/ explains how to list file versions and restore a specific version of a file, one at a time.

I am trying to protect myself from the possibility that the latest version of my filesystem has been corrupted by a virus. Is there a mechanism that will allow me to restore the view of the filesystem at a particular date/time?

Specifically, I expect to be able to COPY files from the remote to the local filesystem, providing a single date/time. I expect B2 to find the latest version of each file on or before this date/time and COPY it to the local filesystem without the date/time suffix.

Meaning:

  • I would invoke: rclone copy --latest-before=2018-03-08-205431-000 remote: local:
  • rclone would find files test.txt-v2018-03-08-205524-000, test2.txt-v2017-03-08-205524-000 as the latest file before the date/time
  • rclone would restore target filenames “test.txt” and “test2.txt” (suffix removed)
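
The selection described in the list above can be sketched in Python. This is only an illustration of the requested behaviour, not rclone code: it assumes version names look exactly like the ones in the example (name, then "-v", then a YYYY-MM-DD-HHMMSS-mmm stamp), and parse_stamp and latest_before are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumed naming, copied from the example above: "<name>-vYYYY-MM-DD-HHMMSS-mmm".
VERSION_RE = re.compile(r"^(?P<name>.+)-v(?P<stamp>\d{4}-\d{2}-\d{2}-\d{6}-\d{3})$")
STAMP_FORMAT = "%Y-%m-%d-%H%M%S-%f"


def parse_stamp(stamp):
    """Parse a stamp such as '2018-03-08-205431-000' into a datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT)


def latest_before(versioned_names, cutoff):
    """Map each original name to its newest version stamped at or before cutoff."""
    chosen = {}
    for versioned in versioned_names:
        match = VERSION_RE.match(versioned)
        if match is None:
            continue  # not a versioned name, ignore it
        stamp = parse_stamp(match["stamp"])
        if stamp > cutoff:
            continue  # uploaded after the requested point in time
        name = match["name"]
        if name not in chosen or stamp > chosen[name][0]:
            chosen[name] = (stamp, versioned)
    # Keys are the restore targets (suffix removed), values are the sources.
    return {name: versioned for name, (_, versioned) in chosen.items()}


if __name__ == "__main__":
    remote_names = [
        "test.txt-v2018-03-01-120000-000",
        "test.txt-v2018-03-08-205524-000",
        "test2.txt-v2017-03-08-205524-000",
    ]
    cutoff = parse_stamp("2018-03-08-205431-000")
    for target, source in sorted(latest_before(remote_names, cutoff).items()):
        print(f"{source} -> {target}")
```

A version stamped even a minute after the cutoff is skipped in favour of the next older one, so the comparison has to be done on parsed timestamps rather than on dates alone.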

ncw commented 6 years ago

This would probably be a b2-specific flag, say --b2-latest-before datespec. It wouldn't be too tricky to implement. If anyone would like to have a go, I'd happily spec out what needs to be done.

wolfv6 commented 6 years ago

This would be a sought-after feature. People write about it:

  • Issue 18 incremental strategy
  • https://forum.rclone.org/t/sync-to-target-folder-ransomware-protection/3658
  • https://forum.rclone.org/t/backblaze-b2-point-in-time-recovery/3030

wolfv6 commented 6 years ago

The following example shows a way of implementing the fabled "restore snapshot at timestamp". It's not like the usual incremental backup, more like a "Time Machine", "Back in Time", or "rsnapshot", but without the hard links. This arrangement makes it easier to browse a file's version history.

Daily backup command:

rclone sync $source remote:last_snapshot --backup-dir="old_files" --suffix="$day"

The following table shows the flow of file f being uploaded, modified, and moved. 'change' is a filesystem change immediately before the daily backup. 'f' is the file name; its prefix is mod_time and its suffix is move_time (--suffix="$day"):

    | scenario A                           | scenario B
    |                     last_    old_    |                     last_    old_
day | change       source snapshot files   | change       source snapshot files
----|--------------------------------------|-----------------------------------
 1  | add f        1f     1f               | add f        1f     1f
 2  | overwrite f  2f     2f       1f2     | delete f                     1f2
 3  |              2f     2f               |
 4  |              2f     2f               | add f        4f     4f 
 5  | overwrite f  5f     5f       1f2,2f5 | overwrite f  5f     5f       1f2,4f5

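For "restore to day 3", restore the files in last_snapshot with mod_time <= 3 and the files in old_files with mod_time <= 3 && move_time > 3. In scenario A, "restore to day 3" would restore file 2f5 (modified on day 2, moved to old_files on day 5); 1f2 is skipped because it had already been moved out on day 2. In scenario B, "restore to day 3" wouldn't restore any file.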

Restore at timestamp pseudo code:

files_list = rclone lsl remote:last_snapshot    # lsl also lists each file's mod_time
for each file in files_list
    if (mod_time <= timestamp)
        restore file

files_list = rclone lsl remote:old_files        # move_time is parsed from the --suffix
for each file in files_list
    if (mod_time <= timestamp && move_time > timestamp)
        restore file
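
The selection rule can also be checked against the table with a small Python sketch. It only models the filtering step, using plain day numbers instead of real timestamps; Entry and select_for_restore are hypothetical names, and listing the remote and copying the files are left out.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    mod_time: int                    # day the file content was last modified
    move_time: Optional[int] = None  # day it was moved to old_files, None if still current


def select_for_restore(last_snapshot, old_files, timestamp):
    """Return the entries that made up the snapshot at the given day."""
    # Current files that have not changed since the requested day.
    picked = [e for e in last_snapshot if e.mod_time <= timestamp]
    # Older versions that existed on that day and were only replaced later.
    picked += [
        e for e in old_files
        if e.mod_time <= timestamp and e.move_time > timestamp
    ]
    return picked


if __name__ == "__main__":
    # State of the remote after the day 5 backup, taken from the table.
    scenario_a = ([Entry("f", 5)], [Entry("f", 1, 2), Entry("f", 2, 5)])
    scenario_b = ([Entry("f", 5)], [Entry("f", 1, 2), Entry("f", 4, 5)])
    print(select_for_restore(*scenario_a, timestamp=3))
    print(select_for_restore(*scenario_b, timestamp=3))
```

For day 3 this selects the entry with mod_time 2 and move_time 5 in scenario A and nothing in scenario B, where f did not exist on that day.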

bulletmark commented 6 years ago

I wanted this feature for my B2 archive, so until it is implemented I created a simple utility which can be used with rclone. See https://github.com/bulletmark/b2restore.

austinginder commented 6 years ago

I'm not familiar enough with Golang to work on this directly; however, I likewise created a B2 Time Machine script, written in bash, which restores files to a particular time using Rclone. It uses Rclone's --b2-versions, --min-age, sync and copyto to pull it off. That said, it has a limitation: it cannot detect whether a previous version is actually needed. That means a restore may include file versions which were actually deleted.

It should be possible to weed those out using B2's b2_list_file_versions. I believe Rclone already uses it to filter out the deleted revisions, so maybe Rclone could use those deleted versions, the ones marked hide, to recreate an accurate file structure for a particular timestamp.
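
A rough Python sketch of that idea follows. It assumes the version records have already been fetched and look like the entries returned by b2_list_file_versions, with fileName, action ("upload" or "hide") and uploadTimestamp in milliseconds; files_at is a hypothetical helper, and this is not how rclone itself is implemented.

```python
def files_at(versions, cutoff_ms):
    """Return {fileName: record} for files that existed at cutoff_ms.

    versions is a list of dicts shaped like b2_list_file_versions entries.
    """
    latest = {}
    for record in versions:
        if record["action"] not in ("upload", "hide"):
            continue  # skip unfinished large files and folder placeholders
        if record["uploadTimestamp"] > cutoff_ms:
            continue  # happened after the requested point in time
        current = latest.get(record["fileName"])
        if current is None or record["uploadTimestamp"] > current["uploadTimestamp"]:
            latest[record["fileName"]] = record
    # If the newest record at the cutoff is a hide marker, the file was deleted.
    return {name: rec for name, rec in latest.items() if rec["action"] == "upload"}


if __name__ == "__main__":
    history = [
        {"fileName": "a.txt", "action": "upload", "uploadTimestamp": 1000},
        {"fileName": "a.txt", "action": "upload", "uploadTimestamp": 3000},
        {"fileName": "b.txt", "action": "upload", "uploadTimestamp": 1000},
        {"fileName": "b.txt", "action": "hide", "uploadTimestamp": 2000},
    ]
    print(sorted(files_at(history, 2500)))
```

Treating the hide marker as just another version is what lets a deleted file drop out of the restored tree instead of being resurrected from its last upload.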

austinginder commented 4 years ago

@ncw I still think this would be an amazing enhancement for Rclone. I could attempt to contribute something for this, however I'm not sure how far I will get. You mentioned speccing something out. Feel free to do so 😁.

I'm also wondering if an enhancement like this could be used with rclone mount. I'm thinking of something like rclone mount b2-account:b2-bucket/ ~/Tmp/timemachine --b2-latest-before <datespec>, which should allow you to mount and view the files as they existed at that time while using the correct b2 versions under the hood.

ivandeex commented 4 years ago

On a locally mounted FS you can use Linux LVM to make an instant snapshot of your data, then sync it (and remove the temporary snapshot):

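LV=snap-$(date +%s)
TD=$(mktemp -d)
# Create a snapshot of the "store" volume, reserving 1G for copy-on-write changes
lvcreate -s -n "$LV" -L1G /dev/volume/store
# Mount the snapshot by its device path
mount "/dev/volume/$LV" "$TD"
rsync -av "$TD/" /store/
umount "$TD"
# Remove the temporary snapshot
lvremove "/dev/volume/$LV"

swazrgb commented 2 years ago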

For b2, I implemented this in https://github.com/rclone/rclone/pull/6035

Simply run rclone with --b2-version-at=2022-03-13T00:00:00Z or set it as a backend option. You can then use sync, mount, etc. to get your files as they were at that time.