markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0
816 stars 81 forks

mtime and ctime (v 0.10) #103

Closed guni77 closed 8 years ago

guni77 commented 9 years ago

I noticed that when I ran duperemove on a volume with snapshots, the mtime of the files was modified by the deduplication process. This is also addressed in Issue 51, and as I understand it, it happens within the kernel, not within the duperemove tool. Still, if you rsync some files, run duperemove, and then rsync the files again, they get overwritten because of the changed mtime, so the files you just deduplicated become duplicated again. It's not critical data loss, but I think you would expect the mtime to stay the same and only the ctime to be modified. Besides that, thanks for the software, it saved me 800GB ;)
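To check whether a deduplication run touched mtime, one option is to record the timestamps before and after and compare them. The following is only a minimal sketch: the helper names `snapshot_times` and `changed_mtimes` are made up for illustration and are not part of duperemove.

```python
import os

def snapshot_times(paths):
    """Record (mtime_ns, ctime_ns) for each path with a single stat call."""
    times = {}
    for path in paths:
        st = os.stat(path)
        times[path] = (st.st_mtime_ns, st.st_ctime_ns)
    return times

def changed_mtimes(before, after):
    """Return the sorted paths whose mtime differs between two snapshots."""
    return sorted(
        path for path in before
        if path in after and before[path][0] != after[path][0]
    )
```

Call `snapshot_times` on the file list before running duperemove and again afterwards. Any path returned by `changed_mtimes` is one that rsync's default size-and-mtime check would treat as modified.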

kakra commented 9 years ago

You may want to use rsync with --no-whole-file and --inplace on COW file systems so that only modified blocks are rewritten. The downside is that if rsync is interrupted you are left with an inconsistent file (rsync usually constructs a new copy of the file and then moves it into place).

Thus I recommend working in a scratch area and taking a snapshot of it only when rsync has finished successfully, for consistency reasons.

This should make the mtime/ctime problem go away, though it doesn't catch moved/renamed files (I'm planning to catch that from the rsync logs and pipe it to duperemove in my own backup script).
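The rsync-then-snapshot workflow described above could be scripted roughly as follows. This is a sketch under assumptions, not a tested backup script: it assumes the scratch area is a btrfs subvolume (the comment only says "COW file systems"), the `-a` flag is added for illustration, and the function names and paths are hypothetical.

```python
import subprocess

def rsync_inplace_cmd(src, dst):
    """Build an rsync command that rewrites only modified blocks in place."""
    return ["rsync", "-a", "--inplace", "--no-whole-file", src, dst]

def snapshot_cmd(scratch, snap):
    """Build a read-only btrfs snapshot command (assumes btrfs)."""
    return ["btrfs", "subvolume", "snapshot", "-r", scratch, snap]

def backup(src, scratch, snap, run=subprocess.run):
    """Sync into the scratch area; snapshot only if rsync succeeded."""
    result = run(rsync_inplace_cmd(src, scratch))
    if result.returncode != 0:
        # An interrupted --inplace transfer may leave inconsistent files,
        # so no snapshot is taken in that case.
        return False
    return run(snapshot_cmd(scratch, snap)).returncode == 0
```

The `run` parameter exists only so that the command runner can be swapped out for a dry run. With the default, a call such as `backup("/data/", "/backup/scratch/", "/backup/snap-1")` (hypothetical paths) would execute both commands for real.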

fezie commented 9 years ago

Kernel 4.2 has this fixed. mtime and ctime no longer get changed by duperemove.

markfasheh commented 8 years ago

As fezie said, this should be fixed in newer kernels, so I'm going to close this for now. If you try 4.2 and still have the problem, please reopen. Also, thanks for the kind words :)