mbo77 closed this issue 2 years ago
This is likely because, even though the actual contents are the same, the extent layout of the two files is different. The workaround is to use block dedupe (as opposed to the extent dedupe option). This is explained in the FAQ in the man page: https://github.com/markfasheh/duperemove/blob/master/duperemove.8#L338
You can use block-based dedupe by passing the --lookup-extents=no option, or by running duperemove with --write-hashes-v2.
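To illustrate why identical contents can fail to match when whole extents are compared, here is a toy model in Python. It is only a sketch of the idea described above, not duperemove's actual implementation: the helper names, the 4096-byte block size, and the two extent layouts are all made up for the example.

```python
import hashlib


def digest(chunk: bytes) -> str:
    """Return a hex digest for one chunk of file data."""
    return hashlib.sha256(chunk).hexdigest()


def extent_hashes(data: bytes, extent_sizes: list[int]) -> set[str]:
    """Hash each extent as a whole; extent_sizes models one file's layout (hypothetical)."""
    hashes = set()
    offset = 0
    for size in extent_sizes:
        hashes.add(digest(data[offset:offset + size]))
        offset += size
    return hashes


def block_hashes(data: bytes, block_size: int) -> set[str]:
    """Hash fixed-size blocks, ignoring where extents begin and end."""
    return {digest(data[i:i + block_size]) for i in range(0, len(data), block_size)}


BLOCK = 4096  # assumed block size for this illustration

# The same content for both "files": 8 blocks, each filled with a distinct byte.
content = b"".join(bytes([i]) * BLOCK for i in range(8))

# Two hypothetical on-disk layouts of that same content.
layout_a = [3 * BLOCK, 5 * BLOCK]
layout_b = [2 * BLOCK, 2 * BLOCK, 4 * BLOCK]

# Extent-level comparison: no extent of A has the same bytes as an extent of B.
shared_extents = extent_hashes(content, layout_a) & extent_hashes(content, layout_b)

# Block-level comparison: every block matches, regardless of layout.
shared_blocks = block_hashes(content, BLOCK) & block_hashes(content, BLOCK)

print(len(shared_extents))  # 0
print(len(shared_blocks))   # 8
```

In this model the extent boundaries never line up, so no whole-extent hash matches, while every fixed-size block does. That is the intuition behind switching to block-based dedupe.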
Thank you for the quick reply, this sounds reasonable. I will give it a try and come back to you.
Update: This solved the issue and works pretty well, including with an external hash file. Thanks for the support.
I am collecting some experience with duperemove on btrfs and xfs. Right now I'm on btrfs. My test case is set up like this:
[user@server nextcloud]# ll data/user/files/folder/largefile.mp4
-rw-r--r-- 1 apache apache 1,696,283,207 Feb 6 2021 'data/user/files/folder/largefile.mp4'
[user@server nextcloud]# ll users/user/folder/largefile.mp4
-rw-rw-r-- 1 user users 1,696,283,207 Feb 6 2021 'users/user/folder/largefile.mp4'
If I run
duperemove -rh /daten/nextcloud
it won't dedupe the extents. But
fdupes -r /daten/nextcloud|duperemove --fdupes
does dedupe them. I will narrow down my test runs later and add the actual output here.