markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

Does duperemove really unshare extents? #189

Open darkbasic opened 7 years ago

darkbasic commented 7 years ago

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg68506.html
https://www.spinics.net/lists/linux-btrfs/msg69830.html

Someone on the mailing list told me that duperemove will unshare my snapshots. Is that true?

I want my snapshots to be deduplicated as well, but I cannot deduplicate them directly because they are read-only, so the data must already be deduplicated before the snapshots are taken. This means I have to run duperemove every night. Unfortunately, I noticed that each time it runs, free space decreases instead of increasing. Is this because of my autodefrag mount option, or because duperemove really does unshare my snapshots? I tried -A, but apart from taking much longer it did not improve the situation.

Thanks

goldwynr commented 7 years ago

It should not unshare snapshots. However, I found a case where it does not do what you would expect. In the script below, f1 and f2 are deduplicated first and f3 is written afterwards. You would expect duperemove to share f2's extents with f3, but it does the reverse. I am trying to find a way to map to the extents that are already shared, using the details reported by fiemap.

#!/bin/bash

TESTDIR=/testdir
DEV=/dev/sdb9

mkfs.btrfs -f $DEV
mount $DEV $TESTDIR

xfs_io -f -c "pwrite 0 1G" $TESTDIR/f1
xfs_io -f -c "pwrite 0 1G" $TESTDIR/f2

sync
xfs_io -c "fiemap -v" $TESTDIR/f1
xfs_io -c "fiemap -v" $TESTDIR/f2

/usr/sbin/duperemove -v -d -r --dedupe-options=noblock $TESTDIR/f1 $TESTDIR/f2
xfs_io -f -c "pwrite 0 1G" $TESTDIR/f3
sync

xfs_io -c "fiemap -v" $TESTDIR/f1
xfs_io -c "fiemap -v" $TESTDIR/f2
xfs_io -c "fiemap -v" $TESTDIR/f3
/usr/sbin/duperemove -v -d -r --dedupe-options=noblock --debug $TESTDIR/f2 $TESTDIR/f3

xfs_io -c "fiemap -v" $TESTDIR/f1
xfs_io -c "fiemap -v" $TESTDIR/f2
xfs_io -c "fiemap -v" $TESTDIR/f3
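The fiemap dumps from the script above can be hard to compare by eye. The following is only a rough sketch of one way to check which physical block ranges two files have in common. It assumes the `fiemap -v` rows have the extent index in the first column and BLOCK-RANGE in the third. The helper name `common_ranges` and the `/tmp/*.map` paths are made up for illustration and are not part of duperemove or xfs_io. It only detects extents whose block ranges match exactly, so partially overlapping extents would be missed.

```shell
#!/bin/bash
# Sketch only. Assumed row layout of `xfs_io -c "fiemap -v" FILE`:
#   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
#     0: [0..262143]:     26624..288767   262144 0x2000

# common_ranges (hypothetical helper): takes two files holding saved
# fiemap -v output and prints every BLOCK-RANGE that appears in both.
common_ranges() {
    awk '
        # Keep only extent rows: "<index>:" first, "<start>..<end>" third.
        # Header lines, file name lines and hole rows do not match.
        $1 ~ /^[0-9]+:$/ && $3 ~ /^[0-9]+\.\.[0-9]+$/ {
            if (FNR == NR)
                seen[$3] = 1        # first dump: remember the range
            else if ($3 in seen)
                print $3            # second dump: range also in the first
        }
    ' "$1" "$2"
}

# Possible usage after each duperemove run in the script above:
#   xfs_io -c "fiemap -v" $TESTDIR/f1 > /tmp/f1.map
#   xfs_io -c "fiemap -v" $TESTDIR/f2 > /tmp/f2.map
#   common_ranges /tmp/f1.map /tmp/f2.map
```

If f1 and f2 print common ranges after the first run but none after the second, that would be consistent with f2 having been re-pointed at f3's extents rather than the other way round.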

goldwynr commented 7 years ago

Could you try duperemove from my repo (branch reuse-shared)? https://github.com/goldwynr/duperemove/tree/reuse-shared

leijurv commented 6 years ago

Is it safe to run that branch? I have the same issue.