Just wanted to pass on that I converted a filesystem to btrfs specifically to try out the deduplication feature. I knew there were a fair number of duplicates caused by my multiple checkouts of branches of a giant software project.
I had run bedup scan several times, and it reported that there were no changes to the filesystem, so it wasn't going to scan any more.
bedup show reported that at least 8MB would be freed.
However, bedup dedup managed to free ~24GB.
I'm aware that my kernel may have some issue that interferes with bedup.
Here is the data, in case it is useful:
tomcat7@meatcar:/data/opengrok$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.4 LTS"
tomcat7@meatcar:/data/opengrok$ uname -a
Linux meatcar 3.2.0-60-generic #91-Ubuntu SMP Wed Feb 19 03:54:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
esmith@meatcar:~/dev/bedup$ sudo python -m bedup show
Label: None UUID: f5cfdb58-19ad-4840-993a-2458eb032dac
Device: /dev/sde1
Volume 5
As of generation 1095, tracking 1240 inodes of size at least 8388608
Accessible at /data
df before
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sde1 292967724 188674128 70485728 73% /data
dedup output
06:11.0 Size group 170/170 (8390605) sampled 853 hashed 838 freed 23661522715
00.00 Committing tracking stateNo handlers could be found for logger "sqlalchemy.pool.SingletonThreadPool"
00.02 Committing tracking state
df after
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sde1 292967724 165535232 87075656 66% /data
The "inodes of size at least 8388608" line refers to the minimum size of tracked inodes; smaller files aren't tracked (see --size-cutoff). It is not an estimate of the space that will be freed: bedup doesn't compute an estimate, it just deduplicates on the fly.
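
As a rough sanity check on the numbers in the report, the arithmetic can be sketched in a few lines of Python. The figures are copied from the output above; the variable names are only illustrative, and the comparison assumes that the "freed" counter is in bytes and that df's 1K-blocks are 1024-byte units.

```python
# Figures copied from the report above; names are illustrative only.
SIZE_CUTOFF_BYTES = 8388608    # "tracking 1240 inodes of size at least 8388608"
FREED_BYTES = 23661522715      # "freed" counter on the last dedup line (assumed bytes)
USED_BEFORE_KIB = 188674128    # df "Used" column before dedup (1K-blocks)
USED_AFTER_KIB = 165535232     # df "Used" column after dedup (1K-blocks)

MIB = 1024 ** 2
GIB = 1024 ** 3

# The cutoff is a minimum file size, not a savings estimate.
cutoff_mib = SIZE_CUTOFF_BYTES / MIB

# Space that df says was released, converted from KiB to bytes.
df_delta_bytes = (USED_BEFORE_KIB - USED_AFTER_KIB) * 1024

# How far df's view is from what the dedup run reported.
gap_bytes = df_delta_bytes - FREED_BYTES

print(f"size cutoff:        {cutoff_mib:.0f} MiB")
print(f"dedup reported:     {FREED_BYTES / GIB:.2f} GiB freed")
print(f"df Used decreased:  {df_delta_bytes / GIB:.2f} GiB")
print(f"difference:         {gap_bytes / MIB:.1f} MiB")
```

The 8388608 figure is exactly 8 MiB, which matches the size cutoff rather than any amount of reclaimable space, and the drop in df's Used column agrees with the dedup run's freed counter to within a few tens of MiB.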