markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

2nd run of dedupe takes too long #266

Closed by syrop 1 year ago

syrop commented 3 years ago

I used the command:

time sudo duperemove -dhr /mnt/zbiornik --hashfile /home/wiktor/hashfile2

right after the previous use of the same command finished running. The second run has already been running for several hours. It looks like it is deduping the same extents again.

Sometimes it does say:

Skipping - extents are already deduped.

but I believe it should just skip all contents of the hard drive, as it has already been deduped, and finish running in a handful of minutes.

Why does running duperemove a second time take several hours?

lorddoskias commented 3 years ago

Which version of duperemove are you using? There was a "hang" bug which got fixed in https://github.com/markfasheh/duperemove/commit/fd2011333c1a55173669d9440655b045af0a96a6 and is part of 0.11.2

syrop commented 3 years ago

I am using 0.11.2. Nothing has changed. When I run the dedupe a second time, it tries to dedupe the extents that were already deduped.

Junker commented 2 years ago

I can confirm this. With version 0.11.3, the second run dedupes the same extents again.

Junker commented 2 years ago

I'm not sure, but I think this started happening after I converted metadata to dup mode with "btrfs balance start -mconvert=dup". Can this affect the result or not?

Junker commented 2 years ago

I found the problem: duperemove dedupes the same extents again only if a hashfile is used! If I start duperemove without a hashfile, I see only "Skipping - extents are already deduped".
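For anyone who wants to compare the two cases, here is a minimal sketch. It assumes the mount point and hashfile path from the original report. The `run` helper and the `DRY_RUN` variable are hypothetical conveniences, not part of duperemove. They only print the commands by default, so nothing is deduped until you opt in.

```shell
#!/bin/sh
# Sketch: compare a repeat duperemove run with and without a hashfile.
# TARGET and HASHFILE default to the paths from the original report;
# override them for your own system.
TARGET="${TARGET:-/mnt/zbiornik}"
HASHFILE="${HASHFILE:-/home/wiktor/hashfile2}"

# Hypothetical helper: print the command by default,
# execute it only when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Repeat run with the hashfile (reported above to dedupe the same extents again).
run sudo duperemove -dhr "$TARGET" --hashfile "$HASHFILE"

# Repeat run without a hashfile (reported above to print only "Skipping" messages).
run sudo duperemove -dhr "$TARGET"
```

Run it once as-is to check the printed commands, then set `DRY_RUN=0` to execute them and compare the output and timing of the two runs.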

MarkusNemesis commented 2 years ago

I have this issue too: when using a hashfile, a fully deduped file system will go on to 'dedupe' exactly the same number of extents again. I also find that whenever it finishes deduping, the process hangs idle instead of exiting. Running 0.11.2.

JackSlateur commented 1 year ago

Hello,

Improvements have been made regarding this issue. Could you check with the latest code?

syrop commented 1 year ago

Confirmed in version 0.12. When I run duperemove for the second time, it just says "Nothing to dedupe." and quits. Well done!