markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

duperemove-dev fails to find duplicates (0.11.1 works though) #224

Closed matthiaskrgr closed 4 years ago

matthiaskrgr commented 5 years ago

I was checking my entire disk and duperemove was not able to dedupe anything which seemed suspicious.

I tried out `cat bigfile > copy; duperemove .`, but duperemove did not dedupe anything:

Total files:  5
Total extent hashes: 0
Loading only duplicated hashes from hashfile.
Found 0 identical extents.
Simple read and compare of file data found 0 instances of extents that might benefit from deduplication.
Nothing to dedupe.

so I'm wondering if it is still working. :/

edit: I'm on linux 5.3.7 / btrfs-progs v5.3

Palando commented 4 years ago

It does not work for me anymore either. I'm using xfsprogs 5.0.0-1.4 and linux 5.3.7-1 on OpenSUSE. It found 0 identical extents even though I had copied a file beforehand.

dominikholler commented 4 years ago

While the current master branch does not work for me, v0.11.1 works on CentOS 8 for me.

Palando commented 4 years ago

Ah, yes. For me too. I just gave it a try. Thank you!

tux3 commented 4 years ago

I just ran into this issue on master. I've only glanced at the code, but I think this might be because the last commit (128acd99fc4ff1c6735083ffd69951ba9d7c997e) causes csum_extent to return 0 while csum_by_block/csum_by_extent still interpret 0 as EOF, so it just gives up on each file after trying the first extent.

Edit: Well, there's probably a deeper issue still. If I check out the second-to-last commit, it finds two duplicate extents on exact test copies. That's better than 0, but less than the 75 I expected.
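To make the suspected return-value mismatch concrete, here is a minimal sketch in C. It is not duperemove's code: `struct fake_file`, the `step_*` functions, and `count_hashed` are hypothetical names invented for illustration, and the model only assumes what the comment above describes, namely a per-extent step that starts returning 0 on success while its caller still reads 0 as end of file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a file: just a list of extent lengths in bytes. */
struct fake_file {
    const size_t *extent_len;
    size_t n_extents;
    size_t next;            /* index of the next extent to hash */
};

/* Old-style step: returns the bytes hashed (> 0), or 0 at end of file. */
int step_returns_bytes(struct fake_file *f)
{
    if (f->next >= f->n_extents)
        return 0;
    return (int)f->extent_len[f->next++];
}

/* Changed step: returns 0 on success, so the return value can no longer
 * tell the caller "more data" apart from "end of file". */
int step_returns_zero_on_success(struct fake_file *f)
{
    if (f->next < f->n_extents)
        f->next++;
    return 0;
}

/* Caller still written against the old contract: 0 means EOF, < 0 error. */
size_t count_hashed(struct fake_file *f, int (*step)(struct fake_file *))
{
    size_t hashed = 0;
    int ret;

    assert(f != NULL && step != NULL);
    f->next = 0;
    while ((ret = step(f)) != 0) {
        if (ret < 0)
            break;
        hashed++;
    }
    return hashed;
}

/* One possible way out (an assumption, not the actual fix): give success
 * and EOF distinct status codes so neither is overloaded onto 0. */
enum step_status { STEP_OK = 0, STEP_EOF = 1 };

int step_explicit(struct fake_file *f)
{
    if (f->next >= f->n_extents)
        return STEP_EOF;
    f->next++;
    return STEP_OK;
}

size_t count_hashed_explicit(struct fake_file *f)
{
    size_t hashed = 0;

    assert(f != NULL);
    f->next = 0;
    while (step_explicit(f) == STEP_OK)
        hashed++;
    return hashed;
}
```

In this toy model, a three-extent file run through `count_hashed` with `step_returns_bytes` counts all three extents, while the same loop with `step_returns_zero_on_success` stops after the first call and counts none. That matches the shape of the "Total extent hashes: 0" symptom, though it says nothing about the deeper issue mentioned in the edit.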

veganvelociraptor commented 4 years ago

I can confirm this issue still persists on the latest master, using kernel v5.4.2 and xfsprogs v5.2.1.

warthog9 commented 4 years ago

Quick bisect: c61a144a5f0c0a546a31340808bab122c58b2306 is likely where this stops working on dev (at least for me).

warthog9 commented 4 years ago

Also noting that this is where dev has --dedupe-options=, which is likely what you are seeing as a bug. This may just be a misunderstanding of the new options.

veganvelociraptor commented 4 years ago

If so, what would be the proper way to use the tool in order to duplicate (no pun intended) the previous behaviour?

jfikar commented 4 years ago

I think it should be --dedupe-options=partial,same but it does not work. See #216

zhangboyang commented 4 years ago

I have encountered the same problem.

Leicas commented 4 years ago

Same issue for me.

lorddoskias commented 4 years ago

Can you retest the latest code in master? I believe this should be fixed by 30e5c7a7fd502d55069977f2037b2c26a3aff684.

matthiaskrgr commented 4 years ago

Looks like it's working again :)

lorddoskias commented 4 years ago

In that case I will close this issue. If there are any more problems, keep reporting them!