markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0

Repeated execution on xfs #280

Closed: MichaelDietzel closed 1 year ago

MichaelDietzel commented 2 years ago

I have a 128 GiB raw VM image on XFS with some holes (about 25 GiB), and I want to deduplicate the data inside it.

So I ran duperemove on it for the first time:

duperemove -hdrq -b4k --dedupe-options=same,partial --skip-zeroes win.raw

It took about 24 min and resulted in some deduplication. I cannot tell how much, sorry: the output filled my console, so I could not scroll back to the size that was reported. I should have written the output to a file. I got a lot of error messages even though the file was not opened by any other program:

Dedupe for file "/mnt/vmstorage/win.raw" had status (1) "data changed"

Shortly after that, without any changes to the file, I ran duperemove a second time with identical parameters. This time it took about 60 min and resulted in additional savings of nearly 3 GiB. Again I got a lot of the same error messages.
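For anyone reproducing this, here is a minimal sketch of one way to keep the output of each run for later comparison, assuming a POSIX shell. The helper name run_logged and the log file names are hypothetical and not part of duperemove. Only the command line is taken from this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of duperemove): run a command, show its
# output on the console, and keep a copy of it in a log file.
run_logged() {
    log_file=$1
    shift
    # 2>&1 folds stderr into stdout so error lines land in the log too.
    # Note: the pipeline's exit status is that of tee, not of the command.
    "$@" 2>&1 | tee "$log_file"
}

# Example use with the command from this report (commented out because it
# needs duperemove and the image file):
# run_logged duperemove-run1.log duperemove -hdrq -b4k --dedupe-options=same,partial --skip-zeroes win.raw
# run_logged duperemove-run2.log duperemove -hdrq -b4k --dedupe-options=same,partial --skip-zeroes win.raw
#
# Count the "data changed" messages per run:
# grep -c 'data changed' duperemove-run1.log duperemove-run2.log
```

With both logs on disk, the reported sizes and the number of error lines from the two runs can be compared directly instead of relying on console scrollback.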

Now there are some things that I am wondering:

Some more info:

JackSlateur commented 1 year ago

Hello,

Second runs on unchanged data have been improved. Could you check with the latest code?