juliantaylor closed 8 years ago
ctxt->same is:
logical_offset = 85563016
length = 0
dest_count = 1
the length 0 seems to be the cause of the endless loop as it will cause status = 0 and bytes_deduped = 0
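To illustrate why a zero-length request would stall, here is a minimal sketch in C. It is not duperemove's actual code: `fake_extent_same`, `run_until_done` and the `max_calls` cap are hypothetical names made up for this example, and the stub simply assumes the call reports exactly as many bytes deduped as were requested.

```c
#include <stdint.h>

/* Hypothetical stand-in for the dedupe ioctl: pretends that every
 * requested byte was deduped, so a zero-length request reports 0. */
static uint64_t fake_extent_same(uint64_t length)
{
    return length;
}

/* Keeps issuing requests of request_len bytes until nothing remains.
 * Returns the number of calls made, or -1 if max_calls was reached
 * first. The cap exists only so that this sketch terminates. */
static long run_until_done(uint64_t remaining, uint64_t request_len,
                           long max_calls)
{
    long calls = 0;

    while (remaining > 0) {
        if (calls == max_calls)
            return -1; /* gave up: the loop is not making progress */

        uint64_t len = request_len < remaining ? request_len : remaining;
        uint64_t bytes_deduped = fake_extent_same(len);

        /* With len == 0 this subtracts nothing, so the loop condition
         * never changes. */
        remaining -= bytes_deduped;
        calls++;
    }
    return calls;
}
```

With a 4096-byte request the loop finishes after a few calls; with a request length of 0 it only stops because of the cap.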
Question - was this working for you previously? I wonder if it was a recent change I made to dedupe.c, 9430adc6b20667cff72bc2cf31d7621b7ace76bb
EDIT: just to keep note here, I tried it with a pair of 1 gigabyte files and things went ok, I'll try with some larger ones on a different kernel (this one has my fixes from the btrfs mailing list)
It could be that the endless loop is new, but I have seen duperemove leak kernel memory before, though not as much, and couldn't pinpoint it. Possibly this is the same issue. Perhaps also related to gh-42?
Found the cause of the length 0: set_aligned_same_length is wrong when the file size is larger than 4 GB. It sets the len to zero because the mask is a 32-bit integer (fs blocksize is 4096).
That explains why it goes into an endless loop, but I still have no clue about the leak. Can you try with a large file and the ctxt->len set to zero (so reproducing the endless loop)?
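A minimal sketch of the truncation, assuming the block size is held in a 32-bit unsigned variable and the length in a 64-bit one. The helper names `align_down_mask32` and `align_down_mask64` are hypothetical and this is not the actual set_aligned_same_length code:

```c
#include <stdint.h>

/* Mask built in 32 bits: for blocksize 4096, ~(blocksize - 1) is
 * 0xFFFFF000. When combined with the 64-bit len it is zero-extended,
 * so bits 32..63 of len are cleared along with the low 12 bits. */
static uint64_t align_down_mask32(uint64_t len, uint32_t blocksize)
{
    return len & ~(blocksize - 1);
}

/* Mask built in 64 bits: only the low bits below the block size are
 * cleared, which is the intended "round down to a block" behaviour. */
static uint64_t align_down_mask64(uint64_t len, uint32_t blocksize)
{
    return len & ~((uint64_t)blocksize - 1);
}
```

Under these assumptions, a length of exactly 8 GiB (`8ULL << 30`) comes out as 0 from the 32-bit variant and unchanged from the 64-bit variant. A length that is not a multiple of 4 GiB keeps only its low 32 bits, so the request shrinks rather than becoming zero.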
is there a kfree of same missing in btrfs_ioctl_file_extent_same?
Huh, I believe so - that's a nice catch. I'll check it out and if it is a leak indeed the fix is pretty easy.
I have a potential fix for the endless loop scenario in issue#83 branch, would you mind giving it a go?
the fix works and it now finishes but of course still leaks memory.
Thanks, yeah, the memory leak is going to require a kernel patch. I'll update with details as I get them.
a patch has been posted on the list: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg44488.html
Thanks for the pointer, I went ahead and sent him a review of the patch.
I'd recommend merging that simple duperemove change soon, as it can kill a machine if you encounter files whose size is a multiple of 4 GB.
It's been merged into master branch, do you not see the fix working for you? (this isn't a fix for the kernel memory leak of course)
0a9771f59daba95bae6ead2cafccdf0205279c88
Closing as this all should be fixed upstream and in duperemove.git now
using duperemove --fdupes on a 3.19 kernel (ubuntu 15.04 kernel) seems to go into an endless loop in duperemove issuing lots of IOC_FILE_EXTENT_SAME ioctls. These also seem to leak memory so after a short while the machine crashes.
The issue seems reproducible when it reaches two 8.8 GB files to dedupe.
The function that seems to be stuck in a loop is dedupe_extents.
Before the btrfs_extent_same call the values of ctxt are logical_offset = 85563016, length = 0, dest_count = 1; after the call the fields are unchanged.
This ctxt then seems to be inserted into the list again in process_dedupes, and this repeats until the machine is out of RAM. But it is not duperemove that uses the RAM; it seems to be lost inside the kernel itself.