markfasheh / duperemove

Tools for deduping file systems
GNU General Public License v2.0
795 stars 80 forks source link

Handle overflow in csum_extent #245

Closed ericzinnikas closed 3 years ago

ericzinnikas commented 3 years ago

I came across the same issue as in #241 -- turns out total_bytes_read (unsigned int) was written into ret (int) and overflowed on large files. It would seem in the two places that call csum_extent, they only care if value is negative (error), zero (EOF), or positive (continue) and not "how positive" it is -- so I've just added a simple check to fix the return code and send back a 1 instead of total_bytes_read.

lorddoskias commented 3 years ago

How reproducible is this? When I was reading the initial report and went to check the code I had the same hypothesis, however csum_extent works on at most MAX_EXTENT bytes. MAX_EXTENT would be the maximum size of extent a filesystem supports, for btrfs for example, this value is 128m so it can never overflow. Admittedly I don't know what is the max size of the extents on xfs (the only other supported fs), so I wrote a 4.5g file via xfs_io -f -c "pwrite 0 4.5g" which is simply sequential accesses. And filefrag reports the following:

filefrag -v mnt-test/large-file 
Filesystem type is: 58465342
File size of mnt-test/large-file is 4404019200 (1075200 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..  392812:     393225..    786037: 392813:            
   1:   392813..  785625:    1179657..   1572469: 392813:     786038:
   2:   785626.. 1075199:         20..    289593: 289574:    1572470: last,eof
mnt-test/large-file: 3 extents found

The first 2 extents are 1608957952 in size which is around 1534 megabytes. An int should be able to hold at most 2,147,483,647 as the signed value. Based on my experiment above it seems that even xfs is not able to create an extent which is larger than 2 billion bytes. So if you can reproduce the issues can you show me the output of filefrag -v /path/to/file to confirm that this is indeed what is happening.

lorddoskias commented 3 years ago

Ok, I managed to reproduce it. XFS can create up to 2^21 block-sized extents so it's indeed possible to overflow ret. So the code is good.

ericzinnikas commented 3 years ago

Thanks for verifying & reproducing. Yeah, I was seeing it consistently on an XFS filesystem.