I'm de-duplicating thousands of large files (3-15 GB each). If the size and the first and last block checksums all match, there's a very high probability the files are identical.
It only takes ~1 minute to get past the first/last checksum stages, but completing the full scan would take hours.
Could fclones provide an option to stop at that point and report results? And/or could it take a "random sample" approach, where files of matching size deterministically hash, say, five additional ranges of their contents to increase confidence, without approaching the cost of reading the entire file?