Open kgermanov opened 2 years ago
@szakacsits What do you think about this?
AFAICT this does not work for a block device. You should check whether you are mounting a regular file.
@jpandre Usually a block device should be handled by some ioctl codes. But anyway, we should handle the situation when it cannot be, thanks.
@jpandre Is it ok now?
@kgermanov While trusting 'stat' is usually fine, the binary search method is more reliable with regard to the actual underlying file characteristics and it doesn't require that many calls. In addition it ensures that the entire length of the file can be accessed (e.g. it's not corrupted in the file system). Can you explain in what way the binary search method is problematic for you?
@unsound For a 2TB disk it requires about log2(2*1024*1024*1024*1024) = 41 calls. But the main problem is that these calls are spread randomly over the whole disk (buffered IO does not help).
If the underlying fs is not fast (for example on slow NFS), this may be a problem.
A corrupted file system can break read calls for any chunk, so binary search does not provide guarantees.
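For reference, a rough sketch of the kind of binary search being discussed, run against a simulated device so the probes can be counted. This is not the ntfs-3g implementation: `probe_fn`, `size_by_binary_search`, `fake_dev` and `fake_probe` are hypothetical names, and the starting bound of 1024 is an assumption. Each probe stands in for one small read at a scattered offset, and the total grows logarithmically with the size.

```c
#include <stdint.h>

/* Hypothetical probe: returns nonzero if the byte at offset off is readable. */
typedef int (*probe_fn)(int64_t off, void *ctx);

/* Sketch only: find the size by doubling an upper bound, then bisecting. */
static int64_t size_by_binary_search(probe_fn probe, void *ctx)
{
    int64_t low = 0, high = 1024;

    if (!probe(0, ctx))
        return 0;                       /* nothing readable at all */

    /* Grow high until it is past the end (guarding against overflow). */
    while (high < INT64_MAX / 2 && probe(high, ctx)) {
        low = high;
        high <<= 1;
    }

    /* Invariant: low is readable, high is treated as not readable. */
    while (low < high - 1) {
        int64_t mid = low + (high - low) / 2;
        if (probe(mid, ctx))
            low = mid;
        else
            high = mid;
    }
    return low + 1;                     /* last readable offset + 1 */
}

/* Simulated device: offsets below size are readable; probes are counted. */
struct fake_dev {
    int64_t size;
    int probes;
};

static int fake_probe(int64_t off, void *ctx)
{
    struct fake_dev *d = ctx;
    d->probes++;
    return off < d->size;
}
```

On a real device every call to the probe would be a separate read at a different offset, which is why the cost depends on per-read latency rather than on throughput.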
Maybe it's just me but 41 I/O requests doesn't seem like much, even random ones, compared to what the driver would issue during normal filesystem operation. How much does your fix speed up mounting in this particular scenario?
The particular corruption I'm thinking about is when internal extents end before logical ones, and also some filter filesystems can choose to expose a device as a 0-byte (indeterminate) file, which if we apply this patch couldn't be opened.
@unsound My case was an extreme corner case: each read call takes 1s. In this situation the mount time for a 2TB file improves from 1 min to 5 sec. We can do the binary search if stat returns a size less than 512.
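A small sketch of that fallback rule, under the assumption that 512 is the smallest plausible sector size. The names `MIN_PLAUSIBLE_SIZE`, `size_fallback_fn` and `pick_device_size` are hypothetical; the slow path is passed in as a callback so that a 0-byte (indeterminate) file still gets probed instead of being rejected.

```c
#include <stdint.h>

#define MIN_PLAUSIBLE_SIZE 512  /* assumed threshold from the comment above */

/* Hypothetical slow path, e.g. a binary search over the device. */
typedef int64_t (*size_fallback_fn)(void *ctx);

/*
 * Trust the size reported by stat when it is plausible;
 * otherwise fall back to the slower probing method.
 */
static int64_t pick_device_size(int64_t stat_size,
                                size_fallback_fn slow_path, void *ctx)
{
    if (stat_size >= MIN_PLAUSIBLE_SIZE)
        return stat_size;       /* fast path: no extra reads */
    return slow_path(ctx);      /* stat gave 0 or a tiny value: probe instead */
}

/* Stand-in slow path for demonstration: counts calls, returns a fixed size. */
static int64_t demo_slow_path(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
    return 4096;
}
```

With this shape the common case costs a single stat call, and the random reads only happen when stat cannot be trusted.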
AFAICT ntfs_device_size_get() is not used while mounting (mounting relies on data stored in the boot sector). It is only used by a few ntfsprogs : mkntfs, ntfsclone, ntfsfix, ntfslabel and ntfsresize.
@jpandre Ah, I could have sworn I've seen it used in the library code as a sanity check but I may have confused it with another project. Then the impact is much smaller than I thought initially.
To avoid random reads, we can use an IO callback to get the size from struct stat. Issue: https://github.com/tuxera/ntfs-3g/issues/46