Open tmhannes opened 2 years ago
I like the idea, multi-thread operations are indeed suboptimal. Thanks for your proposal!
I have a few concerns about the implementation:
- Why `exfat_fptr`s need to be chained?
- The optionality of `exfat_fptr` argument makes the code prone to errors: if we pass it to read/write functions but forget to pass to truncate, the fptr will become invalid and can corrupt data.
- Why `exfat_ptr`s need to be chained
The `exfat_ptr`s are chained so that `shrink_file` and `grow_file` (both called from `exfat_truncate`) can find all of the `exfat_ptr`s that point to the file they are working with (by following the chain from the `fptr` member of `exfat_node`), so that they can adjust them when necessary.
My impression is that this is required when, for example, a process A has a file open and then a process B truncates it, so that process A is not left holding an invalid `exfat_ptr`.
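To make the chaining concrete, here is a minimal sketch of the idea rather than the code in the patch. The `sketch_*` names and fields are hypothetical stand-ins for `exfat_node` and `exfat_ptr`, and resetting a stale pointer to the start of the file is just one plausible adjustment that a shrink could make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types: not the real exfat structures. */
struct sketch_ptr {
    uint32_t index;           /* position of the cached cluster within the file */
    uint32_t cluster;         /* cached cluster number */
    struct sketch_ptr *next;  /* next pointer chained to the same node */
};

struct sketch_node {
    uint32_t start_cluster;   /* first cluster of the file */
    uint32_t cluster_count;   /* clusters currently allocated to the file */
    struct sketch_ptr *fptr;  /* head of the chain of per-handle pointers */
};

/* Called on open: link a new per-handle pointer into the node's chain. */
static void sketch_ptr_attach(struct sketch_node *node, struct sketch_ptr *p)
{
    p->index = 0;
    p->cluster = node->start_cluster;
    p->next = node->fptr;
    node->fptr = p;
}

/* Called on close: unlink the pointer so truncate no longer visits it. */
static void sketch_ptr_detach(struct sketch_node *node, struct sketch_ptr *p)
{
    struct sketch_ptr **link = &node->fptr;

    while (*link != NULL && *link != p)
        link = &(*link)->next;
    if (*link == p)
        *link = p->next;
    p->next = NULL;
}

/* Shrink the file and walk the chain, resetting every pointer that
   would otherwise refer to a cluster beyond the new end of the file. */
static void sketch_shrink(struct sketch_node *node, uint32_t new_count)
{
    assert(new_count <= node->cluster_count);
    for (struct sketch_ptr *p = node->fptr; p != NULL; p = p->next) {
        if (p->index >= new_count) {
            p->index = 0;
            p->cluster = node->start_cluster;
        }
    }
    node->cluster_count = new_count;
}
```

Because the shrink finds the pointers through the node, the caller of truncate does not need to hold or pass any of them.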
- The optionality of `exfat_fptr` argument makes the code prone to errors: if we pass it to read/write functions but forget to pass to truncate, the fptr will become invalid and can corrupt data.
It's not necessary to pass the `exfat_fptr` to truncate at all, because truncate will find all the `exfat_ptr`s that exist by itself (see above).
I agree that the optional `exfat_ptr` argument to `exfat_advance_cluster` is weird though. Perhaps it would be cleaner to require every caller to provide an `exfat_ptr` (or an `exfat_fh`, which contains both the `exfat_ptr` and the `exfat_node`)?
I didn't pursue that option because I wanted to keep the patch as small as possible. Requiring every caller of `exfat_advance_cluster` to provide an `exfat_ptr` would cause changes rippling out into a few more sections of code.
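For discussion, the mandatory-handle alternative could look roughly like the sketch below. It is not what the patch does: the `sketch_*` types are hypothetical stand-ins for `exfat_node`, `exfat_ptr` and `exfat_fh`, and the FAT is modelled as a plain array in which `fat[c]` is the cluster that follows `c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for exfat_node, exfat_ptr and exfat_fh. */
struct sketch_node {
    uint32_t start_cluster;   /* first cluster of the file */
};

struct sketch_ptr {
    uint32_t index;           /* position of the cached cluster within the file */
    uint32_t cluster;         /* cached cluster number */
};

struct sketch_fh {
    struct sketch_node *node; /* the file this handle refers to */
    struct sketch_ptr ptr;    /* this handle's private cached position */
};

/* Initialise a handle so that its pointer starts at the first cluster. */
static void sketch_open(struct sketch_fh *fh, struct sketch_node *node)
{
    fh->node = node;
    fh->ptr.index = 0;
    fh->ptr.cluster = node->start_cluster;
}

/* Return the cluster at position `count` in the file. The handle is a
   required argument, so there is no code path without a cached pointer. */
static uint32_t sketch_advance_cluster(const uint32_t *fat,
                                       struct sketch_fh *fh, uint32_t count)
{
    assert(fh != NULL && fh->node != NULL);

    /* A backward seek cannot follow the chain, so restart from the head. */
    if (count < fh->ptr.index) {
        fh->ptr.index = 0;
        fh->ptr.cluster = fh->node->start_cluster;
    }
    while (fh->ptr.index < count) {
        fh->ptr.cluster = fat[fh->ptr.cluster];
        fh->ptr.index++;
    }
    return fh->ptr.cluster;
}
```

The trade-off is the one described above: the signature becomes harder to misuse, but every existing caller has to be changed to construct or forward a handle.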
This PR addresses #181 by adding a separate "fptr" for each open file handle, so that sequential reads never have to restart from the beginning of the file.
In light testing on an i7 CPU with an external SSD, the PR allows concurrent readers at different positions in a large file to achieve the same total read throughput and CPU usage as a single reader sequentially processing the whole file.
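As a rough illustration of why a per-handle pointer helps, the toy model below counts FAT lookups for two readers that alternate reads at different positions in the same file. It is a sketch under simplifying assumptions, not a benchmark of the patch: the `sketch_*` names are hypothetical, the FAT is a plain array, and a backward seek is assumed to restart from the first cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached position: index within the file plus cluster number. */
struct sketch_pos {
    uint32_t index;
    uint32_t cluster;
};

/* Move `pos` to cluster index `target`, counting every FAT lookup. */
static void sketch_seek(const uint32_t *fat, uint32_t start,
                        struct sketch_pos *pos, uint32_t target,
                        unsigned *lookups)
{
    assert(pos != NULL && lookups != NULL);

    /* The chain is singly linked, so going backwards means starting over. */
    if (target < pos->index) {
        pos->index = 0;
        pos->cluster = start;
    }
    while (pos->index < target) {
        pos->cluster = fat[pos->cluster];
        pos->index++;
        (*lookups)++;
    }
}

/* Reader A reads clusters 0..n-1 while reader B reads offset..offset+n-1,
   one cluster each in turn. With per_handle == 0 both readers share a
   single cached position; otherwise each reader has its own. */
static unsigned sketch_interleaved(const uint32_t *fat, uint32_t start,
                                   uint32_t n, uint32_t offset, int per_handle)
{
    struct sketch_pos a = { 0, start };
    struct sketch_pos b = { 0, start };
    struct sketch_pos *pa = &a;
    struct sketch_pos *pb = per_handle ? &b : &a;
    unsigned lookups = 0;

    for (uint32_t i = 0; i < n; i++) {
        sketch_seek(fat, start, pa, i, &lookups);
        sketch_seek(fat, start, pb, offset + i, &lookups);
    }
    return lookups;
}
```

In this model, with a shared position every read by reader A forces a restart and every read by reader B re-walks the gap between the two readers, so the lookup count grows with the distance between them. With separate positions each reader only ever takes single steps after its initial seek.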
Any feedback would be gratefully received.