Open yamt opened 9 months ago
Hi @yamt, thanks for opening an issue.
I'm glad you proposed this, because I only learned pread/pwrite are in standard POSIX recently. It makes sense as an API and I think it would be good to add.
pread/pwrite also shouldn't add much code cost as most of the logic should be shared behind the scenes.
it's likely inefficient as littlefs's seek also moves cache position
The reason littlefs's seek moves the cache is because we usually need it for reading/writing, so I don't think pread/pwrite will escape this cost. Maybe when pread/pwrite bypasses the cache.
That being said, non-disk-format related API changes are low priority at the moment. So it may take some time for this to land.
it's likely inefficient as littlefs's seek also moves cache position
The reason littlefs's seek moves the cache is because we usually need it for reading/writing, so I don't think pread/pwrite will escape this cost. Maybe when pread/pwrite bypasses the cache.
i meant that, a straightforward emulation with lseek like the following would be inefficient as it moves the position back and forth. a native implementation can be better.
lfs_file_pread(off) {
orig_pos = lfs_file_seek(CUR);
lfs_file_seek(off, SET);
lfs_file_read();
lfs_file_seek(orig_pos, SET);
}
A native implementation could save a CTZ skip-list lookup, that's true. And there may be other opportunities for minor optimizations.
But caching is a bit of a harder problem, since lfs_file_pread
doesn't know what the next operation will be. It may be followed by a pread
immediately after if above layers emulate read
with multiple pread
calls, for example.
But caching is a bit of a harder problem, since
lfs_file_pread
doesn't know what the next operation will be. It may be followed by apread
immediately after if above layers emulateread
with multiplepread
calls, for example.
are you concerning cache-hit rate here?
Ah yeah. Cases like these:
// write blob
pwrite(f, 0, <data_blob>)
// scan for start of payload
lfs_off_t payload_start = 0;
while (true) {
pread(f, payload_start, buf, strlen("<header>"));
if (strcmp(buf, "<header>") == 0) {
break;
}
payload_start += 1;
}
You could argue this should be done with seek
+ read
, but I'm not sure that's always possible when other OS layers get involved. FUSE for example only talks in pread
/pwrite
(fuse_lowlevel_ops.read).
Though, as an "optimization problem", there may not be a single correct impl here.
Though I suppose you could also argue the FUSE impl should still call seek
+ read
if only for cache reasons.
Maybe the best option is unclear. Not moving the cache in pread
/pwrite
at least gives users more control, even though the performance differences may be initially confusing to users.
it would be nice to support pread/pwrite natively.
while an emulation with seek is possible, i guess it can have a few drawbacks: