markbt closed this 4 years ago
I wasn't aware of this. I did something similar at https://github.com/quark-zju/streampager/commit/305b3ac7873a4fadf17c80a073f38d25a822c64a. It has some caching support.
How much does/would caching help? The kernel is already doing all that for you, so all you're saving is the cost of the syscalls themselves. There can't be that many of them since you'd be limited by the user's reading rate (and bulk operations like search can amortize the cost with large reads).
I'm not sure. Each line will trigger a `read` call. It could be 100+ lines. Search also seems to trigger one `read` per line due to the current API design.
I added cache support, so now this is probably ready to go. Any input welcome before I merge it.
The notify behaviour seems a bit unreliable, but when it works, it's quite nice.
Implement a new reader for on-disk files, that reads the data out of the file rather than mmapping it.
It's a bit simple right now, so it reloads each line from the file whenever it wants to display it. It would be better to go through some kind of LRU block cache for loading chunks of the file. The cache can be flushed whenever we detect a reload is necessary.
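A minimal sketch of what such an LRU block cache could look like, using only the standard library. The names (`BlockCache`, `block_at`, `flush`), the 4 KiB block size, and the 64-block capacity are all hypothetical, not taken from the actual PR:

```rust
use std::collections::{HashMap, VecDeque};
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

const BLOCK_SIZE: u64 = 4096; // hypothetical block size
const MAX_BLOCKS: usize = 64; // hypothetical capacity (256 KiB cached at most)

/// A minimal LRU cache of fixed-size file blocks.
struct BlockCache {
    blocks: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // front = least recently used
}

impl BlockCache {
    fn new() -> Self {
        BlockCache { blocks: HashMap::new(), order: VecDeque::new() }
    }

    /// Fetch the block containing `offset`, reading it from `file` on a miss.
    fn block_at(&mut self, file: &mut File, offset: u64) -> io::Result<&[u8]> {
        let index = offset / BLOCK_SIZE;
        if !self.blocks.contains_key(&index) {
            // Cache miss: read one block from the file (short at EOF).
            let mut buf = vec![0u8; BLOCK_SIZE as usize];
            file.seek(SeekFrom::Start(index * BLOCK_SIZE))?;
            let mut filled = 0;
            loop {
                let n = file.read(&mut buf[filled..])?;
                if n == 0 {
                    break;
                }
                filled += n;
                if filled == buf.len() {
                    break;
                }
            }
            buf.truncate(filled);
            // Evict the least recently used block if we're at capacity.
            if self.order.len() >= MAX_BLOCKS {
                if let Some(evicted) = self.order.pop_front() {
                    self.blocks.remove(&evicted);
                }
            }
            self.blocks.insert(index, buf);
        } else {
            // Cache hit: move this block to the most-recently-used position.
            self.order.retain(|&i| i != index);
        }
        self.order.push_back(index);
        Ok(&self.blocks[&index])
    }

    /// Drop all cached blocks, e.g. when a reload is detected.
    fn flush(&mut self) {
        self.blocks.clear();
        self.order.clear();
    }
}
```

Flushing the whole cache on reload (rather than invalidating individual blocks) keeps the invalidation logic trivial at the cost of re-reading warm blocks after a change.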
The heuristic for append-vs-reload is whether the last 4k of the file has changed or not. Also, any time we parse a line and find a newline in the middle of it, we trigger a full reload.
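The tail-comparison heuristic could be sketched like this, using only the standard library. The function names and the exact comparison strategy are illustrative assumptions, not the PR's actual code:

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

const TAIL_SIZE: u64 = 4096; // compare the last 4k of the file

/// Read the last TAIL_SIZE bytes of the file (or the whole file if smaller).
fn read_tail(file: &mut File) -> io::Result<Vec<u8>> {
    let len = file.seek(SeekFrom::End(0))?;
    let start = len.saturating_sub(TAIL_SIZE);
    file.seek(SeekFrom::Start(start))?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

/// Decide whether the file was merely appended to (the old tail is
/// unchanged) or must be fully reloaded (the old tail differs, meaning
/// earlier content was rewritten, or the file shrank).
fn needs_full_reload(file: &mut File, old_len: u64, old_tail: &[u8]) -> io::Result<bool> {
    let new_len = file.seek(SeekFrom::End(0))?;
    if new_len < old_len {
        return Ok(true); // truncated: always reload
    }
    // Re-read the bytes that used to be the tail and compare them.
    let start = old_len.saturating_sub(old_tail.len() as u64);
    file.seek(SeekFrom::Start(start))?;
    let mut buf = vec![0u8; old_tail.len()];
    file.read_exact(&mut buf)?;
    Ok(buf != old_tail)
}
```

Note the heuristic can miss in-place rewrites that happen entirely before the old tail; the newline-mid-line check described above acts as a backstop for that case.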
This should help with #8 and #9, as we now watch the file and load new contents if the file is appended to, or reload the file if the contents change or the file is truncated.
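The watch-and-reload decision can also be sketched with a simple length poll (the PR itself uses filesystem notifications per the comments above; this standard-library-only version is an illustrative fallback, and all names here are hypothetical):

```rust
use std::fs;
use std::io;

/// What changed since the last observation of the file.
enum FileChange {
    None,      // same length: nothing to do (tail check covers rewrites)
    Appended,  // file grew: load just the new bytes
    Truncated, // file shrank: reload from scratch
}

/// Compare the current file length against the last seen length.
/// A same-length file is not proof the content is unchanged; the
/// tail-comparison heuristic handles in-place rewrites separately.
fn poll_change(path: &str, last_len: u64) -> io::Result<(FileChange, u64)> {
    let len = fs::metadata(path)?.len();
    let change = if len > last_len {
        FileChange::Appended
    } else if len < last_len {
        FileChange::Truncated
    } else {
        FileChange::None
    };
    Ok((change, len))
}
```

On `Appended` a reader only needs to parse from the old length forward; on `Truncated` the whole file (and any block cache) must be reloaded.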
The mmap implementation is retained in case it will be useful, but for now is unused.