sharkdp / hexyl

A command-line hex viewer
Apache License 2.0

experiment: Sparse file support #185

Open sharifhsn opened 1 year ago

sharifhsn commented 1 year ago

I decided to take a stab at #89 just to see how feasible it is. In order to facilitate this, I have temporarily changed the API of print_all to take Input with a new generic Box<dyn Read> variant as a fallback.
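A minimal sketch of what an input type with such a fallback variant could look like. The variant names and the `as_file` helper are assumptions made for illustration, not the code from this PR:

```rust
use std::fs::File;
use std::io::{self, Read};

/// Hypothetical input type; the variant names are assumptions.
enum Input {
    /// A real file, whose descriptor can later be probed for holes.
    File(File),
    /// Fallback for any other reader (pipes, in-memory buffers, ...).
    Generic(Box<dyn Read>),
}

impl Read for Input {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Forward the read to whichever reader the variant wraps.
        match self {
            Input::File(f) => f.read(buf),
            Input::Generic(r) => r.read(buf),
        }
    }
}

impl Input {
    /// Hypothetical helper: only the `File` variant can use sparse-file tricks.
    fn as_file(&self) -> Option<&File> {
        match self {
            Input::File(f) => Some(f),
            Input::Generic(_) => None,
        }
    }
}
```

With this shape, a printer can read from either variant through `Read` and only attempt hole skipping when `as_file` returns `Some`.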

So far, this implementation can only take advantage of sparseness when the only non-sparse content is at the end of the file and everything before it, including the start, is sparse. Any other layout will cause strange errors.

It also only works on Linux, because this was the simplest OS to implement sparse support for.

How to test this:

truncate -s 16G startzero
echo "Lorem ipsum" >> startzero
cargo run --release -- startzero

The reported position will be slightly wrong, but it will finish extremely quickly.

Tasks:

sharkdp commented 1 year ago

> I decided to take a stab at #89 just to see how feasible it is. In order to facilitate this, I have temporarily changed the API of print_all to take Input with a new generic Box<dyn Read> variant as a fallback.

Just a few comments for now:

sharifhsn commented 1 year ago

> If you are interested in the problem itself - great! But I just want to warn you that this might very well turn out to be too complex (or too risky in terms of bugs) for an actual merge.

Yeah, I'm pretty aware of this, at least given my progress so far. That's why I'm labeling this PR as an experiment; I just want to see how feasible it is. So far, it seems to work fine on Linux (with a major caveat that I'm trying to work around at the moment).

I hadn't looked into nix before, only rustix. It seems that nix supports SEEK_DATA while rustix doesn't, so I think I'll replace this use of libc with nix (and also maybe raise an issue for rustix).
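For reference, a rough sketch of the SEEK_DATA idea itself. To stay free of crates it declares `lseek` by hand instead of going through libc or nix, and it assumes 64-bit Linux (where `off_t` is 64 bits, `SEEK_DATA` is 3 and `ENXIO` is 6). The `next_data` helper is a hypothetical name, not something from this PR:

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;

// Linux values, hard-coded for this sketch (assumption: Linux only).
const SEEK_DATA: i32 = 3;
const ENXIO: i32 = 6;

extern "C" {
    // lseek(2) from the C library; off_t is assumed to be 64 bits wide.
    fn lseek(fd: i32, offset: i64, whence: i32) -> i64;
}

/// Hypothetical helper: offset of the first data byte at or after `from`,
/// or `None` if nothing but a hole remains before the end of the file.
/// Note that this moves the file's current offset as a side effect.
fn next_data(file: &File, from: u64) -> io::Result<Option<u64>> {
    let ret = unsafe { lseek(file.as_raw_fd(), from as i64, SEEK_DATA) };
    if ret >= 0 {
        return Ok(Some(ret as u64));
    }
    let err = io::Error::last_os_error();
    // ENXIO means `from` is at or past the last data in the file.
    if err.raw_os_error() == Some(ENXIO) {
        Ok(None)
    } else {
        Err(err)
    }
}
```

On a filesystem that does not track holes, the kernel treats the whole file as data, so `next_data(&file, 0)` would simply return `Some(0)` and nothing would be skipped.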