One attempt to fix zip-rs/zip2#165.

Upsides

Lots and lots faster when extracting zips with many separate entries, or with large highly compressed individual entries:

> cargo bench -- extract
# ...
running 6 tests
test extract_pipelined_compressible_big   ... bench:  81,470,730 ns/iter (+/- 6,342,681) = 449 MB/s
test extract_pipelined_compressible_small ... bench:     716,842 ns/iter (+/- 256,469) = 8 MB/s
test extract_pipelined_random             ... bench:  23,833,197 ns/iter (+/- 15,791,334) = 424 MB/s
test extract_sync_compressible_big        ... bench: 335,936,236 ns/iter (+/- 6,129,233) = 109 MB/s
test extract_sync_compressible_small      ... bench:     168,939 ns/iter (+/- 7,269) = 37 MB/s
test extract_sync_random                  ... bench:  33,435,824 ns/iter (+/- 1,222,875) = 302 MB/s

The *_pipelined_* benchmarks use the new pipelined parallel extraction method, and the *_compressible_big benchmarks demonstrate almost a 5x speedup, while the *_random benchmarks demonstrate a 1.4x speedup. Note that the *_compressible_small benchmark is slower in the pipelined case, but this is such a small input that we actually lose very little.

Downsides

This brings in rayon and a few other dependencies which we would probably want to assign to a flag. As @NobodyXu mentioned in https://github.com/zip-rs/zip/issues/403#issuecomment-1712451398, this also imposes a Clone requirement on the reader:

or requires the reader to implement Clone by storing File to be stored inside an Arc and keep track of the curent location of cursor in the reader

TODO

As mentioned above, this also loses performance against small inputs. I think a fully async approach with the async-executor crate might be a much cleaner approach than trying to scale our rayon threadpools up and down according to the size of the input.

zip-rs / zip-old

Pipelined parallel extract #407

Upsides

Downsides

TODO