One attempt to fix zip-rs/zip2#165.

Upsides

Much faster when extracting zips with many separate entries, or with large, highly compressed individual entries:

The `*_pipelined_*` benchmarks use the new pipelined parallel extraction method. The `*_compressible_big` benchmarks demonstrate almost a 5x speedup, while the `*_random` benchmarks demonstrate a 1.4x speedup. Note that the `*_compressible_small` benchmark is slower in the pipelined case, but the input is so small that we actually lose very little.

Downsides

This brings in `rayon` and a few other dependencies, which we would probably want to put behind a feature flag. As @NobodyXu mentioned in https://github.com/zip-rs/zip/issues/403#issuecomment-1712451398, this also imposes a `Clone` requirement on the reader:

> or requires the reader to implement `Clone` by storing `File` to be stored inside an `Arc` and keep track of the current location of cursor in the reader

TODO

As mentioned above, this also loses performance on small inputs. I think a fully async approach with the `async-executor` crate might be much cleaner than trying to scale our rayon thread pools up and down according to the size of the input.
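
To make the `Clone` requirement from the Downsides section more concrete, here is a minimal sketch of the kind of reader the quoted comment describes: the `File` lives inside an `Arc`, and each clone tracks its own cursor. `SharedFileReader` is a hypothetical name used only for illustration and is not part of this PR or the `zip` crate. The sketch also assumes a Unix target, since it relies on `std::os::unix::fs::FileExt::read_at` for positional reads.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::os::unix::fs::FileExt; // Unix-only: provides `read_at`
use std::sync::Arc;

/// Hypothetical wrapper (not part of the `zip` crate): a cheaply clonable
/// reader over one shared file handle, where each clone has its own cursor.
#[derive(Clone)]
pub struct SharedFileReader {
    file: Arc<File>,
    pos: u64,
}

impl SharedFileReader {
    pub fn new(file: File) -> Self {
        Self { file: Arc::new(file), pos: 0 }
    }
}

impl Read for SharedFileReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Positional read: the OS-level file cursor is never moved,
        // so other clones of this reader are unaffected.
        let n = self.file.read_at(buf, self.pos)?;
        self.pos += n as u64;
        Ok(n)
    }
}

impl Seek for SharedFileReader {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        let (base, offset) = match from {
            SeekFrom::Start(n) => {
                self.pos = n;
                return Ok(n);
            }
            SeekFrom::Current(d) => (self.pos, d),
            SeekFrom::End(d) => (self.file.metadata()?.len(), d),
        };
        // Reject positions that would be negative or overflow u64.
        let new_pos = base.checked_add_signed(offset).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidInput, "seek out of range")
        })?;
        self.pos = new_pos;
        Ok(new_pos)
    }
}
```

With a wrapper like this, each worker could take its own `clone()` and seek to a different entry without coordinating with the others. The cost is that callers holding a plain `File` (or any non-`Clone` reader) have to wrap it first, which is the downside noted above.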