PSeitz / lz4_flex

Fastest pure Rust implementation of LZ4 compression/decompression.
MIT License

increase unsafe compression speed by 3% #85

Closed PSeitz closed 1 year ago

PSeitz commented 1 year ago

Remove code that is only used in a rare edge case, which increases compression speed. When counting matching bytes, the last 7 bytes of the input are now ignored.
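The effect of ignoring the trailing bytes can be sketched as follows. This is a hypothetical illustration, not the lz4_flex implementation: the names `read_u64_le` and `count_same_bytes_chunked` are assumptions, and it uses safe slice indexing instead of raw pointer reads. It compares 8 bytes at a time and simply stops once fewer than 8 bytes remain, so no tail-handling code is needed.

```rust
/// Hypothetical helper: loads 8 bytes starting at `pos` as a little-endian u64.
fn read_u64_le(data: &[u8], pos: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&data[pos..pos + 8]);
    u64::from_le_bytes(buf)
}

/// Counts how many leading bytes of `a` and `b` are equal, 8 bytes at a time.
/// A trailing remainder of fewer than 8 bytes is deliberately left uncounted.
fn count_same_bytes_chunked(a: &[u8], b: &[u8]) -> usize {
    let len = a.len().min(b.len());
    let mut matched = 0;
    while matched + 8 <= len {
        let diff = read_u64_le(a, matched) ^ read_u64_le(b, matched);
        if diff != 0 {
            // Little-endian load: the lowest set bit lies in the first differing byte.
            return matched + (diff.trailing_zeros() / 8) as usize;
        }
        matched += 8;
    }
    // Up to 7 bytes may remain here; they are ignored.
    matched
}
```

For two identical 23-byte inputs this sketch returns 16: two full 8-byte chunks are compared and the final 7 bytes are left uncounted. The cost is at most a slightly shorter match at the very end of the input.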

Tool used: cargo asm + llvm-mca, which shows increased throughput:

cargo asm --release --no-default-features --features checked-decode --example compress_block compress_internal --mca-intel | grep Through

The same optimization can't be applied to the safe version (by removing the internal function marked cold). For some reason it produces worse assembly, although not obviously so. This shows up both in the tool output and in the benchmark.
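For readers unfamiliar with the pattern, a cold-marked internal function might look roughly like the sketch below. This is an assumption-laden illustration, not the code from this PR: `first_mismatch`, `count_tail`, and `count_same_bytes_with_tail` are hypothetical names. The `#[cold]` attribute hints to the compiler that the function is rarely called, and `#[inline(never)]` keeps its body out of the hot loop.

```rust
/// Hypothetical helper: number of leading positions where `a` and `b` agree.
fn first_mismatch(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Rarely executed fallback for the last few bytes of the input.
#[cold]
#[inline(never)]
fn count_tail(a: &[u8], b: &[u8]) -> usize {
    first_mismatch(a, b)
}

/// Counts equal leading bytes in 8-byte chunks, then hands any remainder
/// of fewer than 8 bytes to the cold tail function.
fn count_same_bytes_with_tail(a: &[u8], b: &[u8]) -> usize {
    let len = a.len().min(b.len());
    let mut matched = 0;
    while matched + 8 <= len {
        let (ca, cb) = (&a[matched..matched + 8], &b[matched..matched + 8]);
        if ca != cb {
            // Mismatch inside this chunk: locate it on the hot path.
            return matched + first_mismatch(ca, cb);
        }
        matched += 8;
    }
    if matched < len {
        // Edge case: a short tail remains.
        return matched + count_tail(&a[matched..len], &b[matched..len]);
    }
    matched
}
```

Whether keeping or dropping such a function helps depends on the generated assembly, which is why checking with a tool and a benchmark, as described above, matters more than reasoning about the source alone.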

codecov[bot] commented 1 year ago

Codecov Report

Merging #85 (c2d7613) into main (9e77c40) will decrease coverage by 0.14%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #85      +/-   ##
==========================================
- Coverage   90.81%   90.67%   -0.14%     
==========================================
  Files          11       11              
  Lines        2144     2112      -32     
==========================================
- Hits         1947     1915      -32     
  Misses        197      197              
Impacted Files            Coverage Δ
src/block/compress.rs     99.82% <100.00%> (-0.01%)
src/block/decompress.rs   95.46% <100.00%> (-0.03%)
