PSeitz / lz4_flex

Fastest pure Rust implementation of LZ4 compression/decompression.
MIT License

perf: use input_ptr instead of input_pos, improve checks #82

Closed PSeitz closed 1 year ago

PSeitz commented 1 year ago

Improve checked-decode performance by 3-15%

Use input_ptr instead of input_pos. This has no observable impact on performance, but it produces smaller assembly.
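To illustrate the difference between an index cursor and a pointer cursor, here is a minimal sketch. It is not the code from this PR: the function names `read_u16_le_by_pos` and `read_u16_le_by_ptr` are hypothetical, and the only assumption is that the decoder reads little-endian `u16` values from a byte buffer.

```rust
/// Index-based cursor: the position is an offset into the slice.
/// Returns `None` if fewer than two bytes remain.
fn read_u16_le_by_pos(input: &[u8], pos: &mut usize) -> Option<u16> {
    let end = pos.checked_add(2)?;
    let bytes = input.get(*pos..end)?;
    *pos = end;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}

/// Pointer-based cursor: the position is a raw pointer that is compared
/// against an end pointer. Returns `None` if fewer than two bytes remain.
///
/// # Safety
/// `*input_ptr` and `input_end` must come from the same live allocation,
/// with `*input_ptr <= input_end`.
unsafe fn read_u16_le_by_ptr(input_ptr: &mut *const u8, input_end: *const u8) -> Option<u16> {
    let p = *input_ptr;
    let remaining = (input_end as usize).saturating_sub(p as usize);
    if remaining < 2 {
        return None;
    }
    // SAFETY: at least two readable bytes lie between `p` and `input_end`.
    let value = unsafe { u16::from_le_bytes([*p, *p.add(1)]) };
    // SAFETY: `p + 2` is at most `input_end`, so it stays within the allocation
    // (or one past its end).
    *input_ptr = unsafe { p.add(2) };
    Some(value)
}
```

Both functions return the same values for the same input; the pointer version just carries the position as an address instead of a base plus offset, which is the kind of change that can shrink the generated assembly without changing measured speed.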

Merge bounds checks. There were two bounds checks in the hot loop. Because the values handled in the hot loop have a limited range, the two checks can be merged into one.
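As a rough illustration of how two bounds checks can collapse into one, the sketch below reads a run of literals followed by a 2-byte offset. It is an assumption-laden example, not the decoder from this PR: `read_two_checks`, `read_one_check`, and `OFFSET_SIZE` are hypothetical names. The point is that when both lengths are small and known up front, checking the combined end position once covers both reads.

```rust
/// Size in bytes of the little-endian offset that follows the literals
/// in this sketch.
const OFFSET_SIZE: usize = 2;

/// Two explicit checks: one before the literals, one before the offset.
fn read_two_checks(input: &[u8], pos: usize, literal_len: usize) -> Option<(&[u8], u16)> {
    let lit_end = pos.checked_add(literal_len)?;
    if lit_end > input.len() {
        return None; // check 1: the literals would run past the input
    }
    let off_end = lit_end.checked_add(OFFSET_SIZE)?;
    if off_end > input.len() {
        return None; // check 2: the offset would run past the input
    }
    let offset = u16::from_le_bytes([input[lit_end], input[lit_end + 1]]);
    Some((&input[pos..lit_end], offset))
}

/// One merged check: if literals plus offset fit, the literals alone fit too.
fn read_one_check(input: &[u8], pos: usize, literal_len: usize) -> Option<(&[u8], u16)> {
    let end = pos.checked_add(literal_len)?.checked_add(OFFSET_SIZE)?;
    if end > input.len() {
        return None; // single check covering both reads
    }
    let lit_end = end - OFFSET_SIZE;
    let offset = u16::from_le_bytes([input[lit_end], input[lit_end + 1]]);
    Some((&input[pos..lit_end], offset))
}
```

The two functions accept and reject exactly the same inputs, so the merged form removes a branch from the loop body without weakening the check.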

codecov[bot] commented 1 year ago

Codecov Report

Merging #82 (ec51b4d) into main (7b1d51d) will decrease coverage by 0.04%. The diff coverage is 95.34%.

@@            Coverage Diff             @@
##             main      #82      +/-   ##
==========================================
- Coverage   90.81%   90.78%   -0.04%     
==========================================
  Files          11       11              
  Lines        2133     2126       -7     
==========================================
- Hits         1937     1930       -7     
  Misses        196      196              
Impacted Files             Coverage Δ
src/block/decompress.rs    95.54% <95.34%> (-0.09%) ↓
