sile / libflate

A Rust implementation of DEFLATE algorithm and related formats (ZLIB, GZIP)
https://docs.rs/libflate
MIT License
180 stars 35 forks source link

Drop unsafe code from currently unsound function prefix() #30

Closed Shnatsel closed 5 years ago

Shnatsel commented 5 years ago

Fixes #29

This changes the generated assembly from

 movzx   ecx, byte, ptr, [rdi]
 movzx   eax, byte, ptr, [rdi, +, 1]
 movzx   edx, byte, ptr, [rdi, +, 2]
 shl     edx, 16
 shl     eax, 8
 or      eax, ecx
 or      eax, edx
 ret

to

 push    rax
 cmp     rsi, 2
 jbe     .LBB106_1
 movzx   ecx, byte, ptr, [rdi]
 movzx   eax, byte, ptr, [rdi, +, 1]
 movzx   edx, byte, ptr, [rdi, +, 2]
 shl     edx, 16
 shl     eax, 8
 or      eax, ecx
 or      eax, edx
 pop     rcx
 ret
.LBB106_1:
 mov     edi, 3
 call    qword, ptr, [rip, +, _ZN4core5slice20slice_index_len_fail17hbeb3bb6238e26ddbE@GOTPCREL]
 ud2

which is as good as it's going to get without restructuring the code.

This change incurs at about 1% performance penalty. This was tested without inlining, which may elide bounds checks further. Performance penalty can probably be avoided altogether by using slice::windows() plus one of the ways to skip iterator elements in function flush() instead of indexing, but I'm not sure it's worth the trouble.

Shnatsel commented 5 years ago

FWIW I've also tried the following:

    let mut result = [0, 0, 0];
    result.copy_from_slice(&input_buf[..3]);
    result

which yielded the exact same assembly.

codecov-io commented 5 years ago

Codecov Report

Merging #30 into master will increase coverage by 0.01%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #30      +/-   ##
==========================================
+ Coverage   91.44%   91.46%   +0.01%     
==========================================
  Files          14       14              
  Lines        1297     1300       +3     
==========================================
+ Hits         1186     1189       +3     
  Misses        111      111
Impacted Files Coverage Δ
src/lz77/default.rs 89.1% <100%> (-0.22%) :arrow_down:
src/deflate/encode.rs 96.62% <0%> (+0.02%) :arrow_up:
src/deflate/symbol.rs 93.06% <0%> (+0.02%) :arrow_up:
src/huffman.rs 94% <0%> (+0.06%) :arrow_up:
src/non_blocking/deflate/decode.rs 86.85% <0%> (+0.07%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 14c7627...55c89a9. Read the comment docs.