trifectatechfoundation / zlib-rs

A safer zlib
zlib License
139 stars 15 forks source link

Loop plus match #248

Closed folkertdev closed 1 week ago

folkertdev commented 1 week ago

switch from tail calls to loop+match

It turns out that even if we ask LLVM nicely, it will not actually make our collection of functions, all with the same signature, tail call one another.

After finding stack overflows with a fuzzer, we dove into the assembly, and found some cases where the stack grew

.LBB109_326:
    mov rdi, rbx
    call zlib_rs::inflate::State::type_do
    jmp .LBB109_311

.LBB109_311:
    lea rsp, [rbp - 40]
    pop rbx
    pop r12
    pop r13
    pop r14
    pop r15
    pop rbp
    .cfi_def_cfa rsp, 8
    ret

LLVM wants to centralize the cleanup before the return (many other blocks jump to LBB109_311), thereby invalidating a tail call to type_do.

We were not able to get rid of this call without introducing one elsewhere: we just don't currently have the power to tell LLVM what we want it to do.

So, we switch back to loop+match waiting for changes to rust to make a more efficient implementation possible.

Performance-wise, the damage is relatively minimal: we're just slower in cases where we already were slower than C. We are faster in cases where the relevant code is barely touched (in these cases the logic quickly moves into a hot inner loop and just spends most of its time there).

codecov[bot] commented 1 week ago

Codecov Report

Attention: Patch coverage is 95.93220% with 36 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
zlib-rs/src/inflate.rs 96.00% 28 Missing :warning:
test-libz-rs-sys/src/inflate.rs 95.65% 8 Missing :warning:
Files with missing lines Coverage Δ
test-libz-rs-sys/src/inflate.rs 97.27% <95.65%> (-0.18%) :arrow_down:
zlib-rs/src/inflate.rs 95.43% <96.00%> (+0.36%) :arrow_up:

... and 2 files with indirect coverage changes