Open sameer opened 3 months ago
This sounds like a bug I encountered a few months back. It was triggered under the following conditions: due to some API inconsistency, the return value was `EMPTY` instead of `OUTPUT_MORE`, so my driver code for compression inadvertently dropped the last byte of the output. It didn't cause an infinite loop, but rather an early termination of the encoder. The example implementation of the CLI utility is not affected, and it is what ultimately allowed me to locate my encoder bug.
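To illustrate the failure mode described above, here is a minimal sketch of an output-draining loop that keeps the bytes from the final poll even when the status already reads as empty. `ToyEncoder`, `PollResult`, and `drain_all` are hypothetical names made up for this example and are not the library's API; the real poll call and its return codes may differ.

```rust
/// Hypothetical poll status, loosely modeled on the EMPTY / OUTPUT_MORE
/// distinction mentioned above (not the library's actual enum).
#[derive(Debug, PartialEq, Clone, Copy)]
enum PollResult {
    Empty,
    More,
}

/// Toy stand-in for an encoder that has some output bytes queued up.
struct ToyEncoder {
    pending: Vec<u8>,
}

impl ToyEncoder {
    /// Copies as much pending output as fits into `out`.
    /// Returns the status and the number of bytes written.
    fn poll(&mut self, out: &mut [u8]) -> (PollResult, usize) {
        let n = self.pending.len().min(out.len());
        out[..n].copy_from_slice(&self.pending[..n]);
        self.pending.drain(..n);
        let status = if self.pending.is_empty() {
            PollResult::Empty
        } else {
            PollResult::More
        };
        (status, n)
    }
}

/// Drains the encoder in `chunk`-sized pieces.
fn drain_all(enc: &mut ToyEncoder, chunk: usize) -> Vec<u8> {
    assert!(chunk > 0, "a zero-sized buffer could never make progress");
    let mut result = Vec::new();
    let mut buf = vec![0u8; chunk];
    loop {
        let (status, n) = enc.poll(&mut buf);
        // Consume the bytes *before* looking at the status: the last call
        // may report Empty while still having written data into `buf`.
        result.extend_from_slice(&buf[..n]);
        if status == PollResult::Empty {
            break;
        }
    }
    result
}
```

A driver that checked the status first and broke out of the loop on `Empty` would silently lose the bytes written by that last call, which matches the "dropped the last byte" symptom.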
> It didn't cause an infinite loop, but rather an early termination of the encoder.
That's the behavior I noticed in a Rust port I'm writing (fix ref), but I wasn't totally convinced there was no loop in the C implementation here.
Thanks. I'll make sure this is fixed in the next release.
I also encountered this while fuzzing a Rust port. My `st_step_search` impl now looks like this:
    fn st_step_search(&mut self) -> HSEState {
        let window_length = self.input_buffer_size;
        let lookahead_sz = self.lookahead_size;
        let msi = self.match_scan_index;
        if msi > self.input_size.saturating_sub(lookahead_sz) {
            return HSEState::SaveBacklog;
        } else if unlikely(self.is_finishing()) && msi >= self.input_size {
            return HSEState::FlushBits;
        }
        let input_offset = self.get_input_offset();
        let end = input_offset + msi;
        let start = end - window_length;
        ...
Via some fuzzing, I found a finish scenario that can result in incorrect encoding. This PR addresses the bug by adding a check for `finishing && input size == 0` in the search step to transition to flush bits.

Explanation:
At finish time, if `input_size` is 0, there is no more data to encode and the remaining bits should be flushed. However, because the search step computes `input_size - 1`, the subtraction overflows and the search step transitions to yielding a tag bit instead. I'm not really sure what happens from this point on, but I suspect it may loop infinitely.
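The overflow described above can be reproduced in isolation with a small sketch. Everything here is an assumption made for illustration: the `Next` enum, the `u16` field width, and the exact shape of the comparison are modeled on the description in this PR, not copied from the encoder.

```rust
/// Hypothetical subset of encoder states relevant to the search step.
#[derive(Debug, PartialEq)]
enum Next {
    FlushBits,
    SaveBacklog,
    Search, // continue searching, which later leads to yielding a tag bit
}

/// Assumed shape of the unguarded check: the limit is computed with an
/// unsigned subtraction that wraps when `input_size` is 0.
fn next_unguarded(finishing: bool, input_size: u16, lookahead: u16, msi: u16) -> Next {
    let back = if finishing { 1 } else { lookahead };
    // wrapping_sub makes the wrap-around explicit: 0 - 1 becomes 65535.
    let limit = input_size.wrapping_sub(back);
    if msi > limit {
        if finishing { Next::FlushBits } else { Next::SaveBacklog }
    } else {
        Next::Search
    }
}

/// Same check, with the `finishing && input_size == 0` guard added first.
fn next_guarded(finishing: bool, input_size: u16, lookahead: u16, msi: u16) -> Next {
    if finishing && input_size == 0 {
        // Nothing left to encode: flush the remaining bits.
        return Next::FlushBits;
    }
    next_unguarded(finishing, input_size, lookahead, msi)
}
```

With `finishing = true`, `input_size = 0`, and `msi = 0`, the unguarded version computes a limit of 65535 and keeps searching, while the guarded version goes straight to flushing. Note that the guard only covers the finishing case; a non-finishing call with `input_size` smaller than the lookahead would still wrap in this sketch.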