Frommi / miniz_oxide

Rust replacement for miniz
MIT License
176 stars 49 forks source link

miniz_oxide::inflate::stream::inflate consumes more bytes than it actually used #158

Open link2xt opened 5 days ago

link2xt commented 5 days ago

Here is a failing test:

#[test]
fn partial_decompression_imap() {
    use miniz_oxide::inflate::stream::{inflate, InflateState};
    use miniz_oxide::{DataFormat, MZFlush};

    // Decompresses to
    // "* QUOTAROOT INBOX \"User quota\"\r\n* QUOTA \"User quota\" (STORAGE 76 307200)\r\nA0001 OK Getquotaroot completed (0.001 + 0.000 secs).\r\n"
    let input = vec![
        210, 82, 8, 12, 245, 15, 113, 12, 242, 247, 15, 81, 240, 244, 115, 242, 143, 80, 80, 10,
        45, 78, 45, 82, 40, 44, 205, 47, 73, 84, 226, 229, 210, 130, 200, 163, 136, 42, 104, 4,
        135, 248, 7, 57, 186, 187, 42, 152, 155, 41, 24, 27, 152, 27, 25, 24, 104, 242, 114, 57,
        26, 24, 24, 24, 42, 248, 123, 43, 184, 167, 150, 128, 213, 21, 229, 231, 151, 40, 36, 231,
        231, 22, 228, 164, 150, 164, 166, 40, 104, 24, 232, 129, 20, 104, 43, 128, 104, 3, 133,
        226, 212, 228, 98, 77, 61, 94, 46, 0, 0, 0, 0, 255, 255,
    ];

    let mut inflate_stream = InflateState::new(DataFormat::Raw);
    let mut output = vec![0; 8];
    let result = inflate(&mut inflate_stream, &input, &mut output, MZFlush::None);

    assert!(result.status.is_ok());
    // Should not consume everything, there is not enough space in the buffer for the output.
    assert_ne!(result.bytes_consumed, input.len())
}

Output buffer is only 8 bytes in size, but miniz_oxide consumes the whole input buffer anyway. This causes the issue in flate2 that does not happen when using another deflate implementations such as zlib, so I think it is a miniz_oxide bug: https://github.com/rust-lang/flate2-rs/issues/434

link2xt commented 5 days ago

This underlying decompress function is probably fine, it takes out and should make sure not to overflow it: https://github.com/Frommi/miniz_oxide/blob/0a33effd414711b379e01b0613ba5ae85a0e14d0/miniz_oxide/src/inflate/core.rs#L1172-L1178

But inflate_loop here passes state.dict into decompress as the output buffer: https://github.com/Frommi/miniz_oxide/blob/0a33effd414711b379e01b0613ba5ae85a0e14d0/miniz_oxide/src/inflate/stream.rs#L308

And this output buffer size is 32768 (TINFL_FZ_DICT_SIZE).

link2xt commented 4 days ago

Maybe there is no bug, I made decompression work with flate2 using miniz_oxide backend: https://github.com/rust-lang/flate2-rs/pull/436 Seems the input just got consumed into InflateState and on the next call it gets out. This does not happen with zlib, but still usable.

link2xt commented 3 days ago

@oyvindln I made async-compression (which in turn uses flate2) to work: https://github.com/Nullus157/async-compression/pull/294

But this requires trying to inflate even when we have no input anymore to check if we can extract something from the internal state of the decoder. If it is possible to change miniz_oxide to behave like zlib and not consume more input than actually used that would be great as it will make miniz_oxide more difficult to misuse. Otherwise I think this "footgun" should at least be documented in miniz_oxide and flate2 documentation, current flate2 documentation is misleading and this led to bug in async-compression.

phord commented 3 days ago

I believe it is possible to have all of your input "consumed" yet still have remaining bytes to retrieve via zlib, too. It may be only a byte or two. But if you're not checking the state variables for "done". you may still be leaving pieces behind.

oyvindln commented 10 hours ago

Hm, would have to investigate that as well then