sile / libflate

A Rust implementation of DEFLATE algorithm and related formats (ZLIB, GZIP)
https://docs.rs/libflate
MIT License

panic at ..(self.table[i], MAX_BITWIDTH as u16 + 1) on valid file #6

Closed. FauxFaux closed this issue 7 years ago

FauxFaux commented 7 years ago

The releases of the "joe" / "jupp" text editor, from mirbsd, cause the decompressor to panic (in debug mode) or return an error (in release mode).

This has happened on multiple releases, so I'm guessing it's a BSD gzip implementation quirk.

Neither GNU gzip nor libarchive (bsdtar) gives the slightest hint that something might be wrong.

Download links: http://deb.debian.org/debian/pool/main/j/jupp/jupp_3.1.30.orig.tar.gz https://www.mirbsd.org/MirOS/dist/jupp/joe-3.1jupp30.tgz

// cat src/main.rs #  example from the README.md
extern crate libflate;

use std::io;
use libflate::gzip::Decoder;

fn main() {
    let mut input = io::stdin();
    let mut decoder = Decoder::new(&mut input).unwrap();
    io::copy(&mut decoder, &mut io::stdout()).unwrap();
}
% cargo run --release <joe-3.1jupp30.tgz
    Finished release [optimized] target(s) in 0.0 secs
     Running `target/release/flate-test`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Custom(Custom { 
kind: InvalidData, error: StringError("Invalid huffman coded stream")
 }) }', /checkout/src/libcore/result.rs:860
note: Run with `RUST_BACKTRACE=1` for a backtrace.
% RUST_BACKTRACE=1 cargo run <joe-3.1jupp30.tgz
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/flate-test`
thread 'main' panicked at 'assertion failed: `(left == right)` (left: `1`, right: `16`)', 
/home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/huffman.rs:98
stack backtrace:
...
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:495
   7: <libflate::huffman::DecoderBuilder as libflate::huffman::Builder>::set_mapping
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/huffman.rs:98
   8: libflate::huffman::Builder::restore_canonical_huffman_codes
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/huffman.rs:54
   9: libflate::huffman::DecoderBuilder::from_bitwidthes
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/huffman.rs:80
  10: <libflate::deflate::symbol::DynamicHuffmanCodec as libflate::deflate::symbol::HuffmanCodec>::load
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/deflate/symbol.rs:330
  11: <libflate::deflate::decode::Decoder<R>>::read_compressed_block
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/deflate/decode.rs:92
  12: <libflate::deflate::decode::Decoder<R> as std::io::Read>::read
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/deflate/decode.rs:165
  13: <libflate::gzip::Decoder<R> as std::io::Read>::read
             at /home/faux/.cargo/registry/src/github.com-1ecc6299db9ec823/libflate-0.1.7/src/gzip.rs:859
  14: std::io::util::copy
             at /checkout/src/libstd/io/util.rs:53
  15: flate_test::main
             at src/main.rs:9
...
FauxFaux commented 7 years ago

The files are signed with gzsig, a deprecated tool which makes use of the "extra field" in the gzip header.

I made a simple example file, which fails in a more interesting way: read_exact goes wrong somewhere. https://b.goeswhere.com/world.gz https://b.goeswhere.com/world.signed.gz

I made these with printf hello | gzip -n > world.gz, then signed the result with my ~/.ssh/id_rsa, using a build of gzsig from an old commit in https://github.com/chneukirchen/outils . I have no idea whether the signing did anything useful, but the signed gzip file is readable by gzip/libarchive, and not by libflate.

The problem appears to be that there's a rogue {0x05, 0x01} between the header and the extra field:

% xxd world.signed.gz                   
00000000: 1f8b 0804 0000 0000 0003 0501 4753 0101  ............GS..

I do not know what that is.

FauxFaux commented 7 years ago

Aha! The extra field code is wrong.

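The extra field is laid out as [total length, 2 bytes LE] [subfield id, 2 bytes] [subfield data length, 2 bytes LE]. The code only tries to read the last two, so it never consumes the leading total length; that is the "rogue" {0x05, 0x01} in the dump above.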
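
For anyone who wants to check that layout against the world.signed.gz dump, here is a minimal sketch. It assumes the standard gzip framing (a 10-byte fixed header, then a 2-byte little-endian total length, then subfields made of a 2-byte id and a 2-byte little-endian data length). The names SubfieldHeader, read_u16_le and read_extra_prefix are made up for illustration; this is not libflate's code or its fix.

```rust
use std::io::{self, Read};

/// Header of one subfield inside the gzip extra field.
/// Hypothetical type for illustration only, not part of libflate.
#[derive(Debug, PartialEq)]
struct SubfieldHeader {
    id: [u8; 2],   // two-byte subfield id
    data_len: u16, // length of the subfield data that follows
}

/// Hypothetical helper: read one little-endian u16.
fn read_u16_le<R: Read>(reader: &mut R) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.read_exact(&mut buf)?;
    Ok(u16::from_le_bytes(buf))
}

/// Read the total extra-field length, then the header of the first subfield.
/// Assumes the reader is positioned right after the 10-byte fixed gzip header.
fn read_extra_prefix<R: Read>(reader: &mut R) -> io::Result<(u16, SubfieldHeader)> {
    let total_len = read_u16_le(reader)?; // the bytes that looked "rogue"
    let mut id = [0u8; 2];
    reader.read_exact(&mut id)?;
    let data_len = read_u16_le(reader)?;
    Ok((total_len, SubfieldHeader { id, data_len }))
}

fn main() -> io::Result<()> {
    // First 16 bytes of world.signed.gz, copied from the xxd output.
    let bytes: [u8; 16] = [
        0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x03, 0x05, 0x01, 0x47, 0x53, 0x01, 0x01,
    ];
    // Skip magic, method, flags, mtime, extra flags and OS (10 bytes).
    let mut rest = &bytes[10..];
    let (total_len, sub) = read_extra_prefix(&mut rest)?;
    println!(
        "total = {}, id = {}, data len = {}",
        total_len,
        String::from_utf8_lossy(&sub.id),
        sub.data_len
    );
    Ok(())
}
```

On those 16 bytes this reports a total length of 261, the id "GS", and a data length of 257. That is consistent: 261 = 4 bytes of subfield header + 257 bytes of data, so {0x05, 0x01} is simply the total length.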