gendx / lzma-rs

An LZMA decoder written in pure Rust
MIT License

"Unexpected data after last XZ block" #112

Open Ziris85 opened 3 months ago

Ziris85 commented 3 months ago

Hello there! I've been running into this error quite a bit with some archives that my program interacts with, and I'm not quite sure what to do about it. Here is an example file that seems to trigger it; with the code snippet below, I get the "Unexpected data" message printed:

    use std::fs::File;
    use std::io::BufReader;

    let mut f = BufReader::new(File::open("Packages.xz").unwrap());
    let mut decomp: Vec<u8> = Vec::new();

    // Treat the "unexpected data" error as success; any other error is a failure.
    match lzma_rs::xz_decompress(&mut f, &mut decomp) {
        Ok(_) => {
            println!("Everything was fine!");
            true
        },
        Err(e) => if e.to_string().contains("Unexpected data after last XZ block") {
            println!("Unexpected data after last XZ block!");
            true
        } else {
            println!("Some other error! {}", e);
            false
        }
    };

If I just ignore the message (as I have been doing thus far), the decompression otherwise seems fine: the results are fully readable, and life goes on. But I feel a little dirty about throwing this message away, especially if it could actually indicate an issue with the archive. Other utilities, however, don't seem to have issues with it. For example:

user@host: ~ $ xz --test Packages.xz
user@host: ~ $ echo $?
0

Digging a little bit more into this, I find that the headers do NOT include the sizes:

user@host: ~ $ xz --list --verbose --verbose Packages.xz
Packages.xz (1/1)
  Streams:           2
  Blocks:            1
  Compressed size:   2,328.1 KiB (2,384,012 B)
  Uncompressed size: 13.6 MiB (14,263,215 B)
  Ratio:             0.167
  Check:             CRC64
  Stream Padding:    0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0       2,383,980      14,263,215  0.167  CRC64            0
         2         0       2,383,980      14,263,215              32               0    ---  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check      CheckVal          Header  Flags        CompSize    MemUsage  Filters
         1         1              12               0       2,383,940      14,263,215  0.167  CRC64      6a89d5904624a036      12  --          2,383,919       9 MiB  --lzma2=dict=8MiB
  Memory needed:     9 MiB
  Sizes in headers:  No
  Minimum XZ Utils version: 5.0.0

So, with that in mind I tried a different route and provided the uncompressed size to the process directly:

    use lzma_rs::decompress::{Options, UnpackedSize};

    // Supply the uncompressed size reported by `xz --list` explicitly.
    let myopts = Options {
        unpacked_size: UnpackedSize::UseProvided(Some(14263215)),
        memlimit: None,
        allow_incomplete: false,
    };
    match lzma_rs::lzma_decompress_with_options(&mut f, &mut decomp, &myopts) {
        Ok(_) => {
            println!("Everything was fine!");
            true
        },
        Err(e) => if e.to_string().contains("Unexpected data after last XZ block") {
            println!("Unexpected data after last XZ block!");
            true
        } else {
            println!("Some other error! {}", e);
            false
        }
    };

THIS results in the "Some other error" arm:

Some other error! lzma error: LZMA header invalid properties: 253 must be < 225

I feel like I'm on the right track here, but I'm just not sure what I'm missing to make this all "happy", so any suggestions/pointers/tips would be greatly appreciated! Thank you!
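
One way to read the `253 must be < 225` error above: `lzma_decompress_with_options` decodes the legacy `.lzma` format, whose first header byte is a properties byte packing `lc`, `lp` and `pb` as `(pb * 5 + lp) * 9 + lc`, so it must be below 225. An `.xz` file instead begins with the magic bytes `FD 37 7A 58 5A 00`, and `0xFD` is 253. The sketch below illustrates this with the standard library only; `Container`, `sniff_container` and `decode_lzma_props` are hypothetical names for illustration, not part of lzma-rs.

```rust
/// Container formats this sketch can tell apart (hypothetical names).
#[derive(Debug, PartialEq)]
enum Container {
    Xz,
    LegacyLzma { lc: u8, lp: u8, pb: u8 },
    Unknown,
}

/// Magic bytes at the start of every .xz stream.
const XZ_MAGIC: [u8; 6] = [0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00];

/// Unpack a legacy .lzma properties byte into (lc, lp, pb).
/// Returns None for values >= 225, which cannot encode valid properties.
fn decode_lzma_props(props: u8) -> Option<(u8, u8, u8)> {
    if props >= 225 {
        return None;
    }
    let lc = props % 9;
    let rest = props / 9;
    Some((lc, rest % 5, rest / 5))
}

/// Guess the container format from the first bytes of a file.
/// The legacy .lzma guess rests on the properties byte alone, so it is
/// only a weak heuristic.
fn sniff_container(head: &[u8]) -> Container {
    if head.starts_with(&XZ_MAGIC) {
        return Container::Xz;
    }
    match head.first().and_then(|&b| decode_lzma_props(b)) {
        Some((lc, lp, pb)) => Container::LegacyLzma { lc, lp, pb },
        None => Container::Unknown,
    }
}
```

Applied to the first bytes of an `.xz` file, `sniff_container` returns `Container::Xz`, while `decode_lzma_props(253)` returns `None`. That would suggest the second attempt failed because of the container format rather than because of the size that was supplied.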

Ziris85 commented 3 months ago

I just tried with the lzma-rust crate, and that seems to work just fine with the archive. Maybe I'm running into the "subset of the .xz file format" limitation that is in the readme on this crate? Happy to be wrong on that, of course.
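
The `xz --list` output above reports two streams, the second one 32 bytes long with no blocks. Thirty-two bytes is exactly the size of an empty XZ stream (12-byte header, 8-byte index, 12-byte footer). The following is a minimal sketch, assuming the "Unexpected data" error comes from this trailing stream rather than from a damaged archive, and assuming no stream padding (the listing shows 0 B). The hypothetical helper `trailing_empty_stream` checks whether a buffer ends with such a stream and returns the length of the data before it. It inspects only the magic bytes and the index record count, not the CRC32 fields, and uses the standard library alone.

```rust
/// Magic bytes that open an XZ stream header.
const HEADER_MAGIC: [u8; 6] = [0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00];
/// Magic bytes that close an XZ stream footer ("YZ").
const FOOTER_MAGIC: [u8; 2] = [0x59, 0x5A];
/// Stream header (12) + empty index (8) + stream footer (12).
const EMPTY_STREAM_LEN: usize = 32;

/// If `data` ends with what looks like an empty XZ stream, return the
/// length of the data that precedes it. Hypothetical helper: checksums
/// are not verified and stream padding is not handled.
fn trailing_empty_stream(data: &[u8]) -> Option<usize> {
    // Buffers shorter than 32 bytes cannot end with a complete stream.
    let split = data.len().checked_sub(EMPTY_STREAM_LEN)?;
    let tail = &data[split..];
    // The index sits between the 12-byte header and the 12-byte footer.
    let index = &tail[12..20];
    let looks_empty = tail.starts_with(&HEADER_MAGIC)
        && tail.ends_with(&FOOTER_MAGIC)
        && index[0] == 0x00 // index indicator byte
        && index[1] == 0x00; // number of records: zero blocks
    if looks_empty {
        Some(split)
    } else {
        None
    }
}
```

If this returns `Some(n)` for the file contents, decoding only the first `n` bytes would be one way to test the assumption. For the archive listed above, `n` would be expected to equal 2,383,980, the compressed size of the first stream.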