image-rs / jpeg-decoder

JPEG decoder written in Rust
Apache License 2.0

Corrupted JPEG does not result in visible error code #169

Open dave-andersen opened 3 years ago

dave-andersen commented 3 years ago

This image is corrupt and will not decode in most image decoders (convert; opencv imread), but does not produce a decoding error in jpeg-decoder. Whether this is a problem is debatable, of course - for an image viewer it's good to be able to display the partial, corrupted image - but it's also nice for a library to be able to report that there's a glitch.

839442a6aa3dcca42f4962e8ec87b999

A simple program to validate the lack of error:

use anyhow::Result;

fn doit() -> Result<()> {
    let f = "839442a6aa3dcca42f4962e8ec87b999.jpg";
    let of = std::fs::File::open(f)?;
    let mut decoder = jpeg_decoder::Decoder::new(std::io::BufReader::new(of));
    // Parse the headers only, so the metadata is available before decoding.
    decoder.read_info()?;
    let md = decoder.info().unwrap();
    println!("Metadata: {:#?} h/w: {} {}", md, md.height, md.width);
    // A corrupt scan would be expected to surface as an error here.
    let pixels = decoder.decode()?;
    println!("Pixel count: {}", pixels.len());
    println!("Happy image by jpeg crate");
    Ok(())
}

fn main() {
    if let Err(e) = doit() {
        println!("Error reading jpeg: {}", e);
    }
}

Which, when run, produces:

/target/debug/jpegcheck
Metadata: ImageInfo {
    width: 2560,
    height: 1920,
    pixel_format: RGB24,
} h/w: 1920 2560
Pixel count: 14745600
Happy image by jpeg crate

HeroicKatora commented 3 years ago

$ djpeg -verbose /tmp/97358000-fc1b8d80-1870-11eb-959b-e4f8051c9502.jpg  > /dev/null 
libjpeg-turbo version 2.0.5 (build 20200830)
Copyright (C) 2009-2020 D. R. Commander
Copyright (C) 2011-2016 Siarhei Siamashka
Copyright (C) 2015-2016, 2018 Matthieu Darbois
Copyright (C) 2015 Intel Corporation
Copyright (C) 2015 Google, Inc.
Copyright (C) 2013-2014 MIPS Technologies, Inc.
Copyright (C) 2013 Linaro Limited
Copyright (C) 2009-2011 Nokia Corporation and/or its subsidiary(-ies)
Copyright (C) 2009 Pierre Ossman for Cendio AB
Copyright (C) 1999-2006 MIYASAKA Masaru
Copyright (C) 1991-2016 Thomas G. Lane, Guido Vollbeding

Emulating The Independent JPEG Group's software, version 8d  15-Jan-2012

Start of Image
JFIF APP0 marker: version 1.01, density 1x1  0
Define Quantization Table 0  precision 0
Define Quantization Table 1  precision 0
Define Quantization Table 2  precision 0
Start Of Frame 0xc0: width=2560, height=1920, components=3
    Component 1: 2hx2v q=0
    Component 2: 1hx1v q=1
    Component 3: 1hx1v q=2
Define Huffman Table 0x00
Define Huffman Table 0x10
Define Huffman Table 0x01
Define Huffman Table 0x11
Define Restart Interval 0
Start Of Scan: 3 components
    Component 1: dc=0 ac=0
    Component 2: dc=1 ac=1
    Component 3: dc=1 ac=1
  Ss=0, Se=63, Ah=0, Al=0
Corrupt JPEG data: premature end of data segment
Unexpected marker 0xd0
End Of Image

dave-andersen commented 3 years ago

Yup. Notably, the behavior for this image varies by library:

djpeg prints a warning but still tries to save a full-sized image (with lots of grey). The Python bindings for libjpeg-turbo throw an exception. Python PIL silently accepts it. Chrome renders it with lots of blank grey. The Go standard library jpeg decoder returns an error when trying to parse it ('invalid JPEG format: missing 0xff00 sequence'). rust-mozjpeg prints a warning to stderr that the programmer can't capture. :)

It'd be nice to be able to select between those behaviors, or to get access to a warning, so that the caller can choose whether to be strict (as one might for my use case, validating that a good image was supplied) or lenient (as one might for a browser or image viewer).
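
As a rough illustration of the "access to a warning" idea, here is a minimal sketch in which decoding hands back the pixels together with a list of non-fatal warnings, and the caller decides how strict to be. Every name here (`Warning`, `Decoded`, `require_clean`) is hypothetical and is not part of jpeg-decoder's API; the two warning variants simply mirror the messages djpeg printed above.

```rust
/// Hypothetical warning kinds, modelled on the djpeg messages in this thread.
/// Not part of jpeg-decoder's API.
#[derive(Debug, Clone, PartialEq)]
pub enum Warning {
    PrematureEndOfData,
    UnexpectedMarker(u8),
}

/// Hypothetical decode result: the pixels plus any non-fatal warnings.
#[derive(Debug)]
pub struct Decoded {
    pub pixels: Vec<u8>,
    pub warnings: Vec<Warning>,
}

/// Strict policy: treat any warning as a failure.
/// A lenient caller would skip this and use `decoded.pixels` directly.
pub fn require_clean(decoded: Decoded) -> Result<Vec<u8>, Vec<Warning>> {
    if decoded.warnings.is_empty() {
        Ok(decoded.pixels)
    } else {
        Err(decoded.warnings)
    }
}
```

A validator would call `require_clean` on every result, while an image viewer would ignore `warnings` and display whatever pixels came back.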

kaj commented 3 years ago

One way forward might be to set a flag in the decoder before attempting to decode the file. If the flag is set, you get an error and no image; if it's unset, you get an image with gray areas and no indication of an error.
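
A minimal sketch of how such a flag could behave, using a toy decoder rather than jpeg-decoder itself. `ToyDecoder`, `set_strict`, and `CorruptData` are made-up names for illustration, and "corruption" is simulated as the input being shorter than the size the header promised.

```rust
/// Hypothetical error for the strict path. Not part of jpeg-decoder's API.
#[derive(Debug, PartialEq)]
pub struct CorruptData;

/// Toy stand-in for a decoder, used only to illustrate the flag.
pub struct ToyDecoder {
    strict: bool,
}

impl ToyDecoder {
    /// Lenient by default, matching the behavior reported in this issue.
    pub fn new() -> Self {
        Self { strict: false }
    }

    /// Set the flag before decoding.
    pub fn set_strict(&mut self, strict: bool) {
        self.strict = strict;
    }

    /// `data` stands in for the decodable scan data and `expected_len`
    /// for the pixel count promised by the header.
    pub fn decode(&self, data: &[u8], expected_len: usize) -> Result<Vec<u8>, CorruptData> {
        if self.strict && data.len() < expected_len {
            // Flag set: an error and no image.
            return Err(CorruptData);
        }
        // Flag unset: keep what was decoded and fill the rest with gray.
        let mut pixels = data[..data.len().min(expected_len)].to_vec();
        pixels.resize(expected_len, 128);
        Ok(pixels)
    }
}
```

With the flag unset, `decode(&[1, 2], 4)` yields the two real samples followed by gray fill; with it set, the same call returns `Err(CorruptData)`.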

dave-andersen commented 3 years ago

That'd be a nice solution.

mainrs commented 3 years ago

> One way forward might be to set a flag in the decoder before attempting to decode the file. If the flag is set, you get an error and no image; if it's unset, you get an image with gray areas and no indication of an error.

What about just returning an image with gray areas and an error type that can be filtered, like Error::IncompleteImageData or something similar? That way people can do whatever they want without having to introduce a flag into the decoder.

Never mind, I forgot how enums work, smh. Just to throw it in: the error could in theory contain the partially decoded image.
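
Since a `Result` is either `Ok` or `Err` and cannot carry both an image and an error, the last suggestion could be sketched roughly as follows. Only the variant name `IncompleteImageData` comes from this thread; its fields, the toy `decode` function, and `decode_lenient` are assumptions made for illustration and are not jpeg-decoder's API.

```rust
/// Hypothetical error type. The variant name is the one floated above;
/// the payload is an assumption.
#[derive(Debug, PartialEq)]
pub enum Error {
    IncompleteImageData {
        /// Full-sized buffer: decoded samples followed by gray fill.
        partial: Vec<u8>,
        /// Number of samples that were actually decoded.
        decoded_len: usize,
    },
}

impl std::fmt::Display for Error {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Error::IncompleteImageData { partial, decoded_len } => write!(
                f,
                "incomplete image data: decoded {} of {} samples",
                decoded_len,
                partial.len()
            ),
        }
    }
}

impl std::error::Error for Error {}

/// Toy decoder: truncated input stands in for a corrupt scan.
pub fn decode(data: &[u8], expected_len: usize) -> Result<Vec<u8>, Error> {
    if data.len() >= expected_len {
        return Ok(data[..expected_len].to_vec());
    }
    let mut partial = data.to_vec();
    partial.resize(expected_len, 128); // gray fill for the missing part
    Err(Error::IncompleteImageData { partial, decoded_len: data.len() })
}

/// Lenient caller: recover the partial image from the error.
pub fn decode_lenient(data: &[u8], expected_len: usize) -> Vec<u8> {
    match decode(data, expected_len) {
        Ok(pixels) => pixels,
        Err(Error::IncompleteImageData { partial, .. }) => partial,
    }
}
```

A strict caller just propagates the error with `?`, while a viewer matches on the variant and displays `partial`, so no decoder flag is needed.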