image-rs / jpeg-decoder

JPEG decoder written in Rust
Apache License 2.0
148 stars 88 forks source link

Failed to load image: failed to fill whole buffer #251

Open micahsnyder opened 2 years ago

micahsnyder commented 2 years ago

I have a JPG that fails to load with image-rs with the error message: Failed to load image: failed to fill whole buffer 1901125955-failed-to-fill-whole-buffer

quilan1 commented 2 years ago

I have to admit, I wasn't expecting PDF text inside the no-man's-land of the end of a JPEG file, but there it is.

Beginning roughly @ file offset 0x88A4 (or 5? or 3?), the file has a bunch of PDF text information, probably from a grocery store flier of some sort. I think (?) this occurs mid SOS segment, but since it's mostly text, the decoder reads in a bunch of garbage without too much problem. When it finishes the segment, we're brought back to the main decode_internal loop and it calls self.read_marker and looks for the next marker. In this scenario, there is no final EOI marker (or any other marker in general), it raises an error as it reaches the end of the stream without finding an 0xFF.

The 'fix' here is easy, if one wanted to clone the repo with this functionality, but because it swallows the error and simply bails, it's quite bad practice.

// Inside decoder.rs, decode_internal(..):

let marker = match pending_marker.take() {
    Some(m) => m,
    None => match self.read_marker() {
        Err(_) => break,  // Swallow the error and exit the loop, because we've finished the stream
        Ok(v) => v,
    },
};

Until, however, some API is designed for optionally allowing a more forgiving approach to decoding images, I think this is where it'll stay, but I'd love some input from the maintainers though.

HeroicKatora commented 2 years ago

Maintainer here: First of all, thanks a lot for the detailed analysis over multiple issues!

Secondly, regarding the error interface it seems more of a matter of implementing it. If it's a non-breaking change then it'll be a no-brainer easy to accept when it's reasonably small, and any other can be done as 0.3—that's a considerable addition. With regards to design, my personal opinion is that error recovery should be opt-in: their exact semantics are not standardized anywhere and it shouldn't become a burden of discussion if they are every adjusted. That also means detection of recovery should generally be as specific as possible.

We can always add a Default impl for recovery options and expand it later if we feel like (or have enough data to show for it) that some behavior is already stable enough and unlikely to change.