Bodmer / JPEGDecoder

A JPEG decoder library

Detect invalid JPEGs #39

Closed falkoschindler closed 5 years ago

falkoschindler commented 5 years ago

I'm trying to use JPEGDecoder to detect corrupted JPEG data. How could I do that?

My scenario: After capturing an image with an OV2640 camera and transmitting it via HTTP, I often get errors when decoding it with OpenCV. I already verified that it is caused by a hardware issue on the camera board. Is there a simple way to determine if all JPEG blocks are valid?

What I mean by invalid JPEG: Corrupted image data, e.g. caused by missing bytes or flipped bits, yields a decoded image with shifted blocks and weirdly colored areas. In the bottom right there are usually several blocks filled with a constant color. OpenCV fills them with plain gray; JPEGDecoder fills them with the last correctly decoded color.

I already tried to check whether the last block is filled with a constant color. But in certain situations this can be the case for correctly decoded images as well. I also looked into the source code to find the place where those single-colored pixels are set, but couldn't find it. The status returned by JPEGDecoder is not informative in this case.

Thanks for any hint in the right direction!

falkoschindler commented 5 years ago

The issue can be reproduced by manipulating an image, e.g. setting 100 bytes in the middle of the JPEG data buffer to zero, and then running JPEGDecoder. The bottom half of the image is shifted to the left and the last blocks in the bottom right corner are filled with a random color. How can that be detected reliably?
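
A minimal sketch of that manipulation, for anyone who wants to build a test case. The helper name `zeroMiddle` is hypothetical and not part of JPEGDecoder; it only assumes the JPEG data is held in a byte vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of JPEGDecoder): returns a copy of `jpeg`
// in which `count` bytes centred on the middle of the buffer are set to zero.
std::vector<uint8_t> zeroMiddle(const std::vector<uint8_t>& jpeg, std::size_t count) {
    std::vector<uint8_t> out = jpeg;
    count = std::min(count, out.size());               // never run past the buffer
    const std::size_t start = (out.size() - count) / 2;
    std::fill(out.begin() + start, out.begin() + start + count, 0);
    return out;
}
```

Calling `zeroMiddle(data, 100)` corresponds to the 100-byte case described above; the original buffer is left untouched so both versions can be decoded and compared.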

Bodmer commented 5 years ago

I assume you have a JPEG image source that is generating corrupted files for some reason. This seems to be a rather odd situation and a tricky one to handle, as compressed image data looks like random numbers and thus has no pattern to detect errors with. Perhaps you could check the last decoded MCU block and see if all pixels are the same colour.
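
A rough sketch of such a check, assuming the pixels of the last decoded MCU are available as a flat array of 16-bit colour values. The function name `isUniformBlock` is hypothetical. As noted earlier in the thread, a valid image can also end in a single-coloured block, so this is only a heuristic.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if all `count` pixels hold the same value.
// Assumes 16-bit pixels stored contiguously; an empty block counts as uniform.
bool isUniformBlock(const uint16_t* pixels, std::size_t count) {
    for (std::size_t i = 1; i < count; ++i) {
        if (pixels[i] != pixels[0]) {
            return false;   // found a pixel that differs from the first one
        }
    }
    return true;
}
```

The pixel count would be the MCU width times the MCU height of the block being inspected.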

The ideal solution is to change the camera to one that does not produce bad images!

falkoschindler commented 5 years ago

Yes, you are right: The camera is producing corrupt JPEGs. Unfortunately, we rely on a specific camera model, which seems to be produced in poor quality, such that only a certain number of cameras work reliably. So we thought about using JPEGDecoder to detect "bad" cameras in some kind of self-test directly on the device.

Since the decoder seems to "decide" to fill the last blocks with the same color and other decoders "decide" to fill them with 50% gray, I thought it must be able to detect such a situation explicitly. When bytes are missing, the EOI (end of image) marker must come earlier than expected. Of course, this is not clear from the raw, random-looking JPEG data, but after decoding, all bytes should be assigned to an MCU and missing bytes should be obvious.

I'm still digging into the code and might find a way to add this feature. I just thought you might have an idea off the top of your head. Thanks anyway for your quick reply!

falkoschindler commented 5 years ago

Just for your information: I think I solved the issue. :)

Comparing picojpeg with picojdec, I noticed a check in getOctet(). Picojdec raises an exception if the byte after 0xFF is non-zero. In getOctet() this case is basically ignored. And since getChar() seems to return 0xD9 if no data is left, we get multiple 0xD9 (= end-of-image marker) bytes when reading a corrupt JPEG. So I simply count these occurrences in a global variable and return an error if the counter is larger than one. This is probably not the cleanest and most stable solution, but getting the information out of picojpeg into my main program without too many changes to the library was tedious enough.
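
The counting idea can be illustrated with a small stand-alone model. This is not picojpeg's code: `CountingReader`, `padReads` and `overran` are hypothetical names, and the 0xD9 padding value is taken from the behaviour described above rather than verified against the library. Unlike the global counter described above, this sketch counts reads past the end of the buffer instead of marker occurrences.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical byte source modelling the behaviour described above.
// Once the buffer is exhausted it keeps returning 0xD9 and counts how
// often that happened, so the caller can tell that the decoder ran dry.
struct CountingReader {
    const uint8_t* data;
    std::size_t size;
    std::size_t pos;
    unsigned padReads;   // number of reads served after the data ran out

    CountingReader(const uint8_t* d, std::size_t n)
        : data(d), size(n), pos(0), padReads(0) {}

    uint8_t next() {
        if (pos < size) {
            return data[pos++];
        }
        ++padReads;
        return 0xD9;     // assumed padding value (second byte of the EOI marker)
    }

    // The default tolerance of one mirrors the "larger than one" rule above;
    // the right threshold for a real decoder is an assumption to be tuned.
    bool overran(unsigned tolerance = 1) const {
        return padReads > tolerance;
    }
};
```

Keeping the counter inside the reader rather than in a global variable makes the result easier to pass back to the calling program, though wiring something like this into the library would still require changes to its input routine.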