Open axxel opened 1 year ago
I have tried the JPEG decoder from Go's stdlib on images produced by this camera but it errors out with "invalid JPEG format: uninitialized Huffman table". I'm unsure if that's a hint to the underlying problem, or just a symptom.
Digging around, I've found this comment:
So I've been making myself smart and it seems that mjpeg stream frames do not have Huffman Table information (DHT). This was taken out and standardized for bandwidth optimization. The standardized DHT segment (where the starting ffc4 is the DHT marker) needs to be inserted before the 0xFFDA marker in order to be a universally valid JPEG:
It's not clear to me if the Huffman table is only optional in MJPEG or if it's always missing (i.e., a distinguishing feature from JPEG images). If the former, it may be relevant to this issue (e.g., if this specific camera doesn't include the table); if the latter, then it's likely not relevant (because stb_image
can decode MJPEG images of other cameras).
Interesting. I did a little research and according to Wikipedia and the linked PDF (pages 95ff) there are MJPEG-A and MJPEG-B variants. Looking at the file with a hex-editor and comparing that with https://www.disktuna.com/list-of-jpeg-markers/ revealed that this is definitively a MJPEG-A file. And I can confirm that it does not contain a 0xFFC4
(huffman table) marker. The above PDF combined with some stackoverflow article made me believe that JPEG files need to contain a Huffman table, while for MJPEG-A files they are optional.
I also came to the conclusion that my jhead
binary does not know about MJPEG
and the warning message Header missing JFIF marker
is bogous, because the APP0
segment can also contain AVI1
data instead of JFIF
data.
With the comment you linked to above, it seems relatively straight forward to extend stb_image
such that it can decode this image. But who is going to do it? ;)
IIUC the intent of the workaround (that I've tried but didn't work) is to take <<HEADER..., SOS, DATA...>>
and recreate the image as <<HEADER..., DHT, TABLE..., SOS, DATA...>>
. That's easy enough (in Go, a bit painful in C but doable).
However, already assuming this will be performed only on images without the table, what are the chances of it working (or not) without parsing the headers? Is there any guarantee the SOS byte sequence doesn't occur in the headers?
Here is the minimally invasive patch that I could come up with which adds support for reading MJPEG frames (lacking any DHT marker). Minimally invasive means smallest patch size and no ABI change. If that is not a requirement, I would probably add a mjpeg
flag next to the jfif
one and parse the APP0
marker accordingly and maybe even add a has_dht
flag and then check for j->mjpeg && !j->has_dht
instead of j->huff_dc[0].maxcode[17] == 0 && j->huff_ac[0].maxcode[17] == 0
(see below).
--- stb_image_2.28.h 2023-10-13 10:38:32.784216407 +0200
+++ stb_image.h.new 2023-10-13 10:40:24.804556441 +0200
@@ -3406,6 +3406,35 @@
return STBI__MARKER_none;
}
+static const stbi_uc stbi__mjpeg_dht_seg[0x1A4] = {
+ /* JPEG DHT Segment for YCrCb omitted from MJPG data */
+ 0xFF,0xC4,0x01,0xA2,
+ 0x00,0x00,0x01,0x05,0x01,0x01,0x01,0x01,0x01,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
+ 0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0A,0x0B,0x01,0x00,0x03,0x01,0x01,0x01,0x01,
+ 0x01,0x01,0x01,0x01,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,
+ 0x08,0x09,0x0A,0x0B,0x10,0x00,0x02,0x01,0x03,0x03,0x02,0x04,0x03,0x05,0x05,0x04,0x04,0x00,
+ 0x00,0x01,0x7D,0x01,0x02,0x03,0x00,0x04,0x11,0x05,0x12,0x21,0x31,0x41,0x06,0x13,0x51,0x61,
+ 0x07,0x22,0x71,0x14,0x32,0x81,0x91,0xA1,0x08,0x23,0x42,0xB1,0xC1,0x15,0x52,0xD1,0xF0,0x24,
+ 0x33,0x62,0x72,0x82,0x09,0x0A,0x16,0x17,0x18,0x19,0x1A,0x25,0x26,0x27,0x28,0x29,0x2A,0x34,
+ 0x35,0x36,0x37,0x38,0x39,0x3A,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x53,0x54,0x55,0x56,
+ 0x57,0x58,0x59,0x5A,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x73,0x74,0x75,0x76,0x77,0x78,
+ 0x79,0x7A,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8A,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,
+ 0x9A,0xA2,0xA3,0xA4,0xA5,0xA6,0xA7,0xA8,0xA9,0xAA,0xB2,0xB3,0xB4,0xB5,0xB6,0xB7,0xB8,0xB9,
+ 0xBA,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,0xC8,0xC9,0xCA,0xD2,0xD3,0xD4,0xD5,0xD6,0xD7,0xD8,0xD9,
+ 0xDA,0xE1,0xE2,0xE3,0xE4,0xE5,0xE6,0xE7,0xE8,0xE9,0xEA,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,
+ 0xF8,0xF9,0xFA,0x11,0x00,0x02,0x01,0x02,0x04,0x04,0x03,0x04,0x07,0x05,0x04,0x04,0x00,0x01,
+ 0x02,0x77,0x00,0x01,0x02,0x03,0x11,0x04,0x05,0x21,0x31,0x06,0x12,0x41,0x51,0x07,0x61,0x71,
+ 0x13,0x22,0x32,0x81,0x08,0x14,0x42,0x91,0xA1,0xB1,0xC1,0x09,0x23,0x33,0x52,0xF0,0x15,0x62,
+ 0x72,0xD1,0x0A,0x16,0x24,0x34,0xE1,0x25,0xF1,0x17,0x18,0x19,0x1A,0x26,0x27,0x28,0x29,0x2A,
+ 0x35,0x36,0x37,0x38,0x39,0x3A,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x53,0x54,0x55,0x56,
+ 0x57,0x58,0x59,0x5A,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x73,0x74,0x75,0x76,0x77,0x78,
+ 0x79,0x7A,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8A,0x92,0x93,0x94,0x95,0x96,0x97,0x98,
+ 0x99,0x9A,0xA2,0xA3,0xA4,0xA5,0xA6,0xA7,0xA8,0xA9,0xAA,0xB2,0xB3,0xB4,0xB5,0xB6,0xB7,0xB8,
+ 0xB9,0xBA,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,0xC8,0xC9,0xCA,0xD2,0xD3,0xD4,0xD5,0xD6,0xD7,0xD8,
+ 0xD9,0xDA,0xE2,0xE3,0xE4,0xE5,0xE6,0xE7,0xE8,0xE9,0xEA,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,0xF8,
+ 0xF9,0xFA
+};
+
// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
{
@@ -3419,6 +3448,15 @@
m = stbi__get_marker(j);
while (!stbi__EOI(m)) {
if (stbi__SOS(m)) {
+ // check if we have an initialized huffman table (processed a DHT marker ealier)
+ if (j->huff_dc[0].maxcode[17] == 0 && j->huff_ac[0].maxcode[17] == 0) {
+ // if not, then assume we are reading a MJPEG frame and parse the DHT from the spec
+ stbi__context s, *ori_s = j->s;
+ stbi__start_mem(&s, stbi__mjpeg_dht_seg + 2, sizeof(stbi__mjpeg_dht_seg) - 2);
+ j->s = &s;
+ stbi__process_marker(j, 0xC4);
+ j->s = ori_s;
+ }
if (!stbi__process_scan_header(j)) return 0;
if (!stbi__parse_entropy_coded_data(j)) return 0;
if (j->marker == STBI__MARKER_none ) {
I'd be happy to prepare a PR but looking at the number of open PRs and issues I'll wait for a hint from the maintainer that this is actually of interest.
@siiky Was the above patch of help for your issue?
Hey @axxel, sorry I forgot to reply. Thanks for the patch, it seems to work well, from my tests!
EDIT: The following initial description of the issue is outdated. There is nothing wrong with the JFIF header (there is none at all) nor are there any unknown markers contained. The problematic image is a standard MJPEG image and reads just fine with the provided patch below.
The attached image is 'semi corrupt', meaning something is off with the JFIF markers according to
jhead
:That said, the image reads just fine e.g. with
djpeg
,eog
, the Qt libs and presumably your browser:Version 2.27 failed to read that image with the error message "unknown marker". Version 2.28 reads the image without complaints but contains garbage (maybe uninitialized) data. The relevant change causing this 'regression' seems to be this (see my comment there).
Expected behavior Ideally, this image could be read properly. If this is too much to ask, the second best option would be to go back to the behavior of 2.27 where it obviously instead of silently fails to read the image.