victorvde / jpeg2png

silky smooth JPEG decoding
GNU General Public License v3.0
468 stars 26 forks source link

invalid coefficient sizes #7

Open sandsmark opened 4 years ago

sandsmark commented 4 years ago

not sure if you're still working on this, but the coefficient size check fails on some files (obviously, I guess, otherwise you wouldn't have put it there). The problem is that those are valid files.

I assume this is because of some edge case with jpeg padding, but I don't know enough about JPEG to say how this should be solved properly in the code.

Example file (click the "hidden pixels" to the left to see the "uncropped"): http://fotoforensics.com/analysis.php?id=44446d027a2254b524b49b94697e186122ba9cda.27709

And FWIW, here's where I learned about jpeg padding (click on "JPEG Padding" in the header); https://fotoforensics.com/tutorial-hidden-pixels.php

sandsmark commented 4 years ago

testet a couple of images on fotoforensics, and with an image with hidden padding of 7x4 the width coefficient is reported as bad by jpeg2png, and 4x7 has invalid height coefficient. one that works has e. g. 3x3 hidden pixels.

just doing some algebra, this patch seems to at least not break completely;

diff --git jpeg.c jpeg.c
  index bdb4246..4f7eb60 100644
  --- jpeg.c
  +++ jpeg.c
  @@ -57,10 +57,10 @@ void read_jpeg(FILE *in, struct jpeg *jpeg) {
                   coef->w_samp = d.max_h_samp_factor / i->h_samp_factor;
                   coef->h_samp = d.max_v_samp_factor / i->v_samp_factor;
                   if(coef->h / 8 != (jpeg->h / coef->h_samp + 7) / 8) {
  -                        die("jpeg invalid coef h size");
  +                        jpeg->h = coef->h_samp * (coef->h - 7);
                   }
                   if(coef->w / 8 != (jpeg->w / coef->w_samp + 7) / 8) {
  -                        die("jpeg invalid coef w size");
  +                        jpeg->w = coef->w_samp * (coef->w - 7);
                   }
                   if(SIZE_MAX / coef->h / coef->w / coef->h_samp / coef->w_samp < 6) {
                           die("jpeg is too big to fit in memory");
zvezdochiot commented 4 years ago

@sandsmark say:

The problem is that those are valid files.

593 columns are not provided with dct coefficients. Attempting to process it will result in a segment fault. There is a workaround:

$ jpegtran -crop 592x332+0+0 -outfile 44446d027a2254b524b49b94697e186122ba9cda.27709.jt.jpg 44446d027a2254b524b49b94697e186122ba9cda.27709.jpg 

$ jpeg2png -o 44446d027a2254b524b49b94697e186122ba9cda.27709.jpg.png 44446d027a2254b524b49b94697e186122ba9cda.27709.jt.jpg
zvezdochiot commented 4 years ago

@sandsmark say:

this patch

$ identify *.jpg *.png 
44446d027a2254b524b49b94697e186122ba9cda.27709.jpg JPEG 593x332 593x332+0+0 8-bit DirectClass 27.7KB 0.000u 0:00.000
44446d027a2254b524b49b94697e186122ba9cda.27709.jpg.png[1] PNG 594x332 594x332+0+0 8-bit DirectClass 195KB 0.000u 0:00.000

You can remove the w and h checks from the code. But this is fraught with consequences.

diff --git jpeg.c jpeg.c
  --- jpeg.c
  +++ jpeg.c
  @@ -57,10 +57,10 @@ void read_jpeg(FILE *in, struct jpeg *jpeg) {
                   coef->w_samp = d.max_h_samp_factor / i->h_samp_factor;
                   coef->h_samp = d.max_v_samp_factor / i->v_samp_factor;
  -               if(coef->h / 8 != (jpeg->h / coef->h_samp + 7) / 8) {
  -                        die("jpeg invalid coef h size");
  -                }
  -                if(coef->w / 8 != (jpeg->w / coef->w_samp + 7) / 8) {
  -                        die("jpeg invalid coef w size");
  -                }
                   if(SIZE_MAX / coef->h / coef->w / coef->h_samp / coef->w_samp < 6) {
                           die("jpeg is too big to fit in memory");
sandsmark commented 4 years ago

593 columns are not provided with dct coefficients. Attempting to process it will result in a segment fault. There is a workaround:

I thought they had coefficients because the data is stored in a 8x8 grid, the extra pixels are just not rendered. And neither ubsan, asan nor valgrind complained with my patch, so I assumed it was safe (aka. no segfaults). :-)

But I don't know anything about JPEG.

You can remove the w and h checks from the code. But this is fraught with consequences.

Well, it works (creates proper results and no crashes), I was just trying to understand what could break.