pedrocr opened this issue 9 years ago
RawImageDataU16::scaleValues produces apparently different output between the SSE and non-SSE code paths.
You can see 16-bit TIFFs showing the differences here:
The algorithm for generating the random values isn't the same. For non-dithered output, the rounding is also slightly different, so it will not match completely.
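For context, here is a minimal sketch of what the scaling step does, assuming the usual mapping from [black, white] to the full 16-bit range; the function names and layout are illustrative, not rawspeed's actual code. It shows how a truncating versus a round-to-nearest division alone can make two otherwise equivalent implementations disagree by one code value:

#include <cstdint>

// Non-dithered scaling: map [black, white] to [0, 65535].
// Whether the division truncates or rounds to nearest is exactly the
// kind of detail that can differ between two implementations.
uint16_t scale_truncate(uint16_t in, int black, int white) {
  int64_t v = (int64_t)(in - black) * 65535 / (white - black);  // truncates
  return (uint16_t)(v < 0 ? 0 : (v > 65535 ? 65535 : v));
}

uint16_t scale_round(uint16_t in, int black, int white) {
  int64_t range = white - black;
  int64_t v = ((int64_t)(in - black) * 65535 + range / 2) / range;  // nearest
  return (uint16_t)(v < 0 ? 0 : (v > 65535 ? 65535 : v));
}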
The comparison has dithering disabled and the differences look quite substantial.
In fact the non-dithered non-SSE code seems to match the new darktable code (both SSE and non-SSE) just fine, even though that one is implemented differently. So I suspect there is indeed a bug in the SSE scaling code.
ok, I will take a look. Thanks for reporting!
I just want to make sure that when you say "quite substantial", is there a bias, or is it just "different"?
Aren't the differences just bigger because of the response (base) curve and gamma scaling?
Oh yeah. WB scaling is also applied, which is why red differences are so dominant.
I've disabled all of that when generating these differences. There is no base curve and no WB. The only modules applied are the input profile (I used a linear one) and the output profile (sRGB, though I probably should have used something linear there too). So the output should be a pretty linear 16-bit TIFF, and yet it has plenty of pixels with differences above 500. That doesn't sound right.
I did some more testing and our code also produces similar differences to your code. So maybe this is just a normal level of difference. It sounds quite high to me though.
Using a linear output profile as well, the differences between our code and the rawspeed non-SSE code are smaller and look mostly like a random splatter of red and blue pixels. I'm seeing 2607 pixels with at least one component above 500, but that's only 0.03% of the pixels, so maybe that's OK.
The difference between the rawspeed SSE code and our new code is much larger though. First, the output shows a pattern that matches the image itself, and second, there are a lot more pixels with large differences (13275 with differences above 500, or 0.17%). But maybe that's still fine.
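For reference, the counting I'm doing is essentially the following; a rough sketch with assumed names and a packed 16-bit RGB buffer layout, not the actual harness code:

#include <cstdint>
#include <cstdlib>

// Count pixels where any channel of two 16-bit RGB buffers differs by
// more than `threshold`.
int count_big_diffs(const uint16_t* a, const uint16_t* b,
                    int width, int height, int threshold) {
  int count = 0;
  for (int i = 0; i < width * height; i++) {
    const uint16_t* pa = a + i * 3;
    const uint16_t* pb = b + i * 3;
    for (int c = 0; c < 3; c++) {
      if (abs((int)pa[c] - (int)pb[c]) > threshold) {
        count++;  // at least one component over the threshold
        break;
      }
    }
  }
  return count;
}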
Did some stats:
// Scale both images, then compare the two code paths pixel by pixel.
r->scaleBlackWhite();
r_ref->scaleBlackWhite();

double delta = 0;   // running sum of differences, for the average
int delta_max = 0;  // most positive difference seen
int delta_min = 0;  // most negative difference seen
for (int y = 0; y < r->dim.y; y++) {
  for (int x = 0; x < r->dim.x; x++) {
    int a = (int)*(ushort16*)r->getData(x, y);
    int b = (int)*(ushort16*)r_ref->getData(x, y);
    int d = a - b;
    delta += (double)d;
    if (d > delta_max)
      delta_max = d;
    if (d < delta_min)
      delta_min = d;
  }
}
double avg = delta / double(r->dim.x * r->dim.y);
There seems to be some adjustment needed.
SSE has the correct range, but a negative bias of half the scale factor (so for a scale factor of 16, the bias is -8). This should be adjusted. The range of the difference is -16 to 0; it should be -8 to +8. This lowers the black point by 0.5, so it is nothing more than a theoretical difference.
Non-SSE has a dither range of -4 to +4 at a scale factor of 16, which is too low by a factor of two, so the dithering is a little too weak.
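In other words, assuming the dither is meant to span one full scale step centered on zero, the correct version looks something like this; a sketch with made-up names, not the actual rawspeed code:

#include <cstdint>
#include <random>

// Correctly centered dither for an integer scale factor: the noise spans
// [-scale/2, +scale/2], so it has zero mean and does not shift the black
// point. For scale = 16 that is [-8, +8]. The SSE path above effectively
// used [-16, 0] (mean -8, a black-point bias), and the non-SSE path used
// [-4, +4] (half the intended amplitude).
uint16_t scale_dithered(uint16_t in, int black, int scale, std::mt19937& rng) {
  std::uniform_int_distribution<int> dither(-scale / 2, scale / 2);
  int v = (in - black) * scale + dither(rng);
  if (v < 0) v = 0;
  if (v > 65535) v = 65535;
  return (uint16_t)v;
}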
I found no outliers, but I will cross-test.
Note that I'm seeing these differences without dithering.
ok, without dithering and with the bias corrected, there is an average difference of 0.01 out of 65535 between SSE and non-SSE, with all values between -2 and +1 when scaling with a scale factor of 16.
I will check things through again, and send in a new version that corrects the bias, which of course shouldn't be there, but doesn't have any influence on the final image.
Thanks for looking into this. Are you saying that what's being fixed would never be noticeable in the final image? I'm struggling to define what is a reasonable level of difference between two images for regression testing. Finding pixels that are (0,120,180) vs (59,120,180) would seem like a hard thing to ignore.
Well, you cannot base it on an absolute value. A single digit difference could mean that the demosaic takes a different approach and therefore interpolates missing values in a different way. A denoise filter could take a different approach in a section.
The thing is that dithering actually helps: values that are measured as equal don't always represent the same real value. They are close, but with dithering they no longer appear to the code as exactly the same, which we don't know they are in reality.
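A tiny worked example of that point, reusing the hypothetical scale_dithered() sketch from above: two pixels the sensor quantized to the same raw value come out of the scaling step slightly different, which better reflects that the underlying light levels were probably not identical.

#include <cstdint>
#include <cstdio>
#include <random>

// Declared in the hypothetical sketch from the earlier comment.
uint16_t scale_dithered(uint16_t in, int black, int scale, std::mt19937& rng);

int main() {
  std::mt19937 rng(42);
  // Two pixels measured as the same raw value...
  uint16_t p1 = scale_dithered(100, 0, 16, rng);
  uint16_t p2 = scale_dithered(100, 0, 16, rng);
  // ...typically scale to nearby but unequal 16-bit values,
  // e.g. 1600 +/- 8, instead of both being exactly 1600.
  printf("%u %u\n", p1, p2);
}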
But I will go through everything and re-check it. Probably not today, unfortunately.
There is no denoise in these tests, but I have also wondered if the demosaic algorithm is doing something funny. The completely linear tests without WB give differences like (19,77,50) vs (19,70,43), which are not ideal but seem somewhat reasonable. It's only when re-enabling WB/matrix/curves that the results show very large differences.
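That amplification is roughly what you'd expect: the sRGB transfer curve is very steep near black, so a small linear difference becomes a large encoded one, and WB multipliers scale the linear gap further before the curve is applied. A rough illustration using the standard sRGB encode, not darktable's actual pipeline:

#include <cmath>
#include <cstdio>

// Standard sRGB encoding for a linear value in [0, 1].
double srgb_encode(double lin) {
  return lin <= 0.0031308 ? 12.92 * lin
                          : 1.055 * pow(lin, 1.0 / 2.4) - 0.055;
}

int main() {
  // A raw-scaling discrepancy of ~500/65535 on one channel, near black.
  double a = 0.0, b = 500.0 / 65535.0;
  printf("%.0f vs %.0f (8-bit sRGB)\n",
         255.0 * srgb_encode(a), 255.0 * srgb_encode(b));
  // Prints roughly: 0 vs 21 (8-bit sRGB).
}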
This is more annoying for regression testing than it seems to be for visual impact. It's very hard to tell the images apart, even with these large differences, even when I know what to look for and view them at more than 100% magnification...
Did you get any chance to look at this again? I've cooked up a darktable regression testing setup and it's showing differences for most files, and I think that at least in some cases it's because of this:
While tracking down a difference between rawspeed's scaled output and darktable's own scaling, I found that RawImageDataU16::scaleValues apparently produces different output between the SSE and non-SSE code paths. I need to debug this more, but this seems to happen even without dithering enabled, so it's not just a matter of different random values.