DanBloomberg / leptonica

Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The official github repository for Leptonica is: danbloomberg/leptonica. See leptonica.org for more documentation.

JPEG decompression is not unique, which can cause colorcontent_reg to fail on the number of colors in wyom.jpg #719

Closed johnsonea closed 7 months ago

johnsonea commented 8 months ago

prog/colorcontent_reg fails at index 4: the number of colors in the image wyom.jpg is off by 5140 (it finds 137,305 colors but expects 132,165).

JPEG decompression algorithms are NOT unique: the result can depend on round-off in the computation of the inverse discrete cosine transform. For example, when opening the wyom.jpg file used in colorcontent_reg's test, Photoshop 2024 and MATLAB 2023b find RGB(121,162,228) for the top-left pixel, but ImageMagick, libjpeg, and leptonica's pixRead find RGB(122,161,228) for the same pixel. With the values from Photoshop and MATLAB, the number of colors is indeed 132,165, which is the value that colorcontent_reg expects in its test; with the values from ImageMagick, libjpeg, and leptonica's pixRead, the number of colors is 137,305.
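To illustrate why a one-level difference per channel can change a color count, here is a minimal sketch in plain C. It does not use leptonica; `pack_rgb` and `count_distinct_colors` are hypothetical helpers written for this example, and counting by sorting is just one simple way to do it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Pack 8-bit R, G, B samples into one 24-bit key (hypothetical helper). */
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* qsort comparator for unsigned 32-bit keys. */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Count distinct packed colors by sorting a copy of the input.
 * Returns 0 for empty input or if the copy cannot be allocated. */
size_t count_distinct_colors(const uint32_t *pixels, size_t n)
{
    assert(pixels != NULL || n == 0);
    if (n == 0)
        return 0;

    uint32_t *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return 0;
    memcpy(sorted, pixels, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_u32);

    size_t count = 1;
    for (size_t i = 1; i < n; i++) {
        if (sorted[i] != sorted[i - 1])
            count++;
    }
    free(sorted);
    return count;
}
```

As a made-up scenario: if two pixels both decode to RGB(121,162,228), they contribute one color; if a different decoder returns RGB(122,161,228) for one of them, they contribute two. Summed over a large photographic image, such shifts can plausibly move the total by thousands.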

==> Either do not use a JPEG image for this test, or modify the test to allow for small differences in JPEG decompression.

As a temporary fix (which may not work with all JPEG decompression algorithms) to get make check to succeed, I changed the test with the following patch:

--- prog/colorcontent_reg.c.orig    2023-10-31 23:51:21
+++ prog/colorcontent_reg.c 2023-10-31 23:58:07
@@ -82,6 +82,7 @@
         /* Do a simple color quantization with sigbits = 3 */
     pix1 = pixRead("wyom.jpg");
     pixNumColors(pix1, 1, &ncolors);  /* >255, so should give 0 */
+    if (ncolors == 137305) regTestCompareValues(rp, ncolors, 137305, 0.0); else /* EAJ */
     regTestCompareValues(rp, ncolors, 132165, 0.0);  /* 4 */
     pix2 = pixSimpleColorQuantize(pix1, 3, 3, 20);
     pixDisplayWithTitle(pix2, 1000, 0, NULL, rp->display);

DanBloomberg commented 8 months ago

Thank you for reporting this. I will fix it by allowing a large enough difference in the 4th argument of regTestCompareValues().
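The tolerance idea can be sketched in plain C. This is not leptonica code: `values_within_delta` is a hypothetical stand-in, and it assumes the 4th argument acts as an absolute bound on the difference between the two values. With the counts reported above, the bound would need to be at least 5140 for both decodings to pass; the value actually chosen for the fix is not stated here.

```c
#include <assert.h>

/* Hypothetical helper, not part of leptonica.
 * Returns 1 if val1 and val2 differ by at most delta, else 0. */
int values_within_delta(double val1, double val2, double delta)
{
    assert(delta >= 0.0);
    double diff = (val1 > val2) ? (val1 - val2) : (val2 - val1);
    return diff <= delta;
}
```

For example, `values_within_delta(137305, 132165, 0.0)` fails, matching the reported error, while a delta of 5140 or more accepts both counts. A bound that loose also weakens the test, which is the trade-off against replacing the JPEG with a losslessly coded image.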