AcademySoftwareFoundation / openexr

The OpenEXR project provides the specification and reference implementation of the EXR file format, the professional-grade image storage format of the motion picture industry.
http://www.openexr.com/
BSD 3-Clause "New" or "Revised" License
1.63k stars 619 forks source link

DwaCompressor endianess issue #222

Open wljjn opened 7 years ago

wljjn commented 7 years ago

I got a test failure on sparc in testRgba.

writing reading channels RGBA, line order 0, compression 8 writing reading Assertion failed: relError < .1, file /builds/jinji/openexr-upgrade/components/openexr/openexr-2.2.0/IlmImfTest/compareDwa.cpp, line 122 /bin/bash: line 5: 303067 Abort (core dumped) ${dir}$tst FAIL: IlmImfTest 1 of 1 test failed make[3]: [Makefile:552: check-TESTS] Error 1 make[3]: Leaving directory '/builds/jinji/openexr-upgrade/components/openexr/build/sparcv7/IlmImfTest' make[2]: [Makefile:675: check-am] Error 2 make[2]: Leaving directory '/builds/jinji/openexr-upgrade/components/openexr/build/sparcv7/IlmImfTest' make[1]: *** [Makefile:382: check-recursive] Error 1 make[1]: Leaving directory '/builds/jinji/openexr-upgrade/components/openexr/build/sparcv7'

After checking output step by step, I think it is endianess issue in DwaCompressor, although there seems to be some codes in ImfDwaCompressor.cpp dealing with endianess. I tried to pass this test by tweaking the endianess around. I wonder has it been tested in a big endian machine? Or do I miss anything?


--- IlmImf/ImfDwaCompressor.cpp 2016-12-22 15:20:40.220763763 +0000
+++ IlmImf/ImfDwaCompressor.cpp 2017-01-10 09:47:16.415207575 +0000
@@ -824,15 +824,15 @@

                 if (!_isNativeXdr)
                 {
-                    for (int i = 0; i < 64; ++i)
-                    {
-                        tmpShortXdr      = halfZigBlock[comp]._buffer[i];
+                    //for (int i = 0; i < 64; ++i)
+                    //{
+                        tmpShortXdr      = halfZigBlock[comp]._buffer[0];
                         tmpConstCharPtr  = (const char *)&tmpShortXdr;

                         Xdr::read<CharPtrIO> (tmpConstCharPtr, tmpShortNative);

-                        halfZigBlock[comp]._buffer[i] = tmpShortNative;
-                    }
+                        halfZigBlock[comp]._buffer[0] = tmpShortNative;
+                    //}
                 }

                 if (lastNonZero == 0)
@@ -1131,6 +1131,20 @@
         } // comp
     } // blocky

+    if (!_isNativeXdr) {
+        for (unsigned int chan = 0; chan < numComp; ++chan)
+        {
+            for (int y=0; y<_height; ++y)
+            {
+                for (int x=0; x<_width; ++x)
+                {
+                    char c = _rowPtrs[chan][y][2*x];
+                    _rowPtrs[chan][y][2*x] =_rowPtrs[chan][y][2*x+1];
+                    _rowPtrs[chan][y][2*x+1] = c;
+                }
+            }
+        }
+    }
     //
     // Walk over all the channels that are of type FLOAT.
     // Convert from HALF XDR back to FLOAT XDR.
@@ -1470,9 +1484,12 @@
                             vy = _height - (vy - (_height - 1));

                         if (vy < 0) vy = _height-1;
-
+
                         tmpShortXdr =
                             ((const unsigned short *)(_rowPtrs[chan])[vy])[vx];
+                       if (!GLOBAL_SYSTEM_LITTLE_ENDIAN) {
+                           tmpShortXdr = ((tmpShortXdr << 8) & 0xff00) | ((tmpShortXdr >> 8) & 0xff);
+                       }

                         if (_toNonlinear)
                         {
@@ -1546,6 +1563,9 @@
                 {
                     tmpCharPtr = (char *)&tmpShortXdr;
                     Xdr::write<CharPtrIO>(tmpCharPtr, halfZigCoef[i].bits());
+                   if (!GLOBAL_SYSTEM_LITTLE_ENDIAN) {
+                        tmpShortXdr = ((tmpShortXdr << 8) & 0xff00) | ((tmpShortXdr >> 8) & 0xff);
+                    }
                     halfZigCoef[i].setBits(tmpShortXdr);
                 }

@@ -1553,9 +1573,12 @@
                 // Save the DC component separately, to be compressed on
                 // its own.
                 //
-
-                *currDcComp[chan]++ = halfZigCoef[0].bits();
-                _numDcComp++;
+               if (!GLOBAL_SYSTEM_LITTLE_ENDIAN) {
+                    *currDcComp[chan]++ = (halfZigCoef[0].bits() << 8 & 0xff00) | (halfZigCoef[0].bits() >> 8 & 0xff);
+                } else {
+                   *currDcComp[chan]++ = halfZigCoef[0].bits();
+               }
+               _numDcComp++;

                 //
                 // Then RLE the AC components (which will record the count
glaubitz commented 6 years ago

We have the same test failure in Debian, but it does not affect the other big-endian architectures.

I'm more to inclined to believe that the bug is an alignment issue:

libtool: link: g++ -pipe -g -O2 -fdebug-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wl,-z -Wl,relro -o .libs/IlmImfTest main.o testAttributes.o testChannels.o testCompression.o testCopyPixels.o testCustomAttributes.o testHuf.o testLineOrder.o testMalformedImages.o testLut.o testRgba.o testRgbaThreading.o testSampleImages.o testSharedFrameBuffer.o testWav.o testXdr.o testConversion.o testNativeFormat.o testPreviewImage.o testMagic.o testStandardAttributes.o testExistingStreams.o testScanLineApi.o testTiledCompression.o testTiledCopyPixels.o testTiledLineOrder.o testTiledRgba.o compareFloat.o testTiledYa.o testYca.o compareB44.o testMultiView.o testIsComplete.o testMultiPartApi.o testMultiPartThreading.o testMultiScanlinePartThreading.o testMultiTiledPartThreading.o testDeepScanLineBasic.o testDeepTiledBasic.o testMultiPartFileMixingBasic.o testMultiPartSharedAttributes.o testBackwardCompatibility.o testCopyDeepScanLine.o testCopyDeepTiled.o testCopyMultiPartFile.o testCompositeDeepScanLine.o testInputPart.o testDeepScanLineMultipleRead.o testPartHelper.o testOptimized.o testBadTypeAttributes.o testFutureProofing.o compareDwa.o testDwaCompressorSimd.o testRle.o -pthread  -L../IlmImf -L/usr/lib -lImath -lHalf -lIex -lIlmThread /<<PKGBUILDDIR>>/IlmImf/.libs/libIlmImf.so -lz -lpthread -pthread
make[4]: Leaving directory '/<<PKGBUILDDIR>>/IlmImfTest'
make  check-TESTS
make[4]: Entering directory '/<<PKGBUILDDIR>>/IlmImfTest'
make[5]: Entering directory '/<<PKGBUILDDIR>>/IlmImfTest'
../test-driver: line 107: 63055 Bus error               "$@" > $log_file 2>&1
FAIL: IlmImfTest
glaubitz commented 6 years ago

@wljjn Do you have any suggestions on how to run the test in question individually so it can be debugged with gdb? So far, I have been unable to achieve that.

glaubitz commented 6 years ago

@fkainz @pstanczyk Can you give any hints on how to run selected tests to be able to debug this problem?

peterhillman commented 6 years ago

You can run IlmImfTest/IlmImfTest directly (rather than via a make build target) using the test name as an argument, to run a single test

e.g.

IlmImfTest testDwaCompressorSimd

You can also run a subset of tests by using an argument such as core, basic, deep, multi. Check the source of "main.cpp" in IlmImfTest to see the test names.

If you build via the ./configure script you may need to use libtool to run the debugger with IlmImfTest. For debuggers with a GUI you may have more luck building so libtool isn't required - the internet will tell you how in either case.

DerDakon commented 5 years ago

I hit the same issue on my Sparc T5120 running Gentoo Linux with version 2.80, see Gentoo bug 656680.

DerDakon commented 4 years ago

Still happens with 2.5.2 on sparc:

writing reading channels A, line order 0, compression 7
writing reading channels RB, line order 0, compression 7
writing reading channels RGBA, line order 0, compression 8
writing reading IlmImfTest: /var/tmp/portage/media-libs/openexr-2.5.2/work/openexr-2.5.2/OpenEXR/IlmImfTest/compareDwa.cpp:124: void compareDwa(int, int, const Imf_2_5::Array2D<Imf_2_5::Rgba>&, const Imf_2_5::Array2D<Imf_2_5::Rgba>&, Imf_2_5::RgbaChannels): Assertion `relError < .1' failed.
DerDakon commented 3 years ago

Same error with 2.5.4

waebbl commented 3 years ago

On Gentoo, we are currently hitting the same error on ppc64, another big endian machine, with 2.5.7. See https://bugs.gentoo.org/818424