watery01 / libyuv

Automatically exported from code.google.com/p/libyuv
0 stars 0 forks source link

fix ARGBToRGB565 for big-endian #171

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
ARGBToRGB565Row_C() is missing implementation for big endian, see the attached 
patches for a fix - first patch moves READWORD and WRITEWORD macros/functions 
to a commonly used header (basic_types.h) and the second one uses WRITEWORD in 
ARGBToRGB565Row_C()

The test suite now passes on both little-endian (x86_64) and big-endian (s390x) 
machines.

Original issue reported on code.google.com by d...@danny.cz on 30 Dec 2012 at 11:26

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the patch and testing!
Patch itself looks okay, except I prefer keep WRITEWORD etc internal.
This is the review
https://webrtc-codereview.appspot.com/1022006/

The unittests will be weak on s390x, as they mostly test C vs assembly match.
If the C code has endian bugs, it won't catch them.
Hmm... Perhaps I should make the unittests do a checksum on the C code results, 
ensuring bit exactness on all platforms for the C version.

I don't have any big endian machines to test on.  As is, I interpreted the 
fourcc's as meaning a well defined memory layout.  Your change is consistent 
with that.  Does s390x have v4l or some other way to confirm the fourcc's match 
the code?

Original comment by fbarch...@chromium.org on 5 Jan 2013 at 8:37

GoogleCodeExporter commented 9 years ago
Fixed in r524

Original comment by fbarch...@chromium.org on 7 Jan 2013 at 6:52

GoogleCodeExporter commented 9 years ago
s390x doesn't have any graphical capabilities, ppc (Apple G4, G5) could be used 
instead, if that's what you meant with the question about v4l

Original comment by d...@danny.cz on 7 Jan 2013 at 8:57

GoogleCodeExporter commented 9 years ago
I'll refine this CL a bit
1. move to private header
https://webrtc-codereview.appspot.com/1025004
2. move code that uses it to row_common.cc
3. change name to be more specific - LIBYUV and/or litten endian 

In future, PPC version should use ldbrx and stdbrx for these.

Original comment by fbarch...@chromium.org on 7 Jan 2013 at 9:24

GoogleCodeExporter commented 9 years ago
Even better could be to use bswap_32() from <byteswap.h>, glibc will do the 
right thing itself.

Original comment by d...@danny.cz on 7 Jan 2013 at 12:32

GoogleCodeExporter commented 9 years ago
PowerPC has loads and stores that do 'byte reverse' (little endian), but not a 
byte swap like x86.  I'm not aware of a common intrinsic for that.
So I prefer a load/store paradigm that can be implemented with byte reverse 
load/store or swap.
On mips, the unaligned memory access fault can be worked around with an 
attribute packed on fields of a struct - gcc will generate the masking.  But 
I'm not aware of an attribute for byte ordering.

I wasn't sure if forcing an endian is the correct thing to do.  But looking at 
this page, it does seem so
http://www.linuxtv.org/pipermail/linux-dvb/2007-June/018766.html
For 565 16 bit, I've implemented RGBP, which is 'le'.  But there is also RGBR 
for 'be':
FORMAT: index=3, type=1, flags=0, description='16 bpp RGB, le'
        fourcc=RGBP
FORMAT: index=4, type=1, flags=0, description='16 bpp RGB, be'
        fourcc=RGBR
I've mostly used fourcc's from fourcc.org.

In r525 I've moved the macros to row.h, which is more internal, but I'll move 
the row functions that use it to row_common.cc

Original comment by fbarch...@google.com on 7 Jan 2013 at 8:54

GoogleCodeExporter commented 9 years ago
This is what I had in mind
http://hardwarebug.org/2008/10/25/gcc-inline-asm-annoyance/

static inline uint32_t load_le32(const uint32_t *p)
{
    uint32_t v;
    asm ("lwbrx %0, %y1" : "=r"(v) : "Z"(*p));
    return v;
}

Will that work?

Original comment by fbarch...@google.com on 7 Jan 2013 at 9:58

GoogleCodeExporter commented 9 years ago
r528 moves READWORD/WRITEWORD to row_common.cc
avoids potential clash with external applications.

Original comment by fbarch...@chromium.org on 9 Jan 2013 at 7:05

GoogleCodeExporter commented 9 years ago
An example function for S390X

static inline __u32 ___arch__swab32p(const __u32 *x)
{
        __u32 result;

        asm volatile(
#ifndef __s390x__
                "       icm     %0,8,3(%1)\n"
                "       icm     %0,4,2(%1)\n"
                "       icm     %0,2,1(%1)\n"
                "       ic      %0,0(%1)"
                : "=&d" (result) : "a" (x), "m" (*x) : "cc");
#else /* __s390x__ */
                "       lrv     %0,%1"
                : "=d" (result) : "m" (*x));
#endif /* __s390x__ */
        return result;
}

Original comment by fbarch...@google.com on 11 Jan 2013 at 2:58

GoogleCodeExporter commented 9 years ago
r536 removed READWORD.
marking as fixed.

Original comment by fbarch...@google.com on 12 Jan 2013 at 12:03