myrao / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

NaCL Neon support #415

Closed. GoogleCodeExporter closed this issue 9 years ago.

GoogleCodeExporter commented 9 years ago
Build a test app with nacl and run the validator.

Original issue reported on code.google.com by fbarch...@chromium.org on 19 Mar 2015 at 8:46

GoogleCodeExporter commented 9 years ago
For r1342, these are the validator errors:

set NACL_SDK_ROOT=d:/src/nacl_sdk/pepper_canary/
d:\src\nacl_sdk\pepper_canary\tools\make CONFIG=Release TESTING=1 V=1 NACL_CFLAGS="-I./include" NACL_CXXFLAGS="-I./include" X86_32_CXXFLAGS="-DGCC_HAS_AVX2" X86_64_CXXFLAGS="-DGCC_HAS_AVX2"
python d:/src/nacl_sdk/pepper_canary//tools/create_nmf.py -s ./newlib/Release -o newlib/Release/nacltest.nmf newlib/Release/nacltest_x86_32.nexe newlib/Release/nacltest_x86_64.nexe newlib/Release/nacltest_arm.nexe

d:\src\nacl_sdk\pepper_canary\tools\ncval.exe newlib/Release/nacltest_arm.nexe
   28ef4: Load/store base r12 is not properly masked.
   28ef8: Load/store base r4 is not properly masked.
   28efc: Load/store base r5 is not properly masked.
   28f00: Load/store base r5 is not properly masked.
   28f04: Load/store base r5 is not properly masked.
   28f08: Load/store base r6 is not properly masked.
   28ff4: Load/store base r12 is not properly masked.
   28ff8: Load/store base r4 is not properly masked.
   28ffc: Load/store base r5 is not properly masked.
   29000: Load/store base r5 is not properly masked.
   29004: Load/store base r5 is not properly masked.
   29008: Load/store base r6 is not properly masked.
   290e4: Load/store base r12 is not properly masked.
   290e8: Load/store base r4 is not properly masked.
   290ec: Load/store base r5 is not properly masked.
   290f0: Load/store base r5 is not properly masked.
   290f4: Load/store base r5 is not properly masked.
   290f8: Load/store base r6 is not properly masked.
   291e4: Load/store base r12 is not properly masked.
   291e8: Load/store base r4 is not properly masked.
   291ec: Load/store base r5 is not properly masked.
   291f0: Load/store base r5 is not properly masked.
   291f4: Load/store base r5 is not properly masked.
   291f8: Load/store base r6 is not properly masked.
   292d4: Load/store base r12 is not properly masked.
   292d8: Load/store base r4 is not properly masked.
   292dc: Load/store base r5 is not properly masked.
   292e0: Load/store base r5 is not properly masked.
   292e4: Load/store base r5 is not properly masked.
   292e8: Load/store base r6 is not properly masked.
   293c4: Load/store base r12 is not properly masked.
   293c8: Load/store base r4 is not properly masked.
   293cc: Load/store base r5 is not properly masked.
   293d0: Load/store base r5 is not properly masked.
   293d4: Load/store base r5 is not properly masked.
   293d8: Load/store base r6 is not properly masked.
   294b4: Load/store base r12 is not properly masked.
   294b8: Load/store base r4 is not properly masked.
   294bc: Load/store base r5 is not properly masked.
   294c0: Load/store base r5 is not properly masked.
   294c4: Load/store base r5 is not properly masked.
   294c8: Load/store base r6 is not properly masked.
   295a4: Load/store base r12 is not properly masked.
   295a8: Load/store base r4 is not properly masked.
   295ac: Load/store base r5 is not properly masked.
   295b0: Load/store base r5 is not properly masked.
   295b4: Load/store base r5 is not properly masked.
   295b8: Load/store base r6 is not properly masked.
   29694: Load/store base r12 is not properly masked.
   29698: Load/store base r4 is not properly masked.
   2969c: Load/store base r5 is not properly masked.
   296a0: Load/store base r5 is not properly masked.
   296a4: Load/store base r5 is not properly masked.
   296a8: Load/store base r6 is not properly masked.
   297a4: Load/store base r12 is not properly masked.
   297a8: Load/store base r4 is not properly masked.
   297ac: Load/store base r5 is not properly masked.
   297b0: Load/store base r5 is not properly masked.
   297b4: Load/store base r5 is not properly masked.
   297b8: Load/store base r6 is not properly masked.
   298c4: Load/store base r12 is not properly masked.
   298c8: Load/store base r4 is not properly masked.
   298cc: Load/store base r5 is not properly masked.
   298d0: Load/store base r5 is not properly masked.
   298d4: Load/store base r5 is not properly masked.
   298d8: Load/store base r6 is not properly masked.
   299cc: Load/store base r3 is not properly masked.
   299d0: Load/store base r12 is not properly masked.
   299d4: Load/store base lr is not properly masked.
   299d8: Load/store base lr is not properly masked.
   299dc: Load/store base lr is not properly masked.
   299e0: Load/store base r4 is not properly masked.
   29aec: Load/store base r12 is not properly masked.
   29af0: Load/store base lr is not properly masked.
   29af4: Load/store base r4 is not properly masked.
   29af8: Load/store base r4 is not properly masked.
   29afc: Load/store base r4 is not properly masked.
   29b00: Load/store base r5 is not properly masked.
   29bdc: Load/store base r12 is not properly masked.
   29be0: Load/store base lr is not properly masked.
   29be4: Load/store base r4 is not properly masked.
   29be8: Load/store base r4 is not properly masked.
   29bec: Load/store base r4 is not properly masked.
   29bf0: Load/store base r5 is not properly masked.
   29ccc: Load/store base r12 is not properly masked.
   29cd0: Load/store base lr is not properly masked.
   29cd4: Load/store base r4 is not properly masked.
   29cd8: Load/store base r4 is not properly masked.
   29cdc: Load/store base r4 is not properly masked.
   29ce0: Load/store base r5 is not properly masked.
   29ddc: Load/store base r12 is not properly masked.
   29de0: Load/store base lr is not properly masked.
   29de4: Load/store base r4 is not properly masked.
   29de8: Load/store base r4 is not properly masked.
   29dec: Load/store base r4 is not properly masked.
   29df0: Load/store base r5 is not properly masked.
   29eec: Load/store base r3 is not properly masked.
   29ef0: Load/store base r12 is not properly masked.
   29ef4: Load/store base lr is not properly masked.
   29ef8: Load/store base lr is not properly masked.
   29efc: Load/store base lr is not properly masked.
   29f00: Load/store base r4 is not properly masked.
   29fcc: Load/store base r3 is not properly masked.
   29fd0: Load/store base r12 is not properly masked.
   29fd4: Load/store base lr is not properly masked.
   29fd8: Load/store base lr is not properly masked.
   29fdc: Load/store base lr is not properly masked.
   29fe0: Load/store base r4 is not properly masked.
Invalid.

Original comment by fbarch...@chromium.org on 23 Mar 2015 at 10:19

GoogleCodeExporter commented 9 years ago
The new NEON functions don't have the NaCl macros that clear the upper 2 bits of the address.

Original comment by fbarch...@chromium.org on 6 Apr 2015 at 10:22
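
To make "clear upper 2 bits of address" concrete, here is a minimal sketch in portable C. It assumes the mask is 0xC0000000 (the top two bits of a 32-bit address), so a masked address always falls in the lower 1 GiB. The helper names `nacl_mask_address` and `nacl_is_masked` are hypothetical and are not part of libyuv or the NaCl SDK.

```c
#include <stdint.h>

/* Assumed mask: the top two bits of a 32-bit address. */
#define NACL_ADDRESS_MASK UINT32_C(0xC0000000)

/* Hypothetical helper: mirrors what an ARM "bic rN, rN, #0xc0000000"
   does to the base register rN before a load or store. */
uint32_t nacl_mask_address(uint32_t addr) {
  return addr & ~NACL_ADDRESS_MASK;
}

/* Hypothetical helper: returns 1 when the top two bits are already clear. */
int nacl_is_masked(uint32_t addr) {
  return (addr & NACL_ADDRESS_MASK) == 0;
}
```

An address that already lies in the lower 1 GiB passes through unchanged. The validator still expects the masking instruction to be present, because it checks the instruction stream rather than runtime values.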

GoogleCodeExporter commented 9 years ago
The 32-bit NEON code does not validate with NaCl's ncval. It needs macros for loads/stores.

Original comment by fbarch...@google.com on 5 May 2015 at 6:48

GoogleCodeExporter commented 9 years ago
In the macro definition

#define YUV422TORGB_SETUP_REG                                                  \
    "vld1.8     {d24}, [%[kUVToRB]]            \n"                             \
    "vld1.8     {d25}, [%[kUVToG]]             \n"                             \
    "vld1.16    {d26[], d27[]}, [%[kUVBiasBGR]]! \n"                           \
    "vld1.16    {d8[], d9[]}, [%[kUVBiasBGR]]!   \n"                           \
    "vld1.16    {d28[], d29[]}, [%[kUVBiasBGR]]  \n"                           \
    "vld1.32    {d30[], d31[]}, [%[kYToRgb]]     \n"

there are vld1 loads without those macros. Do they cause this issue?

Original comment by yang.zh...@arm.com on 6 May 2015 at 10:05

GoogleCodeExporter commented 9 years ago
Yes, the vld instructions need a bic instruction on their memory addresses, as the READ macros have, e.g.:

// Read 8 Y and 4 UV from NV12
#define READNV12                                                               \
    MEMACCESS(0)                                                               \
    "vld1.8     {d0}, [%0]!                    \n"                             \
    MEMACCESS(1)                                                               \
    "vld1.8     {d2}, [%1]!                    \n"                             \
    "vmov.u8    d3, d2                         \n"/* split odd/even uv apart */\
    "vuzp.u8    d2, d3                         \n"                             \
    "vtrn.u32   d2, d3                         \n"

Original comment by fbarch...@chromium.org on 12 May 2015 at 5:45
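
The READNV12 macro above puts MEMACCESS in front of each load. The sketch below shows how such a macro could expand and how the same pattern might apply to a named operand such as `%[kUVToRB]`. It rests on assumptions: `MEMACCESS_NACL` is a hypothetical stand-in, not the definition from libyuv's headers, and it leaves out any alignment directive a real NaCl build may need to keep the bic and the load in the same bundle. The functions only return the assembler text as C strings, so they compile on any platform.

```c
/* Assumed NaCl/ARM form: emit a bic that clears the top two bits of the
   base register right before the load or store that uses it. */
#define MEMACCESS_NACL(base) "bic %" #base ", #0xc0000000\n"

/* First load of READNV12, which uses the numbered operand %0. */
const char* readnv12_y_load_text(void) {
  return MEMACCESS_NACL(0)
         "vld1.8     {d0}, [%0]!\n";
}

/* First load of YUV422TORGB_SETUP_REG, which uses a named operand.
   Stringizing [kUVToRB] produces "%[kUVToRB]" in the bic line. */
const char* setup_reg_first_load_text(void) {
  return MEMACCESS_NACL([kUVToRB])
         "vld1.8     {d24}, [%[kUVToRB]]\n";
}
```

Printing either string with `puts` shows the text that would be handed to the assembler. A load with post-increment (`]!`) changes the base register, so under this scheme each subsequent load from the same register would need its own mask.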

GoogleCodeExporter commented 9 years ago
Fixed in r1406.

Original comment by fbarch...@chromium.org on 12 May 2015 at 9:44