Closed GoogleCodeExporter closed 9 years ago
This appears to be how to compile a 64-bit build for iOS:
GYP_DEFINES="OS=ios target_arch=armv7 target_subarch=64" GYP_CROSSCOMPILE=1 \
GYP_GENERATOR_FLAGS="output_dir=out_ios" ./build/gyp_chromium -f ninja \
--depth=. libyuv_test.gyp
ninja -j7 -C out_ios/Debug-iphoneos
The first 2 issues: q0 is reserved, so it is rejected as a register name in the clobber list
../../source/row_neon.cc:136:23: error: unknown register name 'q0' in asm
: "cc", "memory", "q0", "q1", "q2", "q3",
and the general-purpose registers are renamed, so r9 is no longer a valid name
../../source/rotate_neon.cc:183:23: error: unknown register name 'r9' in asm
: "memory", "cc", "r9", "q0", "q1", "q2", "q3"
^
../../source/rotate_neon.cc:396:23: error: unknown register name 'r9' in asm
: "memory", "cc", "r9",
Original comment by fbarch...@chromium.org
on 28 Mar 2014 at 10:52
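Both errors above come from register names in the clobber list. One possible way to keep a single function building on both targets is to choose the clobber name per architecture. The sketch below only illustrates that idea and is not the change made in libyuv: `Copy16` is a hypothetical function, and the `memcpy` branch exists only so the example also builds on non-ARM hosts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a libyuv function): copies exactly 16 bytes
// through one NEON register. The clobber list names that register "v0" on
// arm64 and "q0" on 32-bit ARM.
static inline void Copy16(const uint8_t* src, uint8_t* dst) {
  assert(src != nullptr && dst != nullptr);
#if defined(__aarch64__)
  asm volatile(
      "ld1 {v0.16b}, [%0] \n"  // load 16 bytes into v0
      "st1 {v0.16b}, [%1] \n"  // store them to dst
      :
      : "r"(src), "r"(dst)
      : "memory", "v0");
#elif defined(__arm__) && defined(__ARM_NEON__)
  asm volatile(
      "vld1.8 {q0}, [%0] \n"   // load 16 bytes into q0
      "vst1.8 {q0}, [%1] \n"   // store them to dst
      :
      : "r"(src), "r"(dst)
      : "memory", "q0");
#else
  // Portable fallback, assumed here only for non-ARM builds.
  std::memcpy(dst, src, 16);
#endif
}
```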
Fixed in r994.
Follow-up work is needed for the NEON version, but users are able to do the build now.
Original comment by fbarch...@google.com
on 1 Apr 2014 at 5:48
Updating to the new toolchains has introduced some build bot issues:
http://build.chromium.org/p/tryserver.libyuv/builders/linux_asan/builds/441/steps/compile/logs/stdio
http://build.chromium.org/p/tryserver.libyuv/builders/ios_rel/builds/179/steps/compile/logs/stdio
http://build.chromium.org/p/tryserver.libyuv/builders/mac/builds/440/steps/compile/logs/stdio
Original comment by fbarch...@chromium.org
on 2 Apr 2014 at 7:41
r998 replaces the hard-coded r9 register with a %0 parameter, which will map to x9 for arm64.
Original comment by fbarch...@chromium.org
on 3 Apr 2014 at 6:48
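As a rough illustration of that kind of change (not the actual r998 diff), a scratch register can be declared as an early-clobber output operand instead of being named in the clobber list, so the compiler picks a register that is valid for the target. `AddTwice` is a hypothetical function, and the plain C++ branch is assumed only so the sketch builds on non-ARM hosts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: computes a + 2 * b using one scratch register.
// The scratch register is operand %0, not a hard-coded name such as r9.
static inline uint32_t AddTwice(uint32_t a, uint32_t b) {
  uint32_t scratch;
  uint32_t result;
#if defined(__aarch64__)
  asm volatile(
      "add %w0, %w2, %w3 \n"  // scratch = a + b
      "add %w1, %w0, %w3 \n"  // result = scratch + b
      : "=&r"(scratch), "=&r"(result)  // "&": written before inputs are done
      : "r"(a), "r"(b));
#elif defined(__arm__)
  asm volatile(
      "add %0, %2, %3 \n"     // scratch = a + b
      "add %1, %0, %3 \n"     // result = scratch + b
      : "=&r"(scratch), "=&r"(result)
      : "r"(a), "r"(b));
#else
  // Portable fallback, assumed here only for non-ARM builds.
  scratch = a + b;
  result = scratch + b;
#endif
  // Debug-only check that the asm matches the C++ expression.
  assert(result == a + 2 * b);
  return result;
}
```

The same source then compiles for 32-bit ARM and arm64, because no general-purpose register name appears in the clobber list.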
r1000 fixes 64-bit clang builds.
Original comment by fbarch...@chromium.org
on 14 Apr 2014 at 4:40
Original comment by fbarch...@chromium.org
on 23 May 2014 at 10:06
Original comment by fbarch...@chromium.org
on 11 Jun 2014 at 9:06
Partially fixed: GPR pointers/registers are fixed.
NEON registers do not overlap like they used to (on ARMv7 each q register
aliases a pair of d registers; on arm64 it does not). This issue will affect
some functions, but most will be affected only in the registers declared, not
the code.
Original comment by fbarch...@chromium.org
on 27 Jun 2014 at 1:01
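To make the overlap difference concrete, here is a small sketch of the register mapping. `WideRegisterFor` is a hypothetical helper written for this illustration, not part of libyuv. It encodes two architectural facts: on ARMv7, q<k> is the pair d<2k>, d<2k+1>; on arm64, d<n> is just the low half of v<n>.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the name of the 128-bit register that
// contains the 64-bit register d<d_index>.
//   ARMv7: q<k> aliases the pair d<2k>, d<2k+1>, so d<n> lives in q<n/2>.
//   arm64: d<n> is the low half of v<n>; there is no pairing.
static std::string WideRegisterFor(int d_index, bool is_arm64) {
  assert(d_index >= 0 && d_index < 32);
  if (is_arm64) {
    return "v" + std::to_string(d_index);
  }
  return "q" + std::to_string(d_index / 2);
}
```

For example, ARMv7 code that fills d2 and d3 and then reads q1 relies on the pairing; on arm64 those two values sit in v2 and v3, so both the instructions and the declared clobbers may need to change.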
Hi there,
I'm looking at enabling ARMv8 NEON optimization in libyuv. Is there something
I can do to help? For example, I could convert the ARMv7 NEON-optimized code
to ARMv8 NEON for the functions in the following files:
compare_neon.cc, rotate_neon.cc, row_neon.cc, scale_neon.cc
Original comment by zhongwei...@arm.com
on 28 Jul 2014 at 3:51
The first-pass ARMv8 conversion is about 40% complete.
Source files are *_neon64.cc
An overall second pass should be done to bump all registers to 16 bytes instead
of 8 bytes, which was an ARMv7 restriction.
Original comment by fbarch...@google.com
on 23 Aug 2014 at 1:26
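A minimal sketch of what "16 bytes per register" means for a row loop is shown below. It uses NEON intrinsics rather than libyuv's inline assembly, `HalveRow16` is a hypothetical function, and the choice to keep the odd-indexed pixels is an assumption made for the example. The scalar branch is there only so the sketch builds without NEON.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical row function: keeps every second pixel of src (odd indices),
// producing 16 output pixels per loop iteration.
// dst_width must be a multiple of 16; src must hold 2 * dst_width bytes.
static void HalveRow16(const uint8_t* src, uint8_t* dst, int dst_width) {
  assert(dst_width % 16 == 0);
  for (int x = 0; x < dst_width; x += 16) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Load 32 source pixels, de-interleaved into even and odd lanes,
    // using two full 16-byte registers.
    uint8x16x2_t pair = vld2q_u8(src + 2 * x);
    vst1q_u8(dst + x, pair.val[1]);  // store the 16 odd-indexed pixels
#else
    // Scalar fallback with the same result.
    for (int i = 0; i < 16; ++i) {
      dst[x + i] = src[2 * (x + i) + 1];
    }
#endif
  }
}
```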
A metric for completeness is the number of _NEON functions in the 64-bit build vs. the 32-bit build.
For 32 bit:
otool -tV libyuv_neon.*_neon.o | grep NEON: | wc -l
105
For 64 bit:
otool -tV libyuv_neon.*_neon64.arm64.o | grep NEON: | wc -l
105
Looks like the initial port is complete.
Follow-up needed:
1. Test that it actually works.
2. Check that performance is on par.
3. Optimize for 64-bit: can do 16 pixels at a time instead of 8.
4. Port more functions to NEON. All functions that are optimized for Intel
should have a NEON equivalent.
On Intel, scale has 22 optimized functions; NEON has 15.
Original comment by fbarch...@google.com
on 14 Oct 2014 at 1:11
Scale for Intel: 22 functions
objdump -D libyuv.scale_posix.o | grep text.*SSE.*:
ScaleRowDown2_SSE2:
ScaleRowDown2Linear_SSE2:
ScaleRowDown2Box_SSE2:
ScaleRowDown4_SSE2:
ScaleRowDown4Box_SSE2:
ScaleRowDown34_SSSE3:
ScaleRowDown34_1_Box_SSSE3:
ScaleRowDown34_0_Box_SSSE3:
ScaleRowDown38_SSSE3:
ScaleRowDown38_2_Box_SSSE3:
ScaleRowDown38_3_Box_SSSE3:
ScaleAddRows_SSE2:
ScaleFilterCols_SSSE3:
ScaleColsUp2_SSE2:
ScaleARGBRowDown2_SSE2:
ScaleARGBRowDown2Linear_SSE2:
ScaleARGBRowDown2Box_SSE2:
ScaleARGBRowDownEven_SSE2:
ScaleARGBRowDownEvenBox_SSE2:
ScaleARGBCols_SSE2:
ScaleARGBColsUp2_SSE2:
ScaleARGBFilterCols_SSSE3:
On Arm: 15 functions
otool -tV libyuv_neon.scale_neon64.arm64.o | grep NEON:
_ScaleRowDown2_NEON:
_ScaleRowDown2Box_NEON:
_ScaleRowDown4_NEON:
_ScaleRowDown4Box_NEON:
_ScaleRowDown34_NEON:
_ScaleRowDown34_0_Box_NEON:
_ScaleRowDown34_1_Box_NEON:
_ScaleRowDown38_NEON:
_ScaleRowDown38_3_Box_NEON:
_ScaleRowDown38_2_Box_NEON:
_ScaleFilterRows_NEON:
_ScaleARGBRowDown2_NEON:
_ScaleARGBRowDown2Box_NEON:
_ScaleARGBRowDownEven_NEON:
_ScaleARGBRowDownEvenBox_NEON:
Original comment by fbarch...@google.com
on 14 Oct 2014 at 1:15
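The two listings above can be compared by name to see which Intel functions still lack a NEON counterpart. The sketch below is one way to do that; `BaseName`, `MissingFrom`, `IntelScaleLabels` and `NeonScaleLabels` are hypothetical helpers, and matching is purely by function name with the instruction-set suffix stripped.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Strips a leading '_' (Mach-O symbol prefix), a trailing ':' and the final
// "_SUFFIX" (for example _SSE2, _SSSE3 or _NEON) from a disassembly label.
static std::string BaseName(std::string label) {
  if (!label.empty() && label.front() == '_') label.erase(0, 1);
  if (!label.empty() && label.back() == ':') label.pop_back();
  const std::string::size_type pos = label.rfind('_');
  assert(pos != std::string::npos);
  return label.substr(0, pos);
}

// Returns the base names present in `want` but absent from `have`, sorted.
static std::vector<std::string> MissingFrom(
    const std::vector<std::string>& have,
    const std::vector<std::string>& want) {
  std::set<std::string> have_set, want_set;
  for (const auto& s : have) have_set.insert(BaseName(s));
  for (const auto& s : want) want_set.insert(BaseName(s));
  std::vector<std::string> missing;
  std::set_difference(want_set.begin(), want_set.end(), have_set.begin(),
                      have_set.end(), std::back_inserter(missing));
  return missing;
}

// Labels copied from the objdump output above.
static std::vector<std::string> IntelScaleLabels() {
  return {"ScaleRowDown2_SSE2:", "ScaleRowDown2Linear_SSE2:",
          "ScaleRowDown2Box_SSE2:", "ScaleRowDown4_SSE2:",
          "ScaleRowDown4Box_SSE2:", "ScaleRowDown34_SSSE3:",
          "ScaleRowDown34_1_Box_SSSE3:", "ScaleRowDown34_0_Box_SSSE3:",
          "ScaleRowDown38_SSSE3:", "ScaleRowDown38_2_Box_SSSE3:",
          "ScaleRowDown38_3_Box_SSSE3:", "ScaleAddRows_SSE2:",
          "ScaleFilterCols_SSSE3:", "ScaleColsUp2_SSE2:",
          "ScaleARGBRowDown2_SSE2:", "ScaleARGBRowDown2Linear_SSE2:",
          "ScaleARGBRowDown2Box_SSE2:", "ScaleARGBRowDownEven_SSE2:",
          "ScaleARGBRowDownEvenBox_SSE2:", "ScaleARGBCols_SSE2:",
          "ScaleARGBColsUp2_SSE2:", "ScaleARGBFilterCols_SSSE3:"};
}

// Labels copied from the otool output above.
static std::vector<std::string> NeonScaleLabels() {
  return {"_ScaleRowDown2_NEON:", "_ScaleRowDown2Box_NEON:",
          "_ScaleRowDown4_NEON:", "_ScaleRowDown4Box_NEON:",
          "_ScaleRowDown34_NEON:", "_ScaleRowDown34_0_Box_NEON:",
          "_ScaleRowDown34_1_Box_NEON:", "_ScaleRowDown38_NEON:",
          "_ScaleRowDown38_3_Box_NEON:", "_ScaleRowDown38_2_Box_NEON:",
          "_ScaleFilterRows_NEON:", "_ScaleARGBRowDown2_NEON:",
          "_ScaleARGBRowDown2Box_NEON:", "_ScaleARGBRowDownEven_NEON:",
          "_ScaleARGBRowDownEvenBox_NEON:"};
}
```

Calling `MissingFrom(NeonScaleLabels(), IntelScaleLabels())` on these lists yields 8 names (the Linear, AddRows, Cols, ColsUp2 and FilterCols variants), while the reverse call yields only ScaleFilterRows. A name match does not prove the two implementations are equivalent, so the result is a starting point for the porting list rather than a final answer.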
Original issue reported on code.google.com by
fbarch...@chromium.org
on 21 Mar 2014 at 5:39