jungle0755 / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

ARM I420AlphaToABGR_Opt optimize for Neon #516

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
I420AlphaToARGB_Opt is using C code on Arm.  Optimize for Neon.

runyuva10 ConvertTest*I420*ToA???_*

util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose \
  --release --gtest_filter=*ConvertTest*I420*ToA???_* \
  -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1"

I420AlphaToARGB_Premult (77882 ms)
I420AlphaToABGR_Premult (77382 ms)
I420AlphaToARGB_Invert (76718 ms)
I420AlphaToARGB_Unaligned (76620 ms)
I420AlphaToARGB_Opt (76488 ms)
I420AlphaToARGB_Any (76200 ms)
I420AlphaToABGR_Opt (74689 ms)
I420AlphaToABGR_Any (74684 ms)
I420AlphaToABGR_Invert (74532 ms)
I420AlphaToABGR_Unaligned (74470 ms)
I420ToARGB_Unaligned (3966 ms)
I420ToABGR_Any (3649 ms)
I420ToARGB_Any (3635 ms)
I420ToABGR_Invert (3547 ms)
I420ToABGR_Unaligned (3504 ms)
I420ToABGR_Opt (3370 ms)
I420ToARGB_Invert (3364 ms)
I420ToARGB_Opt (3304 ms)

Original issue reported on code.google.com by fbarch...@chromium.org on 26 Oct 2015 at 10:15
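
For a rough sense of scale, the timings above can be turned into a per-pixel cost. This is a minimal sketch, assuming each test's time is dominated by `--libyuv_repeat` conversions of one `--libyuv_width` x `--libyuv_height` frame. The function name `ns_per_pixel` is illustrative, and setup or reference passes inside each test are ignored.

```python
def ns_per_pixel(total_ms, width=1280, height=720, repeat=999):
    """Approximate nanoseconds spent per output pixel for one test.

    Assumes the whole test time went into `repeat` conversions of a
    width x height frame (an approximation, not a measured breakdown).
    """
    pixels = width * height * repeat
    return total_ms * 1_000_000 / pixels


# Timings copied from the list above.
for name, ms in [("I420AlphaToARGB_Opt", 76488), ("I420ToARGB_Opt", 3304)]:
    print(f"{name}: {ns_per_pixel(ms):.1f} ns/pixel")
```

Under those assumptions the alpha path costs about 83 ns per pixel against about 3.6 ns for the non-alpha path, a gap of roughly 23x.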

GoogleCodeExporter commented 8 years ago
ARMv7 slowest conversions:

util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose \
  --release --gtest_filter=*ConvertTest* \
  -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1" \
  | grep ms \
  | sed 's/\(.*(\)\([0-9]*\)\( ms)\)/\2 - \1\2\3/g' \
  | sort -rn \
  | sed 's/.*- \(.*\)/\1/g'

LibYUVConvertTest.I420AlphaToARGB_Premult (78211 ms)
LibYUVConvertTest.I420AlphaToABGR_Premult (78106 ms)
LibYUVConvertTest.I420AlphaToARGB_Any (77558 ms)
LibYUVConvertTest.I420AlphaToABGR_Invert (77419 ms)
LibYUVConvertTest.I420AlphaToARGB_Unaligned (77245 ms)
LibYUVConvertTest.I420AlphaToABGR_Opt (77227 ms)
LibYUVConvertTest.I420AlphaToABGR_Unaligned (77046 ms)
LibYUVConvertTest.I420AlphaToABGR_Any (76689 ms)
LibYUVConvertTest.I420AlphaToARGB_Invert (76672 ms)
LibYUVConvertTest.I420AlphaToARGB_Opt (76534 ms)
LibYUVConvertTest.ARGBToYUY2_Any (7811 ms)
LibYUVConvertTest.ARGBToYUY2_Opt (7804 ms)
LibYUVConvertTest.ARGBToYUY2_Unaligned (7759 ms)
LibYUVConvertTest.ARGBToUYVY_Unaligned (7744 ms)
LibYUVConvertTest.ARGBToUYVY_Any (7701 ms)
LibYUVConvertTest.ARGBToUYVY_Opt (7689 ms)
LibYUVConvertTest.I420ToI444_Invert (6602 ms)
LibYUVConvertTest.I420ToI444_Any (6578 ms)
LibYUVConvertTest.I420ToI444_Unaligned (6000 ms)
LibYUVConvertTest.I420ToI444_Opt (5996 ms)
LibYUVConvertTest.I420ToRGB565Dither_Any (4833 ms)
LibYUVConvertTest.ARGBToI444_Unaligned (4556 ms)
LibYUVConvertTest.ARGB1555ToI420_Any (4515 ms)
LibYUVConvertTest.I420ToRGB565Dither_Unaligned (4460 ms)
LibYUVConvertTest.I420ToRGB565Dither_Invert (4347 ms)
LibYUVConvertTest.I420ToRGB565Dither_Opt (4318 ms)
LibYUVConvertTest.H422ToARGB_Unaligned (4267 ms)
LibYUVConvertTest.RGB565ToI420_Any (4208 ms)
LibYUVConvertTest.ARGBToI444_Opt (4143 ms)
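
The grep/sed/sort pipeline above copies the millisecond count to the front of each line, sorts numerically in descending order, then strips the prefix again. The same ranking can be sketched in Python. This is an illustrative equivalent, not part of libyuv's tooling, and the helper name `slowest_first` is made up for the example.

```python
import re

# Matches result lines such as "LibYUVConvertTest.ARGBToI444_Opt (4143 ms)".
_RESULT_RE = re.compile(r"^\s*(?P<name>\S.*?)\s*\((?P<ms>\d+) ms\)\s*$")


def slowest_first(lines):
    """Return (name, ms) pairs for every timing line, slowest first."""
    results = []
    for line in lines:
        match = _RESULT_RE.match(line)
        if match:  # skip lines that are not timing results
            results.append((match.group("name"), int(match.group("ms"))))
    # sorted() is stable, so ties keep their original relative order.
    return sorted(results, key=lambda item: item[1], reverse=True)


sample = [
    "LibYUVConvertTest.ARGBToI444_Opt (4143 ms)",
    "LibYUVConvertTest.I420AlphaToARGB_Premult (78211 ms)",
    "[----------] Global test environment tear-down",
    "LibYUVConvertTest.ARGBToYUY2_Any (7811 ms)",
]
for name, ms in slowest_first(sample):
    print(f"{name} ({ms} ms)")
```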

Original comment by fbarch...@google.com on 3 Nov 2015 at 7:36

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/c629cb3afedfbd8c88c92891b5b843db2ad9aba2

commit c629cb3afedfbd8c88c92891b5b843db2ad9aba2
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Nov 04 01:01:48 2015

add command line cpu info to allow android neon test

in order to compare C and Neon code, a new command line flag is added.
historically environment variables controlled cpu features, but on
android apk it is easier to pass a command line option to disable cpu
optimizations.

R=harryjin@google.com
BUG=libyuv:516

Review URL: https://codereview.chromium.org/1407193009 .

[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/color_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/compare_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/convert_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/cpu_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/planar_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/rotate_argb_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/rotate_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/scale_argb_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/scale_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/unit_test.cc
[modify] http://crrev.com/c629cb3afedfbd8c88c92891b5b843db2ad9aba2/unit_test/unit_test.h

Original comment by bugdroid1@chromium.org on 4 Nov 2015 at 1:02
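
The commit above describes letting a command-line option, rather than only environment variables, decide which CPU optimizations are enabled. The sketch below shows that precedence in general terms. Everything named here is hypothetical: the flag `--example_cpu_info`, the variable `EXAMPLE_DISABLE_NEON`, and the bit values are placeholders for illustration, not libyuv's real flag, environment variables, or constants.

```python
import argparse

# Hypothetical feature bit for illustration; not libyuv's real constant.
CPU_HAS_NEON = 0x4
ALL_FEATURES = -1  # all bits set: leave every detected feature enabled


def resolve_cpu_mask(argv, environ):
    """Pick a CPU feature mask: command line first, then environment."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag name; None means "not given on the command line".
    parser.add_argument("--example_cpu_info", type=int, default=None)
    args, _unknown = parser.parse_known_args(argv)
    if args.example_cpu_info is not None:
        return args.example_cpu_info  # an explicit flag always wins
    if environ.get("EXAMPLE_DISABLE_NEON"):
        return ALL_FEATURES & ~CPU_HAS_NEON  # clear only the Neon bit
    return ALL_FEATURES


# A flag passed by the test runner overrides the environment.
print(resolve_cpu_mask(["--example_cpu_info=1"], {"EXAMPLE_DISABLE_NEON": "1"}))
```

Passing the mask as an argument suits a harness that launches the test binary on a device, where setting environment variables for the launched process can be awkward.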

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/860cc0357a4b0d224be673c0ccfad192745a4192

commit 860cc0357a4b0d224be673c0ccfad192745a4192
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Nov 04 03:21:36 2015

Neon versions of I420AlphaToARGB

Add alpha version of YUV to RGB to neon code for ARMv7 and aarch64.
For other YUV to RGB conversions, hoist alpha set to 255 out of loop.

TBR=harryjin@google.com
BUG=libyuv:516

Review URL: https://codereview.chromium.org/1413763017 .

[modify] http://crrev.com/860cc0357a4b0d224be673c0ccfad192745a4192/include/libyuv/row.h
[modify] http://crrev.com/860cc0357a4b0d224be673c0ccfad192745a4192/source/row_any.cc
[modify] http://crrev.com/860cc0357a4b0d224be673c0ccfad192745a4192/source/row_neon.cc
[modify] http://crrev.com/860cc0357a4b0d224be673c0ccfad192745a4192/source/row_neon64.cc

Original comment by bugdroid1@chromium.org on 4 Nov 2015 at 3:22
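
To make the commit description above concrete, here is a scalar reference sketch of what an I420 row conversion computes, with and without an alpha plane. It is not libyuv's code: it uses textbook limited-range BT.601 floating-point coefficients instead of the library's fixed-point constants, the function names are illustrative, and the B, G, R, A byte order for ARGB is an assumption based on libyuv's little-endian naming. In the non-alpha case the constant 255 is set once before the loop, which is the scalar analogue of hoisting it out of the Neon loop.

```python
def _clamp(value):
    """Round to the nearest integer and clamp into the 0..255 byte range."""
    return max(0, min(255, int(round(value))))


def yuv_to_bgr(y, u, v):
    """Convert one limited-range BT.601 YUV sample to (b, g, r) bytes."""
    luma = 1.164 * (y - 16)
    b = _clamp(luma + 2.018 * (u - 128))
    g = _clamp(luma - 0.391 * (u - 128) - 0.813 * (v - 128))
    r = _clamp(luma + 1.596 * (v - 128))
    return b, g, r


def i420_to_argb_row(y_row, u_row, v_row, a_row=None):
    """Convert one row; U and V are subsampled 2:1 horizontally.

    Output bytes are B, G, R, A per pixel (assumed ARGB memory order).
    """
    out = bytearray()
    opaque = 255  # constant alpha, set once outside the loop
    for x, y in enumerate(y_row):
        b, g, r = yuv_to_bgr(y, u_row[x // 2], v_row[x // 2])
        alpha = a_row[x] if a_row is not None else opaque
        out += bytes((b, g, r, alpha))
    return bytes(out)


# Two pixels: video black and video white, with per-pixel alpha.
print(list(i420_to_argb_row([16, 235], [128], [128], [10, 200])))
```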

GoogleCodeExporter commented 8 years ago
Fixed.

I420AlphaToARGB_Premult (2610 ms)
I420ToARGB_Any (2308 ms)
I420ToABGR_Any (2229 ms)
I420AlphaToABGR_Any (2164 ms)
I420ToABGR_ARGB_Any (2146 ms)
I420AlphaToARGB_Any (2145 ms)
I420ToARGB_RAW_Any (2143 ms)
I420ToARGB_Unaligned (2133 ms)
I420ToARGB_ABGR_Any (2127 ms)
I420ToARGB_RGB565_Any (2058 ms)
I420ToABGR_Unaligned (2013 ms)
I420ToABGR_ARGB_Unaligned (2012 ms)
I420ToARGB_ARGB1555_Any (2011 ms)
I420ToARGB_ARGB4444_Any (2005 ms)
I420ToARGB_Invert (2003 ms)
I420ToARGB_ABGR_Unaligned (1994 ms)
I420AlphaToABGR_Unaligned (1975 ms)
I420AlphaToARGB_Unaligned (1971 ms)
I420ToARGB_RAW_Unaligned (1931 ms)
I420ToARGB_Opt (1902 ms)
I420ToARGB_RGB565_Unaligned (1875 ms)
I420ToARGB_ARGB1555_Unaligned (1874 ms)
I420ToARGB_ARGB4444_Unaligned (1872 ms)
I420ToABGR_ARGB_Invert (1867 ms)
I420ToARGB_ABGR_Invert (1866 ms)
I420AlphaToABGR_Invert (1841 ms)
I420ToABGR_ARGB_Opt (1836 ms)
I420ToABGR_Invert (1836 ms)
I420AlphaToARGB_Opt (1826 ms)
I420AlphaToARGB_Invert (1826 ms)
I420AlphaToABGR_Opt (1824 ms)
I420ToARGB_ABGR_Opt (1822 ms)
I420ToABGR_Opt (1805 ms)
I420ToARGB_RAW_Invert (1750 ms)
I420ToARGB_RGB565_Invert (1739 ms)
I420ToARGB_ARGB1555_Invert (1733 ms)
I420ToARGB_ARGB4444_Invert (1732 ms)
I420ToARGB_RAW_Opt (1731 ms)
I420ToARGB_ARGB4444_Opt (1724 ms)
I420ToARGB_RGB565_Opt (1719 ms)
I420ToARGB_ARGB1555_Opt (1712 ms)

Original comment by fbarch...@chromium.org on 5 Nov 2015 at 1:18
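
The effect of the change can be estimated by dividing timings from the first comment by the ones in the last comment. This is a rough sketch, assuming the final run used the same resolution and repeat count as the earlier runs, which the comment does not state. The helper name `speedup` is illustrative.

```python
# (before_ms, after_ms) pairs copied from this thread.
timings = {
    "I420AlphaToARGB_Opt": (76488, 1826),
    "I420AlphaToABGR_Opt": (74689, 1824),
    "I420ToARGB_Opt": (3304, 1902),
    "I420ToABGR_Opt": (3370, 1805),
}


def speedup(before_ms, after_ms):
    """Ratio of the old time to the new time; above 1.0 means faster."""
    return before_ms / after_ms


for name, (before, after) in timings.items():
    print(f"{name}: {speedup(before, after):.1f}x")
```

On these numbers the alpha conversions come out roughly 41x to 42x faster, while the non-alpha conversions improve by a little under 2x.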