BigMing001 / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

Linux top bottlenecks #492

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Investigate top bottlenecks

LIBYUV_DISABLE_AVX2=1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=1000 perf record out/Release/libyuv_unittest --gtest_filter=*
perf report

 13.81%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_ScaleTestRoundToByte_Test::T
 13.81%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBo
  4.94%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
  4.07%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  3.63%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_SSSE3
  3.57%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBMatrixRow_SSSE3
  3.06%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
  3.02%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
  2.63%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
  2.58%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleARGB(unsigned char const*, int, in
  2.57%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
  2.45%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
  2.44%  libyuv_unittest  libc-2.19.so         [.] __random_r
  2.23%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
  1.64%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRMatrixRow_SSSE3
  1.46%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
  1.29%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
  1.26%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols1_C(int, int, int, int, uns
  1.24%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_SSSE3
  1.21%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
  1.14%  libyuv_unittest  libc-2.19.so         [.] __random
  1.08%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
  0.99%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_SSSE3
  0.75%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
  0.75%  libyuv_unittest  libc-2.19.so         [.] _int_malloc

Original issue reported on code.google.com by fbarch...@google.com on 16 Sep 2015 at 11:36

GoogleCodeExporter commented 8 years ago
r1483 removes the redundant scale rounding test.

The rounding test is still the top bottleneck on Linux, though.

 16.52%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBody()

Original comment by fbarch...@google.com on 17 Sep 2015 at 5:28

GoogleCodeExporter commented 8 years ago
The following is a complete list of C functions (there should be none):

LIBYUV_FLAGS=-1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 perf record out/Release/libyuv_unittest --gtest_filter=*
perf report >out.txt
grep _C out.txt

     5.88%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
     3.08%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     1.38%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols1_C(int, int, int, int, unsigned short const*, unsigned char*)
     1.28%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     0.52%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C
     0.25%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_C
     0.14%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols2_C(int, int, int, int, unsigned short const*, unsigned char*)
     0.07%  libyuv_unittest  libyuv_unittest      [.] ScaleColsUp2_C
     0.03%  libyuv_unittest  libyuv_unittest      [.] MirrorUVRow_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] TransposeWxH_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_0_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_1_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] TransposeUVWx8_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_3_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown2Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_2_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_CropNV12_Test::TestBody()
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ARGBToUVJ422Row_C
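
Note that `grep _C` matches the substring anywhere in the line, which is why `libyuv::libyuvTest_CropNV12_Test::TestBody()` shows up in the list above even though it is a test body rather than a C row function. A stricter filter could look only at the symbol column and require the function name to end in `_C`. The following is a minimal C++ sketch of that idea; `IsCFallbackSymbol` is a hypothetical helper (not part of libyuv), and it assumes the symbol follows the `[.] ` marker as in the `perf report` output above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of libyuv): returns true when the symbol
// column of a `perf report` line names a function whose name ends in "_C".
// Assumes the symbol follows the "[.] " marker, as in the output above.
bool IsCFallbackSymbol(const std::string& line) {
  const std::string marker = "[.] ";
  const std::size_t pos = line.find(marker);
  if (pos == std::string::npos) return false;
  std::string symbol = line.substr(pos + marker.size());
  // Drop a demangled parameter list such as "(int, int, ...)".
  const std::size_t paren = symbol.find('(');
  if (paren != std::string::npos) symbol.erase(paren);
  // Trim trailing whitespace left over from the report layout.
  while (!symbol.empty() && (symbol.back() == ' ' || symbol.back() == '\t')) {
    symbol.pop_back();
  }
  return symbol.size() >= 2 &&
         symbol.compare(symbol.size() - 2, 2, "_C") == 0;
}

// Usage sketch: the two cases that a plain `grep _C` cannot tell apart.
void CheckCFallbackExamples() {
  assert(IsCFallbackSymbol("  5.88%  libyuv_unittest  libyuv_unittest  [.] ScaleAddRow_C"));
  assert(!IsCFallbackSymbol("  0.00%  libyuv_unittest  libyuv_unittest  [.] "
                            "libyuv::libyuvTest_CropNV12_Test::TestBody()"));
}
```

On the shell side, anchoring the pattern to the end of the name (for example with `grep -E '_C(\(|$)'`) should have a similar effect, assuming the lines carry no trailing whitespace.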

Original comment by fbarch...@google.com on 17 Sep 2015 at 6:35

GoogleCodeExporter commented 8 years ago
LIBYUV_FLAGS=-1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 perf record out/Release/libyuv_unittest --gtest_filter=*

    18.31%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBody()
     6.47%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
     5.05%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
     4.81%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
     3.64%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
     3.43%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
     3.08%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
     3.00%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
     2.86%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
     2.83%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     2.69%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
     2.59%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
     1.72%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
     1.60%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
     1.48%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
     1.47%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
     1.45%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
     1.40%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     1.30%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
     1.08%  libyuv_unittest  libyuv_unittest      [.] NV12ToARGBRow_SSSE3

Original comment by fbarch...@google.com on 23 Sep 2015 at 8:27

GoogleCodeExporter commented 8 years ago
NV12ToARGB optimized
    18.25%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBody()
     6.50%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
     5.16%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
     4.83%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
     3.64%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
     3.42%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
     3.15%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
     3.00%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
     2.92%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
     2.83%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     2.69%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
     2.59%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
     1.75%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
     1.61%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
     1.49%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
     1.48%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
     1.45%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
     1.40%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     1.26%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
     0.93%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
     0.92%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
     0.91%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
     0.85%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
     0.85%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
     0.83%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
     0.68%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2
     0.67%  libyuv_unittest  libyuv_unittest      [.] SobelYRow_SSE2
     0.62%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_Fast_SSSE3
     0.62%  libyuv_unittest  libyuv_unittest      [.] FixedDiv1_X86
     0.61%  libyuv_unittest  libyuv_unittest      [.] ScaleSlope
     0.57%  libyuv_unittest  libyuv_unittest      [.] next_marker
     0.54%  libyuv_unittest  libyuv_unittest      [.] NV12ToARGBRow_SSSE3

Original comment by fbarch...@google.com on 25 Sep 2015 at 7:31

GoogleCodeExporter commented 8 years ago
NV12 AVX2
 18.25%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBody()
  6.53%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
  5.08%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
  4.84%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  3.64%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
  3.42%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
  3.12%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
  3.00%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
  2.90%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
  2.85%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
  2.71%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
  2.38%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
  1.76%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
  1.62%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
  1.49%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
  1.49%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
  1.41%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
  1.25%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
  1.25%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
  0.99%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
  0.92%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
  0.91%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
  0.87%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
  0.85%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
  0.84%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
  0.68%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2
  0.67%  libyuv_unittest  libyuv_unittest      [.] SobelYRow_SSE2
  0.62%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_Fast_SSSE3
  0.62%  libyuv_unittest  libyuv_unittest      [.] ScaleSlope
  0.62%  libyuv_unittest  libyuv_unittest      [.] FixedDiv1_X86
  0.55%  libyuv_unittest  libyuv_unittest      [.] next_marker
  0.54%  libyuv_unittest  libyuv_unittest      [.] NV12ToARGBRow_SSSE3
  0.54%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C
  0.50%  libyuv_unittest  libyuv_unittest      [.] ARGBToARGB1555Row_SSE2
  0.48%  libyuv_unittest  libyuv_unittest      [.] ARGBScaleClip
  0.47%  libyuv_unittest  libyuv_unittest      [.] ARGBToUVRow_AVX2
  0.46%  libyuv_unittest  libyuv_unittest      [.] ARGBToYJRow_AVX2
  0.45%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_Any_AVX2
  0.43%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV422Row_SSSE3
  0.42%  libyuv_unittest  libyuv_unittest      [.] I422ToBGRARow_AVX2
  0.41%  libyuv_unittest  libyuv_unittest      [.] I422ToRGBARow_AVX2
  0.40%  libyuv_unittest  libc-2.19.so         [.] _int_free
  0.40%  libyuv_unittest  libyuv_unittest      [.] NV12ToARGBRow_AVX2

Original comment by fbarch...@google.com on 25 Sep 2015 at 11:57

GoogleCodeExporter commented 8 years ago
TestRoundToByte is too slow. Improve the rounding and/or the test.

LIBYUV_REPEAT=100 out/Release/libyuv_unittest --gtest_filter=libyuvTest.TestRoundToByte
Note: Google Test filter = libyuvTest.TestRoundToByte
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from libyuvTest
[ RUN      ] libyuvTest.TestRoundToByte
[       OK ] libyuvTest.TestRoundToByte (10731 ms)
[----------] 1 test from libyuvTest (10731 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (10731 ms total)
[  PASSED  ] 1 test.

Original comment by fbarch...@google.com on 2 Oct 2015 at 6:01

GoogleCodeExporter commented 8 years ago
LIBYUV_WIDTH=640 LIBYUV_HEIGHT=360 LIBYUV_REPEAT=4000 out/Release/libyuv_unittest --gtest_filter=**TestRoundToByte*
[       OK ] libyuvTest.TestRoundToByte (419442 ms)
[----------] 1 test from libyuvTest (419442 ms total)

Performance of 4 rounding methods on Linux GCC:

#define ROUND(f) static_cast<int>(f + 0.5)
TestRoundToByte (10731 ms)

#define ROUND(f) lrintf(f)
TestRoundToByte (7911 ms)

#define ROUND(f) static_cast<int>(round(f))
TestRoundToByte (12700 ms)

#define ROUND(f) _mm_cvt_ss2si(_mm_load_ss(&f))
TestRoundToByte (10428 ms)
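
Speed is not the only difference between these four candidates: they do not all return the same value for ties or for negative inputs, which matters if the result is compared against a reference. The sketch below rewrites the macros as functions (the function names are hypothetical) and records where they disagree. It assumes the default floating-point rounding mode (round to nearest, ties to even), and the SSE variant is only compiled when the compiler defines `__SSE__`.

```cpp
#include <cassert>
#include <cmath>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Function forms of the four ROUND(f) candidates (names are hypothetical).

// Adds 0.5, then truncates toward zero; ties go up for non-negative input.
inline int RoundAddHalf(float f) { return static_cast<int>(f + 0.5); }

// Rounds in the current rounding mode (nearest, ties to even, by default).
inline int RoundLrintf(float f) { return static_cast<int>(std::lrintf(f)); }

// Rounds half away from zero, regardless of the rounding mode.
inline int RoundStd(float f) { return static_cast<int>(std::round(f)); }

#if defined(__SSE__)
// CVTSS2SI: rounds per MXCSR (nearest, ties to even, by default). x86 only.
inline int RoundSse(float f) { return _mm_cvt_ss2si(_mm_load_ss(&f)); }
#endif

// Where the candidates disagree, assuming the default rounding mode.
inline void CheckRoundingDifferences() {
  assert(RoundAddHalf(2.5f) == 3 && RoundStd(2.5f) == 3);
  assert(RoundLrintf(2.5f) == 2);     // tie goes to the even neighbour
  assert(RoundAddHalf(-1.5f) == -1);  // -1.5 + 0.5 = -1.0, then truncated
  assert(RoundStd(-1.5f) == -2 && RoundLrintf(-1.5f) == -2);
#if defined(__SSE__)
  assert(RoundSse(2.5f) == 2);        // same tie behaviour as lrintf
#endif
}
```

For non-negative inputs that are not exact ties, all four agree, so swapping in a faster variant would only change results on ties (and on negative values for the `f + 0.5` form).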

Original comment by fbarch...@google.com on 2 Oct 2015 at 6:19

GoogleCodeExporter commented 8 years ago
     7.94%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
     6.08%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
     6.04%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
     4.46%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
     4.15%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
     3.87%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
     3.69%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
     3.63%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     3.53%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
     3.31%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
     2.91%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
     2.15%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
     1.95%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
     1.83%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
     1.80%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
     1.71%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     1.57%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
     1.52%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
     1.13%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
     1.12%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
     1.12%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
     1.05%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
     1.03%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
     1.02%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3

Original comment by fbarch...@chromium.org on 2 Oct 2015 at 11:03

GoogleCodeExporter commented 8 years ago
r1502 performance
     6.66%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
     6.48%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
     4.77%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
     4.46%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
     4.14%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
     3.96%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
     3.76%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
     3.71%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     3.57%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
     3.12%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
     2.29%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
     2.13%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
     1.95%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
     1.92%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
     1.80%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_AVX2
     1.75%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     1.63%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
     1.47%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
     1.22%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
     1.21%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
     1.21%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
     1.11%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
     1.08%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
     0.90%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2

Original comment by fbarch...@chromium.org on 7 Oct 2015 at 5:47

GoogleCodeExporter commented 8 years ago
fbarchard@fbarchard-linux:~/src/libyuv/libyuv$ runyuv10 | more
LIBYUV_WIDTH=640 LIBYUV_HEIGHT=360 LIBYUV_REPEAT=4000 out/Release/libyuv_unittest --gtest_filter=* | grep ms | sed 's/\(.*(\)\([0-9]*\)\( ms)\)/\2 - \1\2\3/g' | sort -rn | sed 's/.*- \(.*\)/\1/g'
[       OK ] libyuvTest.ARGBScaleClipTo1280x720_Linear (11452 ms)
[  FAILED  ] libyuvTest.ScaleDownBy8_Box (10933 ms)
[       OK ] libyuvTest.ARGBScaleClipTo1280x720_Bilinear (9219 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy4_Box (6844 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by4_Box (5228 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by4_Bilinear (5218 ms)
[       OK ] libyuvTest.ARGBScaleClipTo1280x720_None (4465 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by8_Box (3887 ms)
[       OK ] libyuvTest.ARGBScaleClipTo569x480_Linear (3768 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy8_Box (3407 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy8_Bilinear (3346 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom569x480_Bilinear (3327 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom352x288_Linear (3257 ms)
[       OK ] libyuvTest.ARGBScaleClipTo569x480_Bilinear (3215 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom320x240_Linear (3149 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by8_Bilinear (3067 ms)
[       OK ] libyuvTest.TestFixedDiv (2970 ms)
[       OK ] libyuvTest.TestFixedDiv1_Opt (2970 ms)
[       OK ] libyuvTest.TestFixedDiv_Opt (2966 ms)
[       OK ] libyuvTest.ARGBScaleDownBy4_Box (2903 ms)
[       OK ] libyuvTest.ScaleTo1280x720_Bilinear (2869 ms)
[       OK ] libyuvTest.ScaleTo1280x720_Box (2852 ms)
[       OK ] libyuvTest.ARGBScaleDownBy8_Bilinear (2837 ms)
[       OK ] libyuvTest.ScaleTo1280x720_Linear (2825 ms)
[  FAILED  ] libyuvTest.ScaleDownBy3_Box (2764 ms)
[       OK ] libyuvTest.ARGBScaleDownBy8_Box (2744 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom352x288_Bilinear (2629 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom320x240_Bilinear (2512 ms)
[       OK ] libyuvTest.I420ToRGB565Dither_Any (2412 ms)
[       OK ] libyuvTest.I420ToRGB565Dither_Unaligned (2390 ms)
[       OK ] libyuvTest.I420ToRGB565Dither_Opt (2379 ms)
[       OK ] libyuvTest.I420ToRGB565Dither_Invert (2379 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by8_Linear (2148 ms)
[       OK ] libyuvTest.ARGBScaleClipFrom569x480_Linear (2141 ms)
[       OK ] libyuvTest.ARGBScaleTo1280x720_Bilinear (2138 ms)
[       OK ] libyuvTest.ARGBScaleDownClipBy3by4_Linear (2123 ms)
[       OK ] libyuvTest.ARGBToRGB565Dither_Invert (2040 ms)
[       OK ] libyuvTest.ARGBScaleTo1280x720_Linear (2038 ms)
[       OK ] libyuvTest.ARGBToRGB565Dither_Opt (2019 ms)
[       OK ] libyuvTest.ARGBToRGB565Dither_Unaligned (2017 ms)
[       OK ] libyuvTest.ARGBToRGB565Dither_Any (2007 ms)
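
The `sed | sort -rn | sed` pipeline above prefixes each result line with its millisecond count, sorts numerically in descending order, and then strips the prefix again. A rough C++ equivalent is sketched below for cases where the same ranking is wanted inside a tool rather than a shell. `ParseMillis` and `SortBySlowest` are hypothetical names, and unlike the pipeline this version simply drops lines that do not end in a `(N ms)` timing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Extracts N from a "(N ms)" group in a gtest result line.
// Returns -1 when the line carries no such timing.
long ParseMillis(const std::string& line) {
  const std::string suffix = " ms)";
  const std::size_t end = line.rfind(suffix);
  if (end == std::string::npos) return -1;
  const std::size_t open = line.rfind('(', end);
  if (open == std::string::npos) return -1;
  const std::string digits = line.substr(open + 1, end - open - 1);
  if (digits.empty() ||
      digits.find_first_not_of("0123456789") != std::string::npos) {
    return -1;
  }
  return std::strtol(digits.c_str(), nullptr, 10);
}

// Returns the timed lines sorted slowest first, like `sort -rn` on the ms key.
std::vector<std::string> SortBySlowest(const std::vector<std::string>& lines) {
  typedef std::pair<long, std::string> Timed;
  std::vector<Timed> timed;
  for (const std::string& line : lines) {
    const long ms = ParseMillis(line);
    if (ms >= 0) timed.push_back(Timed(ms, line));
  }
  // stable_sort keeps the original order among tests with equal timings.
  std::stable_sort(timed.begin(), timed.end(),
                   [](const Timed& a, const Timed& b) { return a.first > b.first; });
  std::vector<std::string> sorted;
  for (const Timed& entry : timed) sorted.push_back(entry.second);
  return sorted;
}

// Usage sketch on two lines taken from the output above.
void CheckSortExample() {
  const std::vector<std::string> sorted = SortBySlowest(
      {"[       OK ] libyuvTest.TestFixedDiv (2970 ms)",
       "[  FAILED  ] libyuvTest.ScaleDownBy8_Box (10933 ms)"});
  assert(sorted.size() == 2);
  assert(sorted[0].find("ScaleDownBy8_Box") != std::string::npos);
}
```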

Original comment by fbarch...@chromium.org on 7 Oct 2015 at 5:24

GoogleCodeExporter commented 8 years ago
     6.80%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
     6.59%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
     4.93%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
     4.57%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
     4.30%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
     4.08%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
     3.99%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
     3.68%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
     3.22%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
     2.37%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
     2.15%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
     2.02%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
     2.00%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
     1.89%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     1.87%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_AVX2
     1.83%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
     1.68%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
     1.25%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
     1.25%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
     1.23%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
     1.15%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
     1.12%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
     0.91%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2
     0.90%  libyuv_unittest  libyuv_unittest      [.] SobelYRow_SSE2
     0.88%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
     0.85%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_Fast_SSSE3
     0.84%  libyuv_unittest  libyuv_unittest      [.] FixedDiv1_X86
     0.81%  libyuv_unittest  libyuv_unittest      [.] ScaleSlope
     0.74%  libyuv_unittest  libyuv_unittest      [.] next_marker
     0.74%  libyuv_unittest  libyuv_unittest      [.] I444ToARGBRow_SSSE3
     0.72%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C
     0.67%  libyuv_unittest  libyuv_unittest      [.] ARGBToARGB1555Row_SSE2
     0.65%  libyuv_unittest  libyuv_unittest      [.] ARGBScaleClip
     0.64%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_Any_AVX2
     0.62%  libyuv_unittest  libyuv_unittest      [.] ARGBToUVRow_AVX2
     0.61%  libyuv_unittest  libyuv_unittest      [.] ARGBToYJRow_AVX2
     0.58%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV422Row_SSSE3
     0.57%  libyuv_unittest  libyuv_unittest      [.] I422ToBGRARow_AVX2
     0.56%  libyuv_unittest  libyuv_unittest      [.] I422ToRGBARow_AVX2
     0.53%  libyuv_unittest  libc-2.19.so         [.] _int_free
     0.49%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBTestFilter(int, int, int, int, libyuv::FilterMode, int, int)
     0.49%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_3_Box_SSSE3
     0.49%  libyuv_unittest  libyuv_unittest      [.] ARGB1555ToARGBRow_SSE2
     0.45%  libyuv_unittest  libyuv_unittest      [.] ARGBUnattenuateRow_AVX2
     0.43%  libyuv_unittest  libyuv_unittest      [.] ARGBBlendRow_SSSE3
     0.42%  libyuv_unittest  libyuv_unittest      [.] ARGBToARGB4444Row_SSE2
     0.39%  libyuv_unittest  libyuv_unittest      [.] I411ToARGBRow_SSSE3
     0.39%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_AVX2

Original comment by fbarch...@chromium.org on 8 Oct 2015 at 3:16

GoogleCodeExporter commented 8 years ago
r1513
  6.88%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
  6.73%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  4.98%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
  4.69%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
  4.31%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
  4.12%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
  3.90%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
  3.68%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
  2.92%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
  2.40%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
  2.22%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
  2.04%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
  1.97%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
  1.86%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_AVX2
  1.83%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
  1.70%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRRow_AVX2
  1.55%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
  1.44%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
  1.27%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
  1.27%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
  1.16%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
  1.14%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
  0.93%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2
  0.92%  libyuv_unittest  libyuv_unittest      [.] SobelYRow_SSE2
  0.89%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
  0.85%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_Fast_SSSE3
  0.84%  libyuv_unittest  libyuv_unittest      [.] ScaleSlope
  0.80%  libyuv_unittest  libyuv_unittest      [.] FixedDiv1_X86
  0.75%  libyuv_unittest  libyuv_unittest      [.] next_marker
  0.74%  libyuv_unittest  libyuv_unittest      [.] I444ToARGBRow_SSSE3
  0.72%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C

Original comment by fbarch...@chromium.org on 16 Oct 2015 at 6:09

GoogleCodeExporter commented 8 years ago
Some performance numbers on Arm:
I   31.227s run_tests_on_device(HT4A2JT03762)  [==========] Running 20 tests from 1 test case.
I   31.227s run_tests_on_device(HT4A2JT03762)  [----------] Global test environment set-up.
I   31.228s run_tests_on_device(HT4A2JT03762)  [----------] 20 tests from LibYUVConvertTest
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI420_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI420_Opt (353 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI422_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI422_Opt (407 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI444_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI444_Opt (2681 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI411_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI411_Opt (838 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI420Mirror_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI420Mirror_Opt (423 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToNV12_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToNV12_Opt (296 ms)
I   31.228s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToNV21_Opt
I   31.228s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToNV21_Opt (275 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToARGB_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToARGB_Opt (1480 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToBGRA_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToBGRA_Opt (1490 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToABGR_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToABGR_Opt (1465 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToRGBA_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToRGBA_Opt (1509 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToRAW_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToRAW_Opt (1576 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToRGB24_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToRGB24_Opt (1651 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToRGB565_Opt
I   31.229s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToRGB565_Opt (1563 ms)
I   31.229s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToARGB1555_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToARGB1555_Opt (1566 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToARGB4444_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToARGB4444_Opt (1533 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToYUY2_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToYUY2_Opt (348 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToUYVY_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToUYVY_Opt (350 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToI400_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToI400_Opt (149 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [ RUN      ] LibYUVConvertTest.I420ToRGB565Dither_Opt
I   31.230s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVConvertTest.I420ToRGB565Dither_Opt (1962 ms)
I   31.230s run_tests_on_device(HT4A2JT03762)  [----------] 20 tests from LibYUVConvertTest (21920 ms total)
I   31.230s run_tests_on_device(HT4A2JT03762)
I   31.230s run_tests_on_device(HT4A2JT03762)  [----------] Global test environment tear-down
I   31.230s run_tests_on_device(HT4A2JT03762)  [==========] 20 tests from 1 test case ran. (21924 ms total)
I   31.230s run_tests_on_device(HT4A2JT03762)  [  PASSED  ] 20 tests.

Original comment by fbarch...@google.com on 18 Oct 2015 at 7:30

GoogleCodeExporter commented 8 years ago
LibYUVScaleTest.ScaleDownBy8_Box (200681 ms)
LibYUVBaseTest.TestFixedDiv1_Opt (194044 ms)
LibYUVBaseTest.TestFixedDiv_Opt (191014 ms)
LibYUVBaseTest.TestFixedDiv (104787 ms)
LibYUVConvertTest.I420AlphaToARGB_Premult (77882 ms)
LibYUVConvertTest.I420AlphaToABGR_Premult (77382 ms)
LibYUVConvertTest.I444ToABGR_Unaligned (77309 ms)
LibYUVConvertTest.I444ToABGR_Invert (77230 ms)
LibYUVConvertTest.I444ToABGR_Any (77130 ms)
LibYUVConvertTest.I444ToABGR_Opt (77053 ms)
LibYUVConvertTest.I420AlphaToARGB_Invert (76718 ms)
LibYUVConvertTest.I420AlphaToARGB_Unaligned (76620 ms)
LibYUVConvertTest.I420AlphaToARGB_Opt (76488 ms)
LibYUVConvertTest.I420AlphaToARGB_Any (76200 ms)
LibYUVConvertTest.I420AlphaToABGR_Opt (74689 ms)
LibYUVConvertTest.I420AlphaToABGR_Any (74684 ms)
LibYUVConvertTest.I420AlphaToABGR_Invert (74532 ms)
LibYUVConvertTest.I420AlphaToABGR_Unaligned (74470 ms)
LibYUVPlanarTest.TestARGBPolynomial (67153 ms)
LibYUVScaleTest.ARGBScaleDownClipBy4_Box (48645 ms)
LibYUVScaleTest.ScaleDownBy3_Box (45322 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by8_Bilinear (43125 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by8_Box (42043 ms)
LibYUVRotateTest.ARGBRotate270 (39778 ms)
LibYUVScaleTest.ARGBScaleDownClipBy8_Bilinear (39699 ms)
LibYUVRotateTest.ARGBRotate90 (39674 ms)
LibYUVScaleTest.ARGBScaleDownClipBy8_Box (38420 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by4_Box (36128 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by4_Bilinear (35155 ms)
LibYUVScaleTest.ARGBScaleDownBy4_Box (34227 ms)
LibYUVPlanarTest.ARGBBlur_Invert (30982 ms)
LibYUVPlanarTest.ARGBBlur_Any (30886 ms)
LibYUVPlanarTest.ARGBBlur_Unaligned (30757 ms)
LibYUVPlanarTest.ARGBBlur_Opt (30696 ms)
LibYUVScaleTest.ARGBScaleClipFrom569x480_Bilinear (29419 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by8_Linear (29374 ms)
LibYUVScaleTest.ARGBScaleClipFrom640x360_Bilinear (28262 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by4_Linear (27859 ms)
LibYUVScaleTest.ARGBScaleDownClipBy8_Linear (27853 ms)
LibYUVPlanarTest.ARGBBlurSmall_Invert (27040 ms)
LibYUVPlanarTest.ARGBBlurSmall_Any (26982 ms)
LibYUVPlanarTest.ARGBBlurSmall_Opt (26738 ms)
LibYUVPlanarTest.ARGBBlurSmall_Unaligned (26735 ms)
LibYUVScaleTest.ARGBScaleClipFrom320x240_Bilinear (26602 ms)
LibYUVScaleTest.ARGBScaleClipFrom352x288_Bilinear (26565 ms)
LibYUVScaleTest.ARGBScaleClipFrom569x480_Linear (26202 ms)
LibYUVScaleTest.ARGBScaleDownClipBy4_Bilinear (25372 ms)
LibYUVScaleTest.ARGBScaleClipFrom640x360_Linear (24780 ms)
LibYUVScaleTest.ARGBScaleClipFrom352x288_Linear (24535 ms)
LibYUVScaleTest.ARGBScaleDownClipBy8_None (24087 ms)
LibYUVScaleTest.ARGBScaleClipFrom320x240_Linear (23947 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3_Box (23095 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3_Linear (23081 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3_Bilinear (22962 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3_None (22827 ms)
LibYUVScaleTest.ARGBScaleDownClipBy4_Linear (22667 ms)
LibYUVScaleTest.ARGBScaleDownClipBy2_Bilinear (22186 ms)
LibYUVScaleTest.ARGBScaleDownClipBy2_Box (22131 ms)
LibYUVScaleTest.ARGBScaleDownClipBy2_Linear (20082 ms)
LibYUVScaleTest.ARGBScaleDownClipBy4_None (19863 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by8_None (19275 ms)
LibYUVScaleTest.ARGBScaleDownClipBy3by4_None (17790 ms)
LibYUVScaleTest.ARGBScaleClipFrom569x480_None (17676 ms)
LibYUVScaleTest.ARGBScaleClipFrom1280x720_Bilinear (17553 ms)
LibYUVScaleTest.ARGBScaleClipTo1280x720_Bilinear (17533 ms)
LibYUVScaleTest.ARGBScaleClipTo1280x720_None (17471 ms)
LibYUVScaleTest.ARGBScaleClipFrom1280x720_Linear (17440 ms)
LibYUVScaleTest.ARGBScaleClipTo1280x720_Linear (17425 ms)
LibYUVScaleTest.ARGBScaleClipFrom1280x720_None (17398 ms)
LibYUVScaleTest.ARGBScaleClipFrom640x360_None (17378 ms)
LibYUVScaleTest.ARGBScaleDownClipBy2_None (16051 ms)
LibYUVScaleTest.ARGBScaleClipFrom352x288_None (15855 ms)
LibYUVScaleTest.ARGBScaleDownBy8_Box (15648 ms)
LibYUVScaleTest.ARGBScaleClipFrom320x240_None (15468 ms)
LibYUVScaleTest.ARGBScaleDownBy8_Bilinear (15367 ms)
LibYUVScaleTest.ARGBScaleClipFrom1x1_None (14451 ms)
LibYUVScaleTest.ARGBScaleClipFrom1x1_Bilinear (14392 ms)
LibYUVScaleTest.ARGBScaleClipFrom1x1_Linear (14313 ms)
LibYUVScaleTest.ARGBScaleClipTo569x480_Bilinear (11779 ms)
LibYUVScaleTest.ARGBScaleDownBy8_Linear (11699 ms)
LibYUVScaleTest.ScaleDownBy8_Bilinear (11674 ms)
LibYUVScaleTest.ARGBScaleDownBy3by8_Box (11625 ms)
LibYUVScaleTest.ARGBScaleDownBy3by8_Bilinear (11595 ms)
LibYUVPlanarTest.TestARGBLumaColorTable (11480 ms)
LibYUVScaleTest.ARGBScaleDownBy4_Bilinear (10768 ms)
LibYUVRotateTest.ARGBRotate270_Odd (10471 ms)
LibYUVRotateTest.ARGBRotate90_Odd (10401 ms)
LibYUVPlanarTest.TestARGBColorTable (9869 ms)
LibYUVConvertTest.ARGBToUYVY_Opt (9293 ms)
LibYUVScaleTest.ScaleDownBy4_Bilinear (9191 ms)
LibYUVConvertTest.ARGBToUYVY_Any (8985 ms)
LibYUVScaleTest.ARGBScaleClipTo569x480_Linear (8688 ms)
LibYUVConvertTest.ARGBToUYVY_Unaligned (8364 ms)
LibYUVConvertTest.ARGBToYUY2_Opt (8320 ms)
LibYUVPlanarTest.ARGBUnattenuate_Invert (8220 ms)
LibYUVScaleTest.ARGBScaleDownBy4_Linear (8118 ms)
LibYUVPlanarTest.ARGBUnattenuate_Opt (8066 ms)
LibYUVScaleTest.ARGBScaleDownBy8_None (7959 ms)
LibYUVScaleTest.ARGBScaleDownBy3by4_Bilinear (7814 ms)
LibYUVScaleTest.ARGBScaleDownBy3by4_Box (7807 ms)
LibYUVPlanarTest.ARGBUnattenuate_Any (7779 ms)
LibYUVScaleTest.ScaleDownBy8_Linear (7776 ms)
LibYUVPlanarTest.ARGBUnattenuate_Unaligned (7774 ms)
LibYUVConvertTest.ARGBToYUY2_Unaligned (7707 ms)
LibYUVConvertTest.ARGBToYUY2_Any (7673 ms)
LibYUVScaleTest.ARGBScaleDownBy2_Bilinear (7508 ms)
LibYUVScaleTest.ScaleDownBy4_Box (7434 ms)
LibYUVScaleTest.ARGBScaleDownBy2_Box (7395 ms)
LibYUVConvertTest.I420ToI444_Any (7304 ms)
LibYUVConvertTest.I420ToI444_Opt (7142 ms)
LibYUVScaleTest.ScaleDownBy4_Linear (7117 ms)
LibYUVScaleTest.ScaleDownBy3by8_Linear (7054 ms)
LibYUVScaleTest.ScaleDownBy3by8_Bilinear (7011 ms)
LibYUVScaleTest.ScaleDownBy3by8_Box (7003 ms)
LibYUVConvertTest.I420ToI444_Invert (6846 ms)
LibYUVPlanarTest.TestRGBColorTable (6813 ms)
LibYUVConvertTest.I420ToI444_Unaligned (6799 ms)
LibYUVScaleTest.ScaleTo352x288_Box (6198 ms)
LibYUVPlanarTest.ARGBSobelXY_Any (6115 ms)
LibYUVPlanarTest.ARGBSobel_Any (6105 ms)
LibYUVPlanarTest.ARGBSobel_Invert (5977 ms)
LibYUVColorTest.TestFullYUV (5914 ms)
LibYUVPlanarTest.ARGBSobelXY_Invert (5894 ms)
LibYUVPlanarTest.ARGBSobel_Opt (5847 ms)
LibYUVPlanarTest.ARGBSobelXY_Opt (5813 ms)
LibYUVPlanarTest.ARGBSobel_Unaligned (5799 ms)
LibYUVPlanarTest.ARGBSobelXY_Unaligned (5720 ms)
LibYUVScaleTest.ARGBScaleFrom569x480_Bilinear (5595 ms)
LibYUVColorTest.TestFullYUVJ (5560 ms)
LibYUVScaleTest.ARGBScaleClipTo352x288_Bilinear (5557 ms)
LibYUVPlanarTest.ARGBSobelToPlane_Any (5465 ms)
LibYUVScaleTest.ScaleFrom569x480_Bilinear (5335 ms)
LibYUVScaleTest.ScaleFrom569x480_Box (5328 ms)
LibYUVPlanarTest.ARGBSobelToPlane_Invert (5157 ms)
LibYUVPlanarTest.ARGBSobelToPlane_Opt (5074 ms)
LibYUVPlanarTest.ARGBAdd_Unaligned (5072 ms)
LibYUVScaleTest.ScaleDownBy8_None (5009 ms)
LibYUVScaleTest.ARGBScaleDownBy2_Linear (5008 ms)
LibYUVPlanarTest.ARGBSobelToPlane_Unaligned (4996 ms)
LibYUVPlanarTest.ARGBSubtract_Unaligned (4911 ms)
LibYUVScaleTest.ARGBScaleClipTo569x480_None (4887 ms)
LibYUVConvertTest.I420ToRGB565Dither_Any (4884 ms)
LibYUVScaleTest.ARGBScaleFrom640x360_Bilinear (4882 ms)
LibYUVScaleTest.ARGBScaleDownBy3by8_Linear (4882 ms)
LibYUVScaleTest.ScaleFrom569x480_Linear (4854 ms)
LibYUVConvertTest.ARGBToI444_Unaligned (4674 ms)
LibYUVScaleTest.ARGBScaleClipTo640x360_Bilinear (4604 ms)
LibYUVPlanarTest.ARGBSubtract_Invert (4598 ms)
LibYUVPlanarTest.ARGBAdd_Invert (4593 ms)
LibYUVPlanarTest.ARGBAdd_Opt (4588 ms)
LibYUVPlanarTest.ARGBSubtract_Opt (4570 ms)
LibYUVScaleTest.ARGBScaleDownBy4_None (4535 ms)
LibYUVScaleTest.ARGBScaleDownBy3by4_Linear (4532 ms)
LibYUVScaleTest.ScaleTo320x240_Box (4504 ms)
LibYUVConvertTest.I420ToRGB565Dither_Unaligned (4447 ms)
LibYUVPlanarTest.ARGBMultiply_Opt (4427 ms)
LibYUVScaleTest.ScaleDownBy3_Bilinear (4426 ms)
LibYUVConvertTest.RGB565ToI420_Any (4387 ms)
LibYUVPlanarTest.ARGBMultiply_Invert (4349 ms)
LibYUVConvertTest.ARGB1555ToI420_Any (4340 ms)
LibYUVScaleTest.ScaleDownBy3_None (4336 ms)
LibYUVConvertTest.I420ToRGB565Dither_Invert (4330 ms)
LibYUVScaleTest.ScaleDownBy3_Linear (4322 ms)
LibYUVConvertTest.I420ToRGB565Dither_Opt (4307 ms)
LibYUVScaleTest.ARGBScaleFrom352x288_Bilinear (4271 ms)
LibYUVScaleTest.ARGBScaleClipTo640x360_Linear (4241 ms)
LibYUVScaleTest.ScaleFrom640x360_Box (4232 ms)
LibYUVScaleTest.ScaleFrom640x360_Bilinear (4231 ms)
LibYUVConvertTest.I444ToARGB_Unaligned (4229 ms)
LibYUVConvertTest.I420ToARGB1555_Any (4214 ms)
LibYUVConvertTest.ARGBToI444_Opt (4122 ms)
LibYUVConvertTest.I411ToARGB_Unaligned (4090 ms)
LibYUVScaleTest.ARGBScaleFrom569x480_Linear (4083 ms)
LibYUVConvertTest.ARGB4444ToI420_Any (4028 ms)
LibYUVScaleTest.ARGBScaleFrom320x240_Bilinear (4026 ms)
LibYUVConvertTest.RGB565ToI420_Unaligned (4026 ms)
LibYUVConvertTest.I420ToRAW_Any (4023 ms)
LibYUVConvertTest.ARGB1555ToI420_Unaligned (4018 ms)
LibYUVPlanarTest.ARGBMultiply_Unaligned (3990 ms)
LibYUVConvertTest.UYVYToARGB_Invert (3979 ms)
LibYUVConvertTest.I444ToARGB_Opt (3974 ms)
LibYUVConvertTest.J444ToARGB_Opt (3967 ms)
LibYUVConvertTest.I420ToARGB_Unaligned (3966 ms)
LibYUVConvertTest.J420ToARGB_Any (3965 ms)
LibYUVConvertTest.J444ToARGB_Any (3959 ms)
LibYUVConvertTest.ARGBToI444_Any (3953 ms)
LibYUVConvertTest.UYVYToARGB_Unaligned (3940 ms)
LibYUVConvertTest.I420ToBGRA_Any (3925 ms)
LibYUVConvertTest.YUY2ToARGB_Invert (3923 ms)
LibYUVConvertTest.NV12ToRGB565_Any (3922 ms)
LibYUVConvertTest.I420ToRGBA_Any (3918 ms)
LibYUVConvertTest.H420ToARGB_Unaligned (3914 ms)
LibYUVRotateTest.ARGBRotate180_Odd (3889 ms)
LibYUVConvertTest.ARGB1555ToI420_Invert (3859 ms)
LibYUVConvertTest.J420ToABGR_Any (3850 ms)
LibYUVConvertTest.J444ToARGB_Unaligned (3849 ms)
LibYUVConvertTest.RGB565ToI420_Invert (3844 ms)
LibYUVConvertTest.H420ToABGR_Unaligned (3836 ms)
LibYUVConvertTest.ARGB1555ToI420_Opt (3833 ms)
LibYUVConvertTest.RGB565ToI420_Opt (3828 ms)
LibYUVConvertTest.I420ToRGB24_Any (3820 ms)
LibYUVConvertTest.NV12ToARGB_Any (3817 ms)
LibYUVPlanarTest.TestRGBColorMatrix (3815 ms)
LibYUVConvertTest.NV21ToARGB_Any (3799 ms)

Original comment by fbarch...@chromium.org on 24 Oct 2015 at 12:19

GoogleCodeExporter commented 8 years ago
LIBYUV_FLAGS=-1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 perf record out/Release/libyuv_unittest
perf report
  8.94%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
  6.29%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
  6.00%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  4.51%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
  4.28%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
  3.94%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
  3.71%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
  3.54%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
  3.37%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
  2.17%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
  2.09%  libyuv_unittest  libyuv_unittest      [.] I444ToARGBRow_SSSE3
  2.03%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
  1.98%  libyuv_unittest  libyuv_unittest      [.] I422ToRGBARow_AVX2
  1.85%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_AVX2
  1.79%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
  1.74%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
  1.72%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_AVX2
  1.41%  libyuv_unittest  libyuv_unittest      [.] ScaleAddCols1_C
  1.36%  libyuv_unittest  libyuv_unittest      [.] I422AlphaToARGBRow_AVX2
  1.14%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
  1.14%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBClipTestFilter(int, int, int, int, libyuv::FilterMode, int)
  1.09%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
  1.04%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_SSSE3
  1.04%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_AVX2
  0.91%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB24Row_SSSE3
  0.84%  libyuv_unittest  libyuv_unittest      [.] SobelXRow_SSE2
  0.84%  libyuv_unittest  libyuv_unittest      [.] SobelYRow_SSE2
  0.80%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565Row_SSE2
  0.78%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_Fast_SSSE3
  0.77%  libyuv_unittest  libyuv_unittest      [.] FixedDiv1_X86
  0.74%  libyuv_unittest  libyuv_unittest      [.] ScaleSlope
  0.71%  libyuv_unittest  libyuv_unittest      [.] I411ToARGBRow_SSSE3
  0.68%  libyuv_unittest  libyuv_unittest      [.] next_marker
  0.66%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C
  0.62%  libyuv_unittest  libyuv_unittest      [.] ARGBToARGB1555Row_SSE2
  0.59%  libyuv_unittest  libyuv_unittest      [.] ARGBToUVRow_AVX2
  0.58%  libyuv_unittest  libyuv_unittest      [.] ARGBScaleClip
  0.58%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_Any_AVX2
  0.55%  libyuv_unittest  libyuv_unittest      [.] ARGBToYJRow_AVX2
  0.54%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV422Row_SSSE3
  0.48%  libyuv_unittest  libc-2.19.so         [.] _int_free
  0.45%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_3_Box_SSSE3
  0.45%  libyuv_unittest  libyuv_unittest      [.] libyuv::ARGBTestFilter(int, int, int, int, libyuv::FilterMode, int, int, int)
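To see how much of the profile is spent in each instruction-set variant, the report above can be grouped by the symbol suffix (`_AVX2`, `_SSSE3`, `_SSE2`, `_C`, and so on). This is only a rough sketch: it assumes the `perf report` text has the four-column layout shown above, and `share_by_suffix` is a hypothetical helper name, not part of perf or libyuv.

```python
import re
from collections import defaultdict

# One report row: "<pct>%  <command>  <shared object>  [.] <symbol>"
LINE_RE = re.compile(r"^\s*([\d.]+)%\s+\S+\s+\S+\s+\[\.\]\s+(.+?)\s*$")
# Suffixes seen in the report above; anything else is counted as "other".
SUFFIX_RE = re.compile(r"_(AVX2|SSSE3|SSE2|ERMS|X86|C)$")


def share_by_suffix(report):
    """Sum the percentage column per symbol suffix."""
    totals = defaultdict(float)
    for line in report.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers and blank lines
        pct, symbol = float(m.group(1)), m.group(2)
        suffix = SUFFIX_RE.search(symbol)
        totals[suffix.group(1) if suffix else "other"] += pct
    return dict(totals)


sample = """\
  8.94%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
  6.29%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_AVX2
  6.00%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  3.71%  libyuv_unittest  libyuv_unittest      [.] ScaleARGB
  1.74%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
"""

result = share_by_suffix(sample)
for suffix, pct in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{suffix:6s} {pct:5.2f}%")
```

Rows in the `_C` bucket are the ones still running the plain C fallback, so they are the first candidates for a SIMD port.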

Original comment by fbarch...@chromium.org on 10 Nov 2015 at 7:42

GoogleCodeExporter commented 8 years ago
I444ToARGBRow_SSSE3 needs AVX2 port.

SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)

AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)
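The totals above (3389 ms vs 2615 ms) work out to roughly a 1.30x speedup. For a per-test comparison, the two logs can be parsed and divided. A minimal sketch, assuming both runs are saved as text; `parse_times` and `speedups` are hypothetical helper names, not part of gtest or libyuv.

```python
import re

# Matches gtest result lines such as
# "[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)"
OK_RE = re.compile(r"\[\s+OK \] (\S+) \((\d+) ms\)")


def parse_times(log):
    """Map test name -> elapsed milliseconds for every passed test in the log."""
    return {m.group(1): int(m.group(2)) for m in OK_RE.finditer(log)}


def speedups(before, after):
    """Return before/after time ratio for tests present in both runs."""
    return {
        name: before[name] / after[name]
        for name in before
        if name in after and after[name] > 0
    }


ssse3_log = """
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
"""
avx2_log = """
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
"""

ratios = speedups(parse_times(ssse3_log), parse_times(avx2_log))
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}x")
```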

Original comment by fbarch...@chromium.org on 14 Nov 2015 at 2:29

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/1019e4537fc1bfc6ee505cd1c628b645c7e966b7

commit 1019e4537fc1bfc6ee505cd1c628b645c7e966b7
Author: Frank Barchard <fbarchard@google.com>
Date: Sat Nov 14 02:31:22 2015

port I444ToARGB avx2 code from Visual C to GCC.

SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)

AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1445893002 .

[modify] http://crrev.com/1019e4537fc1bfc6ee505cd1c628b645c7e966b7/include/libyuv/row.h
[modify] http://crrev.com/1019e4537fc1bfc6ee505cd1c628b645c7e966b7/source/row_gcc.cc

Original comment by bugdroid1@chromium.org on 14 Nov 2015 at 2:32

GoogleCodeExporter commented 8 years ago
util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1" | grep ms | sed 's/\(.*(\)\([0-9]*\)\( ms)\)/\2 - \1\2\3/g' | sort -rn | sed 's/.*- \(.*\)/\1/g'
I 3385.631s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ScaleDownBy8_Box (212336 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.TestARGBPolynomial (62884 ms)
I 3385.632s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ScaleDownBy3_Box (45134 ms)
I 3385.620s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by8_Box (41680 ms)
I 3385.616s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy4_Box (40355 ms)
I 3385.620s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by8_Bilinear (39277 ms)
I 1687.272s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVRotateTest.ARGBRotate270 (37779 ms)
I 1687.271s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVRotateTest.ARGBRotate90 (37493 ms)
I 3385.617s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy8_Bilinear (35383 ms)
I 3385.618s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy8_Box (35314 ms)
I 3385.619s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by4_Box (32276 ms)
I 3385.619s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by4_Bilinear (32001 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlur_Invert (31007 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlur_Opt (30818 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlur_Unaligned (30766 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlur_Any (30736 ms)
I 3385.616s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownBy4_Box (29546 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlurSmall_Invert (27381 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlurSmall_Unaligned (27267 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlurSmall_Any (27204 ms)
I 1687.270s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVPlanarTest.ARGBBlurSmall_Opt (27136 ms)
I 3385.619s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by8_Linear (25732 ms)
I 3385.627s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleClipFrom569x480_Bilinear (25521 ms)
I 3385.620s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3_Linear (25312 ms)
I 3385.618s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3by4_Linear (24994 ms)
I 3385.621s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3_Bilinear (24767 ms)
I 3385.621s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3_Box (24655 ms)
I 3385.620s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleDownClipBy3_None (24416 ms)
I 3385.628s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleClipFrom640x360_Bilinear (24332 ms)
I 3385.625s run_tests_on_device(HT4A2JT03762)  [       OK ] LibYUVScaleTest.ARGBScaleClipFrom352x288_Bilinear (22861 ms)
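The `grep | sed | sort | sed` pipeline above copies the millisecond count to the front of each line, sorts numerically in descending order, then strips the copy again. The same ordering can be had in a few lines of Python if the shell quoting gets in the way. This is a sketch, not a drop-in replacement: `sort_by_ms` is a hypothetical helper, and unlike `grep ms` it keeps only lines with a per-test `(N ms)` field, so summary lines such as `(21920 ms total)` are dropped.

```python
import re

# Matches the "(1234 ms)" field gtest prints for each finished test.
MS_RE = re.compile(r"\((\d+) ms\)")


def sort_by_ms(lines):
    """Return the lines that carry a "(N ms)" field, slowest first."""
    timed = []
    for line in lines:
        m = MS_RE.search(line)
        if m:
            timed.append((int(m.group(1)), line))
    # Sort on the numeric field only; ties keep their original order.
    timed.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in timed]


sample = [
    "[       OK ] LibYUVScaleTest.ScaleDownBy3_Box (45134 ms)",
    "[ RUN      ] LibYUVScaleTest.ScaleDownBy8_Box",
    "[       OK ] LibYUVScaleTest.ScaleDownBy8_Box (212336 ms)",
    "[----------] 20 tests from LibYUVConvertTest (21920 ms total)",
    "[       OK ] LibYUVPlanarTest.TestARGBPolynomial (62884 ms)",
]

ordered = sort_by_ms(sample)
for line in ordered:
    print(line)
```

To use it on a real run, pass the captured log lines (for example `sys.stdin`) to `sort_by_ms` instead of `sample`.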

Original comment by fbarch...@chromium.org on 14 Nov 2015 at 2:39

GoogleCodeExporter commented 8 years ago
LIBYUV_FLAGS=-1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 perf record out/Release/libyuv_unittest --gtest_filter=*I444ToARGB*

AVX2 I444ToARGB_Opt (315 ms)
SSSE3 I444ToARGB_Opt (408 ms)
C I444ToARGB_Opt (4329 ms)

Original comment by fbarch...@chromium.org on 14 Nov 2015 at 2:56

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/0815568a502c509970cd1177ed6f908305adcaa0

commit 0815568a502c509970cd1177ed6f908305adcaa0
Author: Frank Barchard <fbarchard@google.com>
Date: Tue Nov 17 08:04:03 2015

test for unaligned vs aligned for CopyRow_SSE2

improves performance on older CPUs where movdqa is faster.
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1455463002 .

[modify] http://crrev.com/0815568a502c509970cd1177ed6f908305adcaa0/README.chromium
[modify] http://crrev.com/0815568a502c509970cd1177ed6f908305adcaa0/include/libyuv/version.h
[modify] http://crrev.com/0815568a502c509970cd1177ed6f908305adcaa0/source/row_gcc.cc
[modify] http://crrev.com/0815568a502c509970cd1177ed6f908305adcaa0/source/row_win.cc
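The diff itself is not quoted in this thread, so the following only illustrates the idea named in the commit message. It assumes the aligned path (`movdqa`) is taken when both the source and destination addresses are multiples of 16, and that the unaligned path (`movdqu`) is used otherwise. `is_aligned` and `pick_copy_loop` are hypothetical names used for illustration; the real change is in assembly in `row_gcc.cc` and `row_win.cc`.

```python
def is_aligned(addr, boundary=16):
    """True when addr is a multiple of boundary (boundary must be a power of two)."""
    return (addr & (boundary - 1)) == 0


def pick_copy_loop(src_addr, dst_addr):
    """Choose the aligned loop only when both pointers are 16-byte aligned.

    Assumption for illustration: one misaligned pointer is enough to
    fall back to the unaligned loop.
    """
    if is_aligned(src_addr) and is_aligned(dst_addr):
        return "movdqa"  # aligned loads/stores
    return "movdqu"      # unaligned loads/stores


print(pick_copy_loop(0x1000, 0x2000))
print(pick_copy_loop(0x1001, 0x2000))
```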

Original comment by bugdroid1@chromium.org on 17 Nov 2015 at 8:04