hushuitian / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

Odd width performance #431

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The following are functions most affected by odd width:
Function                        odd (ms)  even (ms)  odd/even
ScaleDownBy3by8_Box                 2942        276  1065.94%
ScaleDownBy4_Box                    1218        136   895.59%
ScaleDownBy3by4_Box                 1837        291   631.27%
ScaleDownBy3by4_Bilinear            1834        293   625.94%
ARGBScaleDownBy2_Bilinear           2528        410   616.59%
ScaleDownBy3by4_Linear              1557        278   560.07%
ARGBScaleDownBy2_Linear             1626        334   486.83%
ScaleDownBy3by4_None                 842        199   423.12%
ScaleDownBy2_Bilinear                441        170   259.41%
ScaleDownBy2_Box                     417        169   246.75%
ScaleDownBy2_Linear                  345        140   246.43%
ScaleDownBy3by8_None                 244        100   244.00%
ScaleDownBy3by8_Bilinear             562        278   202.16%
ScaleDownBy2_None                    221        116   190.52%
ARGBSobel_Opt                       5004       2699   185.40%
ARGBSobelToPlane_Opt                3692       2025   182.32%
ARGBScaleDownBy4_Bilinear            406        223   182.06%
ARGBScaleDownClipBy4_Bilinear        929        512   181.45%
ARGBSobelXY_Opt                     4786       2660   179.92%
ScaleTo1x1_Box                       498        277   179.78%
ARGBSobelToPlane_Unaligned          3712       2081   178.38%
ARGBSobelToPlane_Invert             3720       2106   176.64%
ScaleDownBy4_None                    129         74   174.32%
ARGBSobelXY_Unaligned               4806       2781   172.82%
ARGBSobel_Invert                    4687       2719   172.38%
ARGBSobel_Unaligned                 4658       2722   171.12%
ScaleTo640x360_Linear                345        202   170.79%
ScaleTo640x360_Bilinear              344        202   170.30%
ARGBSobelXY_Invert                  4802       2843   168.91%
ScaleDownBy3by8_Linear               429        269   159.48%
I420ToI422_Invert                    529        338   156.51%
ARGBScaleClipTo640x360_Linear        911        611   149.10%
ARGBScaleDownClipBy4_Linear          622        424   146.70%
ARGBInterpolate128_Opt              2584       1777   145.41%
ARGBInterpolate192_Any              2637       1835   143.71%
ScaleTo640x360_Box                  2120       1487   142.57%
I420ToI411_Invert                   1054        763   138.14%
ARGBInterpolate192_Opt              2873       2135   134.57%
ARGBScaleTo640x360_Bilinear          296        221   133.94%
ARGBScaleTo640x360_Linear            293        219   133.79%
ARGBToARGBMirror_Any                1135        853   133.06%
I420ToNV21_Any                       350        265   132.08%
I420ToI411_Opt                       991        753   131.61%
I420ToI422_Opt                       434        330   131.52%
I420ToI411_Unaligned                 992        762   130.18%
SetPlane_Unaligned                   111         86   129.07%
ScaleDownBy3_Box                    1908       1484   128.57%
ARGBInterpolate128_Invert           2409       1893   127.26%
I422ToI422_Invert                    472        374   126.20%
I420ToI444_Opt                      2998       2380   125.97%
ScaleDownBy3_Linear                  255        203   125.62%
ARGBToYUY2_Invert                   1581       1259   125.58%
I420ToARGB1555_Opt                  2696       2156   125.05%
NV21ToRGB565_Any                    2479       1987   124.76%
NV12ToRGB565_Opt                    2419       1946   124.31%

This script runs every test at an "odd" width (1914, which is not a multiple of 4, 8, or 16) and then at an even width (1920).

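rem Keep the environment changes local to this script.
setlocal

rem Pass 1: run every test at the odd width.
set LIBYUV_WIDTH=1914
set LIBYUV_HEIGHT=1080
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1

@call :runtest *

rem Pass 2: run every test at the even width for comparison.
set LIBYUV_WIDTH=1920
set LIBYUV_HEIGHT=1080
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1

@call :runtest *
goto :eof

rem Run the tests matching the filter and keep only the timing lines.
:runtest
out\release\libyuv_unittest --gtest_filter=*%* | findstr /r "^[^_]*_[^_]*ms"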

Then paste the output of both runs into a document and sort by the odd/even 
ratio, largest first.
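The sorting step can also be scripted instead of done by hand. The following is a minimal sketch, assuming each timing line ends in the form `Name (123 ms)` as in the table above. The names `parse_timings` and `rank_by_ratio` are illustrative only and are not part of libyuv.

```python
import re

# Matches the tail of a timing line, e.g. "ScaleDownBy2_Box (417 ms)".
TIMING = re.compile(r"(\w+)\s+\((\d+) ms\)")


def parse_timings(text):
    """Return {test_name: milliseconds} for every timing line in text."""
    timings = {}
    for line in text.splitlines():
        match = TIMING.search(line)
        if match:
            timings[match.group(1)] = int(match.group(2))
    return timings


def rank_by_ratio(odd_text, even_text):
    """Return (name, odd_ms, even_ms, ratio) rows, largest odd/even ratio first."""
    odd = parse_timings(odd_text)
    even = parse_timings(even_text)
    rows = [
        (name, odd[name], even[name], odd[name] / even[name])
        for name in odd
        if even.get(name, 0) > 0  # skip tests missing from the even run or timed at 0 ms
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Two sample lines per run, taken from the table above.
odd_run = "ScaleDownBy2_Box (417 ms)\nScaleDownBy4_Box (1218 ms)\n"
even_run = "ScaleDownBy2_Box (169 ms)\nScaleDownBy4_Box (136 ms)\n"

ranked = rank_by_ratio(odd_run, even_run)
for name, odd_ms, even_ms, ratio in ranked:
    print(f"{name:<30}{odd_ms:>8}{even_ms:>10}{ratio:>10.2%}")
```

To use it on real data, capture the output of each pass to a file and pass the two file contents to `rank_by_ratio`.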
A large odd/even ratio typically indicates that the function falls back on the C code for the entire function.
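One common alternative to a full C fallback is to run the optimized path on the largest aligned prefix of each row and handle only the leftover pixels in C. The sketch below only computes that split. The 16-pixel step is an assumption for illustration, since the real step depends on the function and the instruction set, and `split_row` is a hypothetical helper, not a libyuv function.

```python
def split_row(width, simd_step):
    """Split a row into an aligned prefix and a scalar remainder.

    simd_step is the number of pixels handled per optimized iteration
    (assumed to be a power of two, e.g. 16).
    """
    if simd_step <= 0 or simd_step & (simd_step - 1):
        raise ValueError("simd_step must be a positive power of two")
    remainder = width & (simd_step - 1)
    return width - remainder, remainder


for width in (1914, 1912, 1920):
    simd_pixels, c_pixels = split_row(width, 16)
    print(f"width {width}: {simd_pixels} pixels optimized, {c_pixels} pixels in C")
```

With this split, only a handful of pixels per row would take the slow path, rather than the whole row.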

Original issue reported on code.google.com by fbarch...@google.com on 23 Apr 2015 at 7:12

GoogleCodeExporter commented 9 years ago
Change the odd-width test to 1912, which is divisible by 4 and so allows the 
1/4 and 3/4 scale factors to be tested. Otherwise scaling falls back on the 
general-purpose scaler instead of the specialized scaler.

TODO: consider a scale factor unittest that ensures the source and destination 
sizes test the exact factor, including on chroma, which is 1/2 size.
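A check along the lines of this todo might look like the sketch below. It assumes that chroma planes are half the luma size, rounded up, and that a factor counts as exact when the scaled dimension is a whole number. `exact_scale` and `check_factor` are hypothetical helper names, not libyuv functions.

```python
from fractions import Fraction


def exact_scale(size, factor):
    """Return size * factor if that is a whole number, else None."""
    scaled = size * factor
    return int(scaled) if scaled.denominator == 1 else None


def check_factor(width, height, factor):
    """True if the luma and half-size chroma planes both scale by exactly factor."""
    chroma_w = (width + 1) // 2  # assumed: chroma is half size, rounded up
    chroma_h = (height + 1) // 2
    return all(
        exact_scale(dim, factor) is not None
        for dim in (width, height, chroma_w, chroma_h)
    )


for width in (1914, 1912, 1920):
    for factor in (Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(3, 8)):
        print(width, factor, check_factor(width, 1080, factor))
```

Under these assumptions, 1912x1080 passes for 1/4 and 3/4 while 1914x1080 does not. The 3/8 factor fails at 1080 rows even when the luma plane scales exactly, because the 540-row chroma plane does not, which is the case the todo singles out.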

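setlocal

rem Pass 1: run every test at the odd width (now 1912).
set LIBYUV_WIDTH=1912
set LIBYUV_HEIGHT=1080
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1

@call :runtest *

rem Pass 2: run every test at the even width for comparison.
set LIBYUV_WIDTH=1920
set LIBYUV_HEIGHT=1080
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1

@call :runtest *
goto :eof

rem Run the tests matching the filter and keep only the timing lines.
:runtest
out\release\libyuv_unittest --gtest_filter=*%* | findstr /r "^[^_]*_[^_]*ms"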

Original comment by fbarch...@google.com on 29 Apr 2015 at 8:52