dilagurung / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

llvm is slow on min/max #221

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Unit tests on OS X are slow when built with llvm/clang.
Affected tests: ARGBSubtract, ARGBAdd and I420ToARGB, when running the C version.

Original issue reported on code.google.com by fbarch...@chromium.org on 18 Apr 2013 at 9:28

GoogleCodeExporter commented 9 years ago
ARGBSubtract_Unaligned (12619 ms)
ARGBSubtract_Opt (346 ms)
ARGBAdd_Unaligned (13690 ms)
ARGBAdd_Any (388 ms)
ARGBAdd_Opt (371 ms)
ARGBMultiply_Unaligned (2805 ms)
ARGBMultiply_Opt (425 ms)

Original comment by phthor...@gmail.com on 19 Apr 2013 at 9:45

GoogleCodeExporter commented 9 years ago
add/subtract fixed in r672
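
For readers unfamiliar with these tests, here is a minimal sketch of the per-channel saturating add and subtract that the C paths of ARGBAdd and ARGBSubtract compute, which is where the min/max clamping from the issue title comes in. The helper names (Clamp255, ArgbAddRowSketch, ArgbSubtractRowSketch) are hypothetical, and this is not the code changed in r672; it only illustrates the operation being timed.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not libyuv's actual source. */

/* Clamp a signed intermediate value to the 0..255 range of one channel. */
static inline uint8_t Clamp255(int v) {
  v = v < 0 ? 0 : v;                    /* max(v, 0) */
  return (uint8_t)(v > 255 ? 255 : v);  /* min(v, 255) */
}

/* Saturating add of two ARGB rows; each pixel is 4 bytes. */
static void ArgbAddRowSketch(const uint8_t* a, const uint8_t* b,
                             uint8_t* dst, int width) {
  for (int i = 0; i < width * 4; ++i) {
    dst[i] = Clamp255(a[i] + b[i]);
  }
}

/* Saturating subtract of two ARGB rows; each pixel is 4 bytes. */
static void ArgbSubtractRowSketch(const uint8_t* a, const uint8_t* b,
                                  uint8_t* dst, int width) {
  for (int i = 0; i < width * 4; ++i) {
    dst[i] = Clamp255(a[i] - b[i]);
  }
}
```

How well a compiler turns the two ternaries in Clamp255 into branch-free code can dominate the cost of loops like these, which is one plausible reason the C timings differ so much between compilers.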

C versions
3063 - [       OK ] libyuvTest.ARGBSubtract_Unaligned (3063 ms)
3054 - [       OK ] libyuvTest.ARGBSubtract_Opt (3054 ms)
3054 - [       OK ] libyuvTest.ARGBSubtract_Any (3054 ms)
3042 - [       OK ] libyuvTest.ARGBSubtract_Invert (3042 ms)
3483 - [       OK ] libyuvTest.ARGBAdd_Any (3483 ms)
3472 - [       OK ] libyuvTest.ARGBAdd_Invert (3472 ms)
3468 - [       OK ] libyuvTest.ARGBAdd_Unaligned (3468 ms)
3467 - [       OK ] libyuvTest.ARGBAdd_Opt (3467 ms)
2876 - [       OK ] libyuvTest.ARGBMultiply_Opt (2876 ms)
2799 - [       OK ] libyuvTest.ARGBMultiply_Invert (2799 ms)
2768 - [       OK ] libyuvTest.ARGBMultiply_Any (2768 ms)
2702 - [       OK ] libyuvTest.ARGBMultiply_Unaligned (2702 ms)

Opt versions improved
363 - [       OK ] libyuvTest.ARGBSubtract_Unaligned (363 ms)
360 - [       OK ] libyuvTest.ARGBSubtract_Any (360 ms)
340 - [       OK ] libyuvTest.ARGBSubtract_Invert (340 ms)
337 - [       OK ] libyuvTest.ARGBSubtract_Opt (337 ms)
377 - [       OK ] libyuvTest.ARGBSubtract_Unaligned (377 ms)
363 - [       OK ] libyuvTest.ARGBSubtract_Any (363 ms)
350 - [       OK ] libyuvTest.ARGBSubtract_Opt (350 ms)
345 - [       OK ] libyuvTest.ARGBSubtract_Invert (345 ms)
453 - [       OK ] libyuvTest.ARGBMultiply_Unaligned (453 ms)
431 - [       OK ] libyuvTest.ARGBMultiply_Any (431 ms)
426 - [       OK ] libyuvTest.ARGBMultiply_Opt (426 ms)
418 - [       OK ] libyuvTest.ARGBMultiply_Invert (418 ms)

Original comment by fbarch...@chromium.org on 19 Apr 2013 at 10:14

GoogleCodeExporter commented 9 years ago
bash-3.2$ ./runem | grep OK

C
8657 - [       OK ] libyuvTest.I420ToARGB_Opt (8657 ms)
13823 - [       OK ] libyuvTest.ARGBSobel_Opt (13823 ms)
6675 - [       OK ] libyuvTest.TestARGBColorMatrix (6675 ms)
3702 - [       OK ] libyuvTest.TestARGBSepia (3702 ms)

Opt
549 - [       OK ] libyuvTest.I420ToARGB_Opt (549 ms)
935 - [       OK ] libyuvTest.ARGBSobel_Opt (935 ms)
603 - [       OK ] libyuvTest.TestARGBColorMatrix (603 ms)
607 - [       OK ] libyuvTest.TestARGBSepia (607 ms)

Original comment by fbarch...@chromium.org on 20 Apr 2013 at 8:49

GoogleCodeExporter commented 9 years ago
These functions still need llvm optimization review:
I420ToRGBA_Opt (5610 ms)
I420ToBGRA_Opt (5599 ms)
I420ToARGB1555_Opt (5483 ms)
I420ToABGR_Opt (5453 ms)
I420ToARGB4444_Opt (5158 ms)
I420ToRGB565_Opt (5138 ms)
I420ToBayerBGGR_Opt (4821 ms)
I420ToBayerRGGB_Opt (4817 ms)
I420ToBayerGBRG_Opt (4817 ms)
I420ToBayerGRBG_Opt (4815 ms)
I420ToARGB_Opt (4509 ms)
I420ToRGB24_Opt (4407 ms)
I420ToRAW_Opt (4396 ms)
Remove yuvpixel and use yuvpixel2.

Original comment by fbarch...@chromium.org on 23 Apr 2013 at 4:00

GoogleCodeExporter commented 9 years ago
fixed in r676

I420ToARGB_Opt (5887 ms)
I420ToBGRA_Opt (5887 ms)
I420ToABGR_Opt (5889 ms)
I420ToRGBA_Opt (5844 ms)
I420ToRAW_Opt (5671 ms)
I420ToRGB24_Opt (5688 ms)
I420ToRGB565_Opt (6654 ms)
I420ToARGB1555_Opt (6767 ms)
I420ToARGB4444_Opt (6488 ms)
I420ToBayerBGGR_Opt (6094 ms)
I420ToBayerRGGB_Opt (6284 ms)
I420ToBayerGBRG_Opt (6180 ms)
I420ToBayerGRBG_Opt (6073 ms)
libyuvTest (79407 ms total)

Original comment by fbarch...@chromium.org on 23 Apr 2013 at 9:03