watery01 / libyuv

Automatically exported from code.google.com/p/libyuv

ARGBToI420 for Neon #107

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
This function is currently unoptimized on NEON.
It needs two low-level row functions: ARGBToY_NEON and ARGBToUV_NEON.
The same should be done for BGRA, ABGR, RGBA and perhaps RGB24/RAW.
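
For readers unfamiliar with the split into a Y row function and a UV row function, here is a minimal scalar sketch in C of what the two low levels compute and how a driver could combine them. It is not the libyuv source: the `Sketch*` names are hypothetical, the fixed-point BT.601 studio-range coefficients and the rounded 2x2 averaging are assumptions, and width and height are assumed even. ARGB is taken to be stored in memory as B, G, R, A.

```c
#include <stdint.h>

/* Assumed fixed-point BT.601 studio-range coefficients (illustrative). */
static uint8_t SketchRGBToY(int r, int g, int b) {
  return (uint8_t)((66 * r + 129 * g + 25 * b + 0x1080) >> 8);
}
static uint8_t SketchRGBToU(int r, int g, int b) {
  return (uint8_t)((112 * b - 74 * g - 38 * r + 0x8080) >> 8);
}
static uint8_t SketchRGBToV(int r, int g, int b) {
  return (uint8_t)((112 * r - 94 * g - 18 * b + 0x8080) >> 8);
}

/* Low level 1: one row of ARGB (bytes B, G, R, A) to one row of Y. */
static void SketchARGBToYRow(const uint8_t* src_argb, uint8_t* dst_y,
                             int width) {
  for (int x = 0; x < width; ++x) {
    dst_y[x] = SketchRGBToY(src_argb[2], src_argb[1], src_argb[0]);
    src_argb += 4;
  }
}

/* Low level 2: two rows of ARGB to one half-width row each of U and V.
   Each output sample is computed from the rounded average of a 2x2 block. */
static void SketchARGBToUVRow(const uint8_t* src_argb, int src_stride,
                              uint8_t* dst_u, uint8_t* dst_v, int width) {
  const uint8_t* next = src_argb + src_stride;
  for (int x = 0; x < width; x += 2) {
    int b = (src_argb[0] + src_argb[4] + next[0] + next[4] + 2) >> 2;
    int g = (src_argb[1] + src_argb[5] + next[1] + next[5] + 2) >> 2;
    int r = (src_argb[2] + src_argb[6] + next[2] + next[6] + 2) >> 2;
    *dst_u++ = SketchRGBToU(r, g, b);
    *dst_v++ = SketchRGBToV(r, g, b);
    src_argb += 8;
    next += 8;
  }
}

/* Driver: per pair of source rows, one UV pass and two Y passes. */
static void SketchARGBToI420(const uint8_t* src_argb, int src_stride,
                             uint8_t* dst_y, int dst_stride_y,
                             uint8_t* dst_u, int dst_stride_u,
                             uint8_t* dst_v, int dst_stride_v,
                             int width, int height) {
  for (int y = 0; y < height; y += 2) {
    SketchARGBToUVRow(src_argb, src_stride, dst_u, dst_v, width);
    SketchARGBToYRow(src_argb, dst_y, width);
    SketchARGBToYRow(src_argb + src_stride, dst_y + dst_stride_y, width);
    src_argb += 2 * src_stride;
    dst_y += 2 * dst_stride_y;
    dst_u += dst_stride_u;
    dst_v += dst_stride_v;
  }
}
```

A NEON port would replace the two row loops with vector code and leave the driver unchanged, which is presumably why the issue asks for exactly those two functions.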

Original issue reported on code.google.com by fbarch...@google.com on 1 Oct 2012 at 6:46

GoogleCodeExporter commented 9 years ago
Arm
ARGBToI420_OptVsC (40933 ms)
RGB24ToI420_OptVsC (64608 ms)
RGB565ToI420_OptVsC (71269 ms)

SSSE3
ARGBToI420_OptVsC (635 ms)

Needs a Neon port.

Original comment by fbarch...@google.com on 9 Oct 2012 at 5:25

GoogleCodeExporter commented 9 years ago
In r425, ARGBToI420 is ported to NEON for the Y channel:

sudo LIBYUV_REPEAT=1000 nice --5 ./libyuv_unittest --gtest_filter=*R*ToI*Opt | grep ms
ARGBToI420_Opt (6406 ms)
BGRAToI420_Opt (12089 ms)
ABGRToI420_Opt (12108 ms)
RGBAToI420_Opt (12048 ms)
RAWToI420_Opt (15801 ms)
RGB24ToI420_Opt (17645 ms)
RGB565ToI420_Opt (20757 ms)
ARGB1555ToI420_Opt (19697 ms)
ARGB4444ToI420_Opt (20080 ms)
ARGBToI422_Opt (17048 ms)
BayerBGGRToI420_Opt (16300 ms)
BayerRGGBToI420_Opt (16907 ms)
BayerGBRGToI420_Opt (15888 ms)
BayerGRBGToI420_Opt (15662 ms)
ARGBToI400_Opt (8520 ms)

Original comment by fbarch...@chromium.org on 17 Oct 2012 at 3:52

GoogleCodeExporter commented 9 years ago
r474 has Y channel in Neon
ARGBToI444_Opt (9511 ms)
ARGBToI422_Opt (7483 ms)
ARGBToI420_Opt (5348 ms)
ARGBToI411_Opt (5348 ms)
ARGBToI400_Opt (1839 ms)

Original comment by fbarch...@google.com on 6 Nov 2012 at 6:02

GoogleCodeExporter commented 9 years ago
r478 has UV channel in Neon
ARGBToI444_Opt (3675 ms)
ARGBToI422_Opt (3473 ms)
ARGBToI411_Opt (3024 ms)
ARGBToI420_Opt (2989 ms)
ARGBToI400_Opt (1853 ms)

Other RGB formats use 2 Neon steps:
BayerGRBGToI420_Opt (5822 ms)
BayerBGGRToI420_Opt (5821 ms)
BayerGBRGToI420_Opt (5774 ms)
BayerRGGBToI420_Opt (5768 ms)
ABGRToI420_Opt (5475 ms)
BGRAToI420_Opt (5470 ms)
RGBAToI420_Opt (5468 ms)
ARGB1555ToI420_Opt (4825 ms)
RGB565ToI420_Opt (4688 ms)
ARGB4444ToI420_Opt (4610 ms)
RGB24ToI420_Opt (4072 ms)
RAWToI420_Opt (4048 ms)
ARGBToI444_Opt (3675 ms)
ARGBToI422_Opt (3447 ms)
ARGBToI420_Opt (3019 ms)
ARGBToI411_Opt (2947 ms)
ARGBToI400_Opt (1841 ms)
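
One plausible reading of "2 steps" above is that a row in another RGB layout is first expanded to ARGB in a temporary buffer and then passed to the ARGB row function. The scalar C sketch below illustrates that idea for the Y channel only. The `TwoStep*` names, the fixed-size row buffer, the luma coefficients and the assumed RGB24 byte order (B, G, R) are illustrative assumptions, not the code in r478.

```c
#include <stdint.h>

enum { kTwoStepMaxWidth = 64 };  /* Hypothetical cap for the temporary row. */

/* Step 1: expand RGB24 (assumed bytes B, G, R) to ARGB (bytes B, G, R, A). */
static void TwoStepRGB24ToARGBRow(const uint8_t* src_rgb24, uint8_t* dst_argb,
                                  int width) {
  for (int x = 0; x < width; ++x) {
    dst_argb[0] = src_rgb24[0];
    dst_argb[1] = src_rgb24[1];
    dst_argb[2] = src_rgb24[2];
    dst_argb[3] = 255;  /* Opaque alpha. */
    src_rgb24 += 3;
    dst_argb += 4;
  }
}

/* Step 2: ARGB row to Y row, using assumed BT.601 fixed-point coefficients. */
static void TwoStepARGBToYRow(const uint8_t* src_argb, uint8_t* dst_y,
                              int width) {
  for (int x = 0; x < width; ++x) {
    int b = src_argb[0];
    int g = src_argb[1];
    int r = src_argb[2];
    dst_y[x] = (uint8_t)((66 * r + 129 * g + 25 * b + 0x1080) >> 8);
    src_argb += 4;
  }
}

/* Chains both steps through a temporary ARGB row.
   Returns 0 on success, -1 if the row does not fit the sketch buffer. */
static int TwoStepRGB24ToYRow(const uint8_t* src_rgb24, uint8_t* dst_y,
                              int width) {
  uint8_t row[kTwoStepMaxWidth * 4];
  if (width < 0 || width > kTwoStepMaxWidth) {
    return -1;
  }
  TwoStepRGB24ToARGBRow(src_rgb24, row, width);
  TwoStepARGBToYRow(row, dst_y, width);
  return 0;
}
```

The extra pass through the temporary row would be consistent with these formats running slower than ARGBToI420 in the timings above, although the thread does not state the cause.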

Original comment by fbarch...@google.com on 7 Nov 2012 at 1:12

GoogleCodeExporter commented 9 years ago
r479 has full Neon for RGB565, 20.4x faster than the original C:
RGB565ToI420_Opt (3492 ms)
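
The 20.4x figure matches the ratio of the earlier Arm timing for RGB565ToI420_OptVsC (71269 ms) to this one (3492 ms), assuming both runs used the same repeat count and that the earlier figure is the intended baseline. A small C helper (the name `Speedup` is hypothetical) makes the arithmetic explicit:

```c
/* Ratio of baseline time to optimized time; both in the same unit. */
static double Speedup(double baseline_ms, double optimized_ms) {
  return baseline_ms / optimized_ms;
}
```

Calling `Speedup(71269.0, 3492.0)` gives roughly 20.41.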

Original comment by fbarch...@google.com on 7 Nov 2012 at 7:54

GoogleCodeExporter commented 9 years ago
r480 has full Neon for ARGB1555/ARGB4444:

BayerGBRGToI420_Opt (5788 ms)
BayerBGGRToI420_Opt (5768 ms)
BayerRGGBToI420_Opt (5739 ms)
BayerGRBGToI420_Opt (5670 ms)
ABGRToI420_Opt (5516 ms)
RGBAToI420_Opt (5475 ms)
BGRAToI420_Opt (5473 ms)
RAWToI420_Opt (3909 ms)
RGB24ToI420_Opt (3872 ms)
ARGBToI444_Opt (3671 ms)
ARGB1555ToI420_Opt (3614 ms)
RGB565ToI420_Opt (3572 ms)
ARGBToI422_Opt (3557 ms)
ARGB4444ToI420_Opt (3271 ms)
ARGBToI420_Opt (2983 ms)
ARGBToI411_Opt (2959 ms)
ARGBToI400_Opt (1948 ms)

Original comment by fbarch...@chromium.org on 9 Nov 2012 at 10:49

GoogleCodeExporter commented 9 years ago
r481 fixes all RGB to I420 conversions:
BayerGRBGToI420_Opt (5839 ms)
BayerGBRGToI420_Opt (5757 ms)
BayerBGGRToI420_Opt (5757 ms)
BayerRGGBToI420_Opt (5734 ms)
RGB24ToI420_Opt (3978 ms)
RAWToI420_Opt (3816 ms)
ARGBToI444_Opt (3643 ms)
ARGB1555ToI420_Opt (3638 ms)
RGB565ToI420_Opt (3520 ms)
ARGBToI422_Opt (3428 ms)
ARGB4444ToI420_Opt (3252 ms)
ARGBToI420_Opt (2951 ms)
ARGBToI411_Opt (2936 ms)
BGRAToI420_Opt (2916 ms)
RGBAToI420_Opt (2903 ms)
ABGRToI420_Opt (2903 ms)
ARGBToI400_Opt (1904 ms)

Original comment by fbarch...@chromium.org on 13 Nov 2012 at 5:07