skufog / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

I420ToNV12 is slow #135

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
In r443 the function has a Neon-optimized path, but that path is not enabled:
I420ToNV12_Opt (1228 ms)

A first pass at enabling it fails:

[ RUN      ] libyuvTest.I420ToNV12_Opt
unit_test/convert_test.cc:234: Failure
Expected: (max_diff) <= (1), actual: 252 vs 1
*** glibc detected *** ./libyuv_unittest: free(): invalid next size (normal): 0x77651f30 ***
Aborted (core dumped)
chronos@localhost $
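
For readers unfamiliar with the two layouts: I420 stores Y, U and V in three separate planes, while NV12 keeps the Y plane and interleaves U and V into one plane (U first). The sketch below is a plain scalar reference plus a max-difference helper of the kind the `max_diff <= 1` check above suggests. The names `RefI420ToNV12` and `MaxAbsDiff` are hypothetical and are not taken from libyuv or its unit tests; this is an illustration under those assumptions, not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical scalar reference (not libyuv code).
// Copies the Y plane unchanged and interleaves U and V into a single
// UV plane, U byte first, as NV12 expects.
void RefI420ToNV12(const uint8_t* src_y, int src_stride_y,
                   const uint8_t* src_u, int src_stride_u,
                   const uint8_t* src_v, int src_stride_v,
                   uint8_t* dst_y, int dst_stride_y,
                   uint8_t* dst_uv, int dst_stride_uv,
                   int width, int height) {
  assert(width > 0 && height > 0);
  // Chroma is subsampled 2x2; round up so odd sizes are covered.
  const int half_width = (width + 1) / 2;
  const int half_height = (height + 1) / 2;

  // Y plane: row-by-row copy, honoring both strides.
  for (int y = 0; y < height; ++y) {
    std::memcpy(dst_y + y * dst_stride_y, src_y + y * src_stride_y, width);
  }

  // UV plane: each output row holds half_width (U, V) byte pairs.
  for (int y = 0; y < half_height; ++y) {
    const uint8_t* u = src_u + y * src_stride_u;
    const uint8_t* v = src_v + y * src_stride_v;
    uint8_t* uv = dst_uv + y * dst_stride_uv;
    for (int x = 0; x < half_width; ++x) {
      uv[2 * x] = u[x];
      uv[2 * x + 1] = v[x];
    }
  }
}

// Hypothetical helper: largest absolute per-byte difference between
// two buffers of the same size.
int MaxAbsDiff(const uint8_t* a, const uint8_t* b, size_t size) {
  int max_diff = 0;
  for (size_t i = 0; i < size; ++i) {
    const int diff = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
    max_diff = std::max(max_diff, diff);
  }
  return max_diff;
}
```

Comparing an optimized path against a reference like this over the whole output gives a single number to test against a tolerance. The `free(): invalid next size` abort in the log is glibc reporting heap corruption, which typically means something wrote past the end of an allocated buffer.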

Original issue reported on code.google.com by fbarch...@google.com on 25 Oct 2012 at 6:19

GoogleCodeExporter commented 9 years ago
Neon version fixed in r444.

Original comment by fbarch...@chromium.org on 25 Oct 2012 at 9:37

GoogleCodeExporter commented 9 years ago
Fixed for x86 as well. Timings on x86:
I420ToNV12_Any (210 ms)
I420ToNV12_Unaligned (284 ms)
I420ToNV12_Invert (194 ms)
I420ToNV12_Opt (211 ms)
I420ToNV21_Any (211 ms)
I420ToNV21_Unaligned (290 ms)
I420ToNV21_Invert (194 ms)
I420ToNV21_Opt (211 ms)
NV12ToI420_Any (240 ms)
NV12ToI420_Unaligned (310 ms)
NV12ToI420_Invert (219 ms)
NV12ToI420_Opt (227 ms)
NV21ToI420_Any (257 ms)
NV21ToI420_Unaligned (303 ms)
NV21ToI420_Invert (229 ms)
NV21ToI420_Opt (222 ms)

Arm
I420ToNV12_Any (902 ms)
I420ToNV12_Unaligned (812 ms)
I420ToNV12_Invert (1111 ms)
I420ToNV12_Opt (790 ms)
I420ToNV21_Any (896 ms)
I420ToNV21_Unaligned (810 ms)
I420ToNV21_Invert (1147 ms)
I420ToNV21_Opt (814 ms)
NV12ToI420_Any (1043 ms)
NV12ToI420_Unaligned (980 ms)
NV12ToI420_Invert (1198 ms)
NV12ToI420_Opt (962 ms)
NV21ToI420_Any (1037 ms)
NV21ToI420_Unaligned (974 ms)
NV21ToI420_Invert (1187 ms)
NV21ToI420_Opt (954 ms)

Original comment by fbarch...@chromium.org on 27 Oct 2012 at 7:09