Closed GoogleCodeExporter closed 9 years ago
Patch updated.
Fixed minor style issues and added the YUY2 case.
Original comment by changjun...@intel.com
on 1 Apr 2013 at 6:52
I counter-propose switching the SSE2 version from aligned to unaligned loads:
https://webrtc-codereview.appspot.com/1274005
Pros
Worst case is better for apps that don't align memory.
Less code than keeping both an aligned (SSE2) and an unaligned (AVX) version.
Cons
Atom and Core2 performance is worse.
Original comment by fbarch...@google.com
on 2 Apr 2013 at 10:24
r634 changes I420ToUYVY_SSE2 to use unaligned movdqu
Linux Core2 Before
[ OK ] libyuvTest.ARGBToUYVY_Any (2163 ms)
[ OK ] libyuvTest.ARGBToUYVY_Unaligned (2078 ms)
[ OK ] libyuvTest.I420ToUYVY_Unaligned (1104 ms)
[ OK ] libyuvTest.I422ToUYVY_Unaligned (1101 ms)
[ OK ] libyuvTest.I420ToUYVY_Any (1101 ms)
[ OK ] libyuvTest.ARGBToUYVY_Invert (1061 ms)
[ OK ] libyuvTest.ARGBToUYVY_Opt (1048 ms)
[ OK ] libyuvTest.I422ToUYVY_Invert (235 ms)
[ OK ] libyuvTest.I422ToUYVY_Any (225 ms)
[ OK ] libyuvTest.I422ToUYVY_Opt (224 ms)
[ OK ] libyuvTest.I420ToUYVY_Invert (212 ms)
[ OK ] libyuvTest.I420ToUYVY_Opt (211 ms)
[ OK ] libyuvTest.ARGBToUYVY_Random (30 ms)
[----------] 13 tests from libyuvTest (10794 ms total)
Linux Core2 After
[ OK ] libyuvTest.ARGBToUYVY_Unaligned (2252 ms)
[ OK ] libyuvTest.ARGBToUYVY_Any (1567 ms)
[ OK ] libyuvTest.ARGBToUYVY_Invert (1301 ms)
[ OK ] libyuvTest.ARGBToUYVY_Opt (1246 ms)
[ OK ] libyuvTest.I420ToUYVY_Unaligned (478 ms)
[ OK ] libyuvTest.I422ToUYVY_Unaligned (444 ms)
[ OK ] libyuvTest.I420ToUYVY_Any (413 ms)
[ OK ] libyuvTest.I420ToUYVY_Invert (386 ms)
[ OK ] libyuvTest.I422ToUYVY_Invert (384 ms)
[ OK ] libyuvTest.I420ToUYVY_Opt (326 ms)
[ OK ] libyuvTest.I422ToUYVY_Opt (325 ms)
[ OK ] libyuvTest.I422ToUYVY_Any (324 ms)
[ OK ] libyuvTest.ARGBToUYVY_Random (31 ms)
[----------] 13 tests from libyuvTest (9478 ms total)
45% slower for I422ToUYVY_Opt
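For reference, the 45% figure can be reproduced from the two logs above. This is only a rough sketch: the timings are copied by hand from the Core2 runs, and `percent_change` is a hypothetical helper, not part of libyuv or its test harness.

```python
# Timings in ms, copied from the Linux Core2 logs above: (before, after).
timings = {
    "I422ToUYVY_Opt": (224, 325),
    "I420ToUYVY_Opt": (211, 326),
    "I420ToUYVY_Unaligned": (1104, 478),
    "I422ToUYVY_Unaligned": (1101, 444),
}


def percent_change(before_ms, after_ms):
    """Positive means slower after the change, negative means faster."""
    return (after_ms - before_ms) / before_ms * 100.0


changes = {name: percent_change(b, a) for name, (b, a) in timings.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

The same numbers show the other side of the trade-off: the aligned (_Opt) cases get slower on Core2, while the _Unaligned cases get faster.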
Original comment by fbarch...@google.com
on 2 Apr 2013 at 10:12
Optimized for unaligned SSE2 instead.
Original comment by fbarch...@chromium.org
on 4 Apr 2013 at 6:37
Original issue reported on code.google.com by
changjun...@intel.com
on 29 Mar 2013 at 3:42