Closed by GoogleCodeExporter 9 years ago
r1002 adds unaligned support for scale, compare and rotate on Intel.
MIPS still requires alignment.
Original comment by fbarch...@google.com
on 2 Oct 2014 at 5:56
The following appear to be alignment checks that should be reviewed:
d:\src\libyuv\trunk\source>findstr -i aligned.*stride.*16 *
format_conversion.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(src_stride_argb, 16)) {
rotate_argb.cc: IS_ALIGNED(dst, 16) && IS_ALIGNED(dst_stride, 16)) {
rotate_argb.cc: IS_ALIGNED(src, 16) && IS_ALIGNED(src_stride, 16) &&
rotate_argb.cc: IS_ALIGNED(dst, 16) && IS_ALIGNED(dst_stride, 16)) {
rotate_argb.cc: IS_ALIGNED(src, 16) && IS_ALIGNED(src_stride, 16) &&
rotate_argb.cc: IS_ALIGNED(dst, 16) && IS_ALIGNED(dst_stride, 16)) {
scale.cc: IS_ALIGNED(dst_width, 8) && IS_ALIGNED(row_stride, 16) &&
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(dst_width, 8) && IS_ALIGNED(row_stride, 16) &&
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16) &&
scale.cc: IS_ALIGNED(dst_ptr, 16) && IS_ALIGNED(dst_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16) &&
scale.cc: IS_ALIGNED(dst_ptr, 16) && IS_ALIGNED(dst_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16) &&
scale.cc: IS_ALIGNED(dst_ptr, 16) && IS_ALIGNED(dst_stride, 16)) {
scale.cc: IS_ALIGNED(src_ptr, 16) && IS_ALIGNED(src_stride, 16) &&
scale.cc: IS_ALIGNED(dst_ptr, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(row_stride, 16) &&
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(row_stride, 16) &&
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(src_stride, 16) &&
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(src_stride, 16) &&
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
scale_argb.cc: IS_ALIGNED(src_argb, 16) && IS_ALIGNED(src_stride, 16) &&
scale_argb.cc: IS_ALIGNED(dst_argb, 16) && IS_ALIGNED(dst_stride, 16)) {
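For readers unfamiliar with the pattern, here is a minimal sketch of what each of the checks above gates. The IS_ALIGNED macro is modelled on the one in libyuv's headers; ChooseRowPath and RowPath are hypothetical names used only for illustration and are not part of libyuv.

```cpp
#include <cassert>
#include <stdint.h>

// Modelled on libyuv's IS_ALIGNED: true when the pointer or integer p is a
// multiple of a, where a must be a power of two.
#define IS_ALIGNED(p, a) (!((uintptr_t)(p) & ((a) - 1)))

// Hypothetical selector, for illustration only.
enum class RowPath { kC, kSimdAligned };

// Old-style gating: the SIMD path is taken only when both the base pointer
// and the stride are 16-byte aligned, which guarantees every row start is
// aligned as well.
RowPath ChooseRowPath(const uint8_t* src, int src_stride) {
  assert(src != nullptr);
  if (IS_ALIGNED(src, 16) && IS_ALIGNED(src_stride, 16)) {
    return RowPath::kSimdAligned;
  }
  return RowPath::kC;  // unaligned input falls back to the C row function
}
```

With a gate like this, a buffer offset by a single byte, or an odd stride, silently takes the C path. That is why each check listed above is a candidate for removal once the SIMD row functions accept unaligned pointers.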
Original comment by fbarch...@google.com
on 6 Oct 2014 at 11:13
The following fail:
[ PASSED ] 862 tests.
[ FAILED ] 7 tests, listed below:
[ FAILED ] libyuvTest.ARGBToI420_Unaligned
[ FAILED ] libyuvTest.ARGBToJ420_Unaligned
[ FAILED ] libyuvTest.BGRAToI420_Unaligned
[ FAILED ] libyuvTest.ABGRToI420_Unaligned
[ FAILED ] libyuvTest.RGBAToI420_Unaligned
[ FAILED ] libyuvTest.ARGBToNV12_Unaligned
[ FAILED ] libyuvTest.ARGBToNV21_Unaligned
7 FAILED TESTS
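These fail due to pavgb with a memory operand in the SSSE3 code path: legacy SSE instructions require memory operands to be 16-byte aligned.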
Original comment by fbarch...@google.com
on 7 Oct 2014 at 12:11
r1115 works, but uses some C code for ARGBToI420.
r1116 uses movdqu for memory references instead of pavgb with a memory operand.
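A minimal sketch of the idea using intrinsics, assuming an SSE2-capable x86 target. AverageRows16 is a hypothetical helper, not a libyuv function (the library's actual row functions are written in assembly); a scalar fallback is included so the sketch also builds elsewhere.

```cpp
#include <cassert>
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical helper: rounded average of 16 bytes from two rows, with no
// alignment requirement on any of the pointers.
void AverageRows16(const uint8_t* row0, const uint8_t* row1, uint8_t* dst) {
  assert(row0 != nullptr && row1 != nullptr && dst != nullptr);
#if defined(SKETCH_HAS_SSE2)
  // Unaligned loads (movdqu) bring both rows into registers first.
  __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(row0));
  __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(row1));
  // pavgb then operates register to register: (a + b + 1) >> 1 per byte.
  __m128i avg = _mm_avg_epu8(a, b);
  _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), avg);
#else
  // Scalar equivalent of pavgb for non-x86 builds.
  for (int i = 0; i < 16; ++i) {
    dst[i] = static_cast<uint8_t>((row0[i] + row1[i] + 1) >> 1);
  }
#endif
}
```

The extra load costs one instruction per operand, but it removes the alignment fault that a pavgb memory operand would raise on an unaligned address.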
Original comment by fbarch...@google.com
on 7 Oct 2014 at 8:07
Benchmarks on Sandy Bridge:
Was r1096
linux64 868 tests from libyuvTest (553265 ms total)
osx64 868 tests from libyuvTest (594923 ms total)
win32 868 tests from libyuvTest (628448 ms total)
Now r1116
linux64 885 tests from libyuvTest (543007 ms total)
osx64 885 tests from libyuvTest (588508 ms total)
win32 885 tests from libyuvTest (613399 ms total)
Original comment by fbarch...@google.com
on 8 Oct 2014 at 12:45
Example where old code would use C for unaligned:
Was
ARGBToARGB4444_Unaligned (1400 ms)
ARGBToARGB4444_Any (386 ms)
ARGBToARGB4444_Invert (317 ms)
ARGBToARGB4444_Opt (310 ms)
ARGBToARGB4444_Random (261 ms)
1 test case ran. (2674 ms total)
Now
ARGBToARGB4444_Unaligned (426 ms)
ARGBToARGB4444_Any (380 ms)
ARGBToARGB4444_Invert (332 ms)
ARGBToARGB4444_Opt (318 ms)
ARGBToARGB4444_Random (268 ms)
1 test case ran. (1724 ms total)
Original comment by fbarch...@google.com
on 9 Oct 2014 at 2:23
Original comment by fbarch...@google.com
on 14 Oct 2014 at 12:26
Original issue reported on code.google.com by
fbarch...@google.com
on 30 Sep 2014 at 5:43