Closed GoogleCodeExporter closed 9 years ago
OSX
Was
45541 - [ OK ] libyuvTest.I420ToARGB_Opt (45541 ms)
43062 - [ OK ] libyuvTest.J420ToARGB_Opt (43062 ms)
Now
44052 - [ OK ] libyuvTest.I420ToARGB_Opt (44052 ms)
43110 - [ OK ] libyuvTest.J420ToARGB_Opt (43110 ms)
Original comment by fbarch...@google.com
on 29 Dec 2014 at 11:05
SSSE3 was
I420ToARGB_Opt (5169 ms)
now
I420ToARGB_Opt (4830 ms)
Original comment by fbarch...@google.com
on 30 Dec 2014 at 4:08
This is complete for x86 code.
ARM uses signed math, so there is no bias for the -128 that would make the rounding free.
Also, the shift on ARM can round, so there's no add for rounding, unlike x86,
which requires an add to do rounding.
It's unclear whether the bias could be done on ARM; it would need unsigned versions of
the multiplies/adds. It's likely not a performance benefit, so it would
mainly ensure the code exactly mimics the x86/C code.
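
A minimal sketch of the bias idea discussed above, not the actual libyuv code. It assumes made-up 6-bit fixed-point coefficients (YG, UB) and hypothetical function names. It computes one channel twice: once with signed math, where the offsets are subtracted from the inputs and rounding is a separate add, and once with raw unsigned inputs plus a single precomputed constant that absorbs the -16, the -128 and the rounding term.

```c
#include <stdint.h>

/* Hypothetical coefficients for illustration only; these are NOT claimed
   to be the values libyuv uses. */
enum { YG = 74, UB = 127, SHIFT = 6, ROUND = 1 << (SHIFT - 1) };

/* One constant that folds in the -16 (Y), the -128 (U) and the rounding. */
enum { BB = -16 * YG - 128 * UB + ROUND };

/* Shift down and clamp to 0..255. Negative sums are clamped before the
   shift so the result does not depend on how the compiler shifts
   negative values. */
static uint8_t shift_clamp(int v) {
  if (v < 0) return 0;
  v >>= SHIFT;
  return (uint8_t)(v > 255 ? 255 : v);
}

/* Signed form: subtract offsets from the inputs, add rounding separately. */
uint8_t blue_signed(uint8_t y, uint8_t u) {
  int acc = (y - 16) * YG + (u - 128) * UB;
  return shift_clamp(acc + ROUND);
}

/* Biased form: multiply the raw unsigned inputs, then add one constant. */
uint8_t blue_biased(uint8_t y, uint8_t u) {
  int acc = y * YG + u * UB;
  return shift_clamp(acc + BB);
}

/* Exhaustively compare both forms; returns the number of differing pixels. */
int count_mismatches(void) {
  int bad = 0;
  for (int y = 0; y < 256; ++y)
    for (int u = 0; u < 256; ++u)
      bad += blue_signed((uint8_t)y, (uint8_t)u) !=
             blue_biased((uint8_t)y, (uint8_t)u);
  return bad;
}
```

Calling count_mismatches() from a test checks that the two forms agree for every input pair. In the biased form the rounding costs nothing extra because it lives in the same constant as the offsets. The signed form has no such constant to fold it into.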
Closing as fixed. Follow-up improvements to the unit tests would be good, and/or
a re-examination of the code for ARM or x86 performance improvements.
Original comment by fbarch...@google.com
on 5 Jan 2015 at 6:36
Original issue reported on code.google.com by
fbarch...@google.com
on 29 Dec 2014 at 9:43