ylb11 / openjpeg

Automatically exported from code.google.com/p/openjpeg

Add SSE2/SSE41 implementations for mct.c #451

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
SSE2 implementations for opj_mct_encode/opj_mct_decode (not all compilers do 
automatic vectorization). They are as fast as the automatically vectorized code 
from Xcode 6.1.1 clang on both x86 and x64. They are enabled by default, using 
static dispatching, whenever the compiler defines __SSE2__ (in which case this 
is the only code path). That covers all x64 builds (except Visual Studio) and 
x86 builds on Mac OS.
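
To illustrate what static dispatching on __SSE2__ looks like, here is a minimal sketch of a forward reversible color transform. It is not the attached patch: the function name `rct_forward` is hypothetical, the formula (Y = (R + 2G + B) >> 2, U = B - G, V = R - G) is assumed to match what opj_mct_encode computes, and unaligned loads are used so that no buffer alignment is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical name. In-place forward reversible color transform on three
 * planes of n samples each. The code path is chosen at compile time. */
static void rct_forward(int32_t *c0, int32_t *c1, int32_t *c2, size_t n)
{
    size_t i = 0;
#ifdef __SSE2__
    /* SSE2 path: four 32-bit samples per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128i r = _mm_loadu_si128((const __m128i *)(c0 + i));
        __m128i g = _mm_loadu_si128((const __m128i *)(c1 + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(c2 + i));
        __m128i y = _mm_add_epi32(g, g);   /* 2G */
        y = _mm_add_epi32(y, b);           /* 2G + B */
        y = _mm_add_epi32(y, r);           /* R + 2G + B */
        y = _mm_srai_epi32(y, 2);          /* arithmetic shift right by 2 */
        _mm_storeu_si128((__m128i *)(c0 + i), y);
        _mm_storeu_si128((__m128i *)(c1 + i), _mm_sub_epi32(b, g));
        _mm_storeu_si128((__m128i *)(c2 + i), _mm_sub_epi32(r, g));
    }
#endif
    /* Scalar loop: handles the tail, or everything when __SSE2__ is absent.
     * Assumes >> on a negative int32_t is an arithmetic shift. */
    for (; i < n; ++i) {
        int32_t r = c0[i], g = c1[i], b = c2[i];
        c0[i] = (r + 2 * g + b) >> 2;
        c1[i] = b - g;
        c2[i] = r - g;
    }
}
```

Because the choice is made by the preprocessor, a binary built without __SSE2__ contains only the scalar loop, and one built with it never checks the CPU at run time.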

SSE4.1 implementation for opj_mct_encode_real, still using static dispatching 
(__SSE4_1__). It is 1.4x faster on x64 and 3.0x faster on x86. It won't be 
enabled by default on any build.
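
As a rough illustration of why SSE4.1 helps here, the sketch below vectorizes a rounded fixed-point multiply using `_mm_mul_epi32`, the signed 32x32 to 64-bit multiply that SSE4.1 adds. It is not the attached patch. The helper names `fix_mul` and `fix_mul4` are hypothetical, and the 13-bit fixed-point format with a rounding constant of 4096 is an assumption about how the scalar opj_mct_encode_real multiplies samples by its coefficients.

```c
#include <stdint.h>
#ifdef __SSE4_1__
#include <smmintrin.h>
#endif

/* Hypothetical scalar reference: Q13 fixed-point multiply with rounding.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t fix_mul(int32_t a, int32_t b)
{
    int64_t t = (int64_t)a * (int64_t)b + 4096;
    return (int32_t)(t >> 13);
}

/* Hypothetical helper: dst[i] = fix_mul(src[i], k) for i in 0..3. */
static void fix_mul4(const int32_t *src, int32_t k, int32_t *dst)
{
#ifdef __SSE4_1__
    __m128i a     = _mm_loadu_si128((const __m128i *)src);
    __m128i kk    = _mm_set1_epi32(k);
    __m128i round = _mm_set1_epi64x(4096);
    /* _mm_mul_epi32 multiplies 32-bit lanes 0 and 2 into two 64-bit products. */
    __m128i even = _mm_mul_epi32(a, kk);
    /* Shift lanes 1 and 3 down to positions 0 and 2, then multiply them. */
    __m128i odd  = _mm_mul_epi32(_mm_srli_si128(a, 4), kk);
    /* A logical shift is enough: only the low 32 bits of each result are kept,
     * and those match the arithmetic shift. */
    even = _mm_srli_epi64(_mm_add_epi64(even, round), 13);
    odd  = _mm_srli_epi64(_mm_add_epi64(odd, round), 13);
    /* Move the odd results back to lanes 1 and 3, then interleave. */
    odd  = _mm_slli_si128(odd, 4);
    _mm_storeu_si128((__m128i *)dst, _mm_blend_epi16(even, odd, 0xCC));
#else
    int i;
    for (i = 0; i < 4; ++i) {
        dst[i] = fix_mul(src[i], k);
    }
#endif
}
```

SSE2 has no signed 32x32 to 64-bit multiply, which is a plausible reason for this path to require __SSE4_1__ and therefore not be enabled by default.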

To make these available on all builds, Issue 450 needs to be resolved.

Original issue reported on code.google.com by m.darb...@gmail.com on 13 Dec 2014 at 12:23

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r2957.

Original comment by m.darb...@gmail.com on 13 Dec 2014 at 12:29