uclouvain / openjpeg

Official repository of the OpenJPEG project
Other
971 stars 456 forks source link

Add AVX2 and AVX512 optimization #1552

Closed tszumski closed 3 weeks ago

tszumski commented 1 month ago

Encoder: performance gain ~0.1%​ Decoder: performance gain ~2.5%

rouault commented 1 month ago

Encoder: performance gain ~0.1%​ Decoder: performance gain ~2.5%

How was that measured? Total performance gain, or just on the DWT?

tszumski commented 1 month ago

Encoder: performance gain ~0.1%​ Decoder: performance gain ~2.5%

How was that measured? Total performance gain, or just on the DWT?

Total performance gain

rouault commented 1 month ago

This breaks at least Windows x64 regression tests. Did you run the regression test suite locally? Make sure to clone https://github.com/uclouvain/openjpeg-data in a "data" subdirectory of your openjpeg repository, and run ctest

tszumski commented 4 weeks ago

This breaks at least Windows x64 regression tests. Did you run the regression test suite locally? Make sure to clone https://github.com/uclouvain/openjpeg-data in a "data" subdirectory of your openjpeg repository, and run ctest

Thanks for reply, I was able to fix it. Root cause is that CI use MSVC 2015 toolset that does not have _mm256_extract_epi32 and _mm256_insert_epi32 intrinsic defined,

rouault commented 4 weeks ago

@tszumski Can you fix the formatting of the code according to the instructions at end of https://github.com/uclouvain/openjpeg/actions/runs/10718601930/job/29724100308?pr=1552, that is

* Enable WITH_ASTYLE in your cmake configuration to format C++ code
* Use "scripts/astyle.sh file" to fix the now badly indented files
* Consider using scripts/prepare-commit.sh as pre-commit hook to avoid this
  in the future (ln -s scripts/prepare-commit.sh .git/hooks/pre-commit) or
  run it manually before each commit.
tszumski commented 4 weeks ago

@tszumski Can you fix the formatting of the code according to the instructions at end of https://github.com/uclouvain/openjpeg/actions/runs/10718601930/job/29724100308?pr=1552, that is

* Enable WITH_ASTYLE in your cmake configuration to format C++ code
* Use "scripts/astyle.sh file" to fix the now badly indented files
* Consider using scripts/prepare-commit.sh as pre-commit hook to avoid this
  in the future (ln -s scripts/prepare-commit.sh .git/hooks/pre-commit) or
  run it manually before each commit.

Fixed

rouault commented 4 weeks ago

@tszumski I don't have hardware to test AVX512F and github CI doesn't seem to have machines with it. Could you paste somewhere (here or a github gist if too large):

tszumski commented 4 weeks ago

@rouault For consistency with CI scans(windows) i applied flag -DCMAKE_C_FLAGS="/arch:AVX512" And i can confirm that AVX512F is set/defined* (test logs from MSVC 2017 and MSVC 2022 are identical)

Note: *MSVC added support for AVX512 in Visual Studio 2017

$ grep C_FLAGS CMakeCache.txt
CMAKE_C_FLAGS:STRING=/arch:AVX512
CMAKE_C_FLAGS_DEBUG:STRING=/MDd /Zi /Ob0 /Od /RTC1
CMAKE_C_FLAGS_MINSIZEREL:STRING=/MD /O1 /Ob1 /DNDEBUG
CMAKE_C_FLAGS_RELEASE:STRING=/MD /O2 /Ob2 /DNDEBUG
CMAKE_C_FLAGS_RELWITHDEBINFO:STRING=/MD /Zi /O2 /Ob1 /DNDEBUG
CMAKE_RC_FLAGS:STRING=-DWIN32
CMAKE_RC_FLAGS_DEBUG:STRING=-D_DEBUG
CMAKE_RC_FLAGS_MINSIZEREL:STRING=
CMAKE_RC_FLAGS_RELEASE:STRING=
CMAKE_RC_FLAGS_RELWITHDEBINFO:STRING=
//ADVANCED property for variable: CMAKE_C_FLAGS
CMAKE_C_FLAGS-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_C_FLAGS_DEBUG
CMAKE_C_FLAGS_DEBUG-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_C_FLAGS_MINSIZEREL
CMAKE_C_FLAGS_MINSIZEREL-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_C_FLAGS_RELEASE
CMAKE_C_FLAGS_RELEASE-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_C_FLAGS_RELWITHDEBINFO
CMAKE_C_FLAGS_RELWITHDEBINFO-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_RC_FLAGS
CMAKE_RC_FLAGS-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_RC_FLAGS_DEBUG
CMAKE_RC_FLAGS_DEBUG-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_RC_FLAGS_MINSIZEREL
CMAKE_RC_FLAGS_MINSIZEREL-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_RC_FLAGS_RELEASE
CMAKE_RC_FLAGS_RELEASE-ADVANCED:INTERNAL=1
//ADVANCED property for variable: CMAKE_RC_FLAGS_RELWITHDEBINFO
CMAKE_RC_FLAGS_RELWITHDEBINFO-ADVANCED:INTERNAL=1

Failed Tests 1147/1148 are also failing on master branch without my changes

rouault commented 3 weeks ago

Failed Tests 1147/1148 are also failing on master branch without my changes

yes, those are "expected" (we have some a file in CI to ignore them

ok, so everything looks good. Merging