Closed by VVD 4 days ago
Thank you for this post.
I think I need to use Intel's SDE to check what happens on older platforms, as part of testing.
I think I know why this is happening, and I will try to fix it soon.
Kind regards, Aous.
I had a deeper look. I am assuming you are using 64-bit architecture.
The code in ojph::local::initialize_tables_avx2 does not run unless the CPU supports AVX2 or higher.
    bool initialize_tables_avx2() {
      if (get_cpu_ext_level() >= X86_CPU_EXT_LEVEL_AVX2) {
        bool result;
        result = vlc_init_tables();
        result = result && uvlc_init_tables();
        return result;
      }
      return false;
    }
I tested with Intel SDE on Linux: the code, compiled with GCC, runs without issues on the Pentium 4 Prescott, Merom, and Penryn architectures. I think Core 2 CPUs belong to either Merom or Penryn.
I am about to release a new version.
In this version, I changed some code in that file and in ojph_block_encode_avx512.cpp. In particular, I removed = { 0 } from
    static ui32 vlc_tbl0[2048];
    static ui32 vlc_tbl1[2048];
My worry is that = { 0 } will be executed automatically, because the tables are statically defined, and since the code is compiled with the -mavx2 option, it will definitely use AVX2 instructions.
To counter the effect of that I changed the above code to
    /////////////////////////////////////////////////////////////////////////
    bool initialize_tables_avx2() {
      if (get_cpu_ext_level() >= X86_CPU_EXT_LEVEL_AVX2) {
        memset(vlc_tbl0, 0, 2048 * sizeof(ui32));
        memset(vlc_tbl1, 0, 2048 * sizeof(ui32));
        bool result;
        result = vlc_init_tables();
        result = result && uvlc_init_tables();
        return result;
      }
      return false;
    }
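A side note on the change above: in C++, objects with static storage duration are zero-initialized before any other initialization takes place, so dropping = { 0 } does not leave the tables holding garbage; the memset calls make the zeroing explicit and keep it behind the CPU check. A minimal sketch of that guarantee, using a hypothetical table name (demo_tbl) and a local typedef standing in for OpenJPH's ui32:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t ui32;   // assumption: stand-in for OpenJPH's ui32 type

// Hypothetical table, declared like vlc_tbl0 but without "= { 0 }".
static ui32 demo_tbl[2048];

// Returns true if every entry of demo_tbl is zero.
bool demo_tbl_is_zeroed() {
  for (std::size_t i = 0; i < 2048; ++i)
    if (demo_tbl[i] != 0)
      return false;
  return true;
}
```

Calling demo_tbl_is_zeroed() before anything writes to the table should return true.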
It might also be useful to change the code around get_cpu_ext_level() to
    ////////////////////////////////////////////////////////////////////////////
    static int cpu_level;
    static bool cpu_level_initialized;
    ////////////////////////////////////////////////////////////////////////////
    int get_cpu_ext_level()
    {
      if (!cpu_level_initialized)
        cpu_level_initialized = init_cpu_ext_level(cpu_level);
      return cpu_level;
    }
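The snippet above detects the CPU level on first use rather than at program start-up. As a hedged alternative sketch (not the code in OpenJPH), a function-local static gives the same run-once behavior, and since C++11 its initialization is also thread-safe. Here detect_level_stub() is a hypothetical stand-in for init_cpu_ext_level(), and the value 7 is arbitrary:

```cpp
#include <cassert>

static int detect_calls = 0;        // counts how often detection runs

// Hypothetical stand-in for init_cpu_ext_level(); returns an arbitrary level.
static int detect_level_stub() {
  ++detect_calls;
  return 7;
}

// Detection runs on the first call only; later calls return the cached value.
int get_cpu_ext_level_sketch() {
  static const int level = detect_level_stub();
  return level;
}
```

The trade-off is that the cached value cannot be reset or re-detected later, which the two-variable version above would allow.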
You can wait for the new release, which should be out today or tomorrow (Sydney time), or you can test these suggestions in the meantime.
Hope this helps.
Kind regards, Aous.
Ofc 64bit: OS: FreeBSD 14.1 amd64.
Build with LLVM 18 from base system.
If you can commit these changes, testing will be easier.
Same error.
Thank you for putting this in.
Just to keep you in the picture: I had to do some digging, and the problem only occurs with clang when the code is compiled with the Release build type. It does not happen with gcc, nor with other build types.
The source of the problem is that clang inserts a vzeroupper instruction in a normal C++ function (one where no intrinsics are used). It does this at the end of the function, just before returning.
One solution could have been to use the following compiler flags
    -mllvm -x86-use-vzeroupper=0
But the solution I implemented is better.
See how it goes with you and if this solves the problem, please let me know.
Cheers, Aous.
PS: See this https://stackoverflow.com/questions/68736527/do-i-need-to-use-mm256-zeroupper-in-2021
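For readers who want to see the general idea in code: one way to keep AVX instructions (including a compiler-inserted vzeroupper) out of code that runs on older CPUs is to enable AVX2 per function rather than per file, and to enter that function only after a runtime check. The sketch below assumes GCC or Clang on x86, since it relies on the target attribute and __builtin_cpu_supports; all function names are hypothetical, and this is not the fix that went into OpenJPH, which this thread does not describe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

// Only this function is compiled with AVX2 enabled (GCC/Clang attribute).
__attribute__((target("avx2")))
static void add_avx2(const std::int32_t* a, const std::int32_t* b,
                     std::int32_t* out, std::size_t n) {
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
    __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                        _mm256_add_epi32(va, vb));
  }
  for (; i < n; ++i)          // scalar tail for the remaining elements
    out[i] = a[i] + b[i];
}
#endif

// Baseline version, compiled with the file's default instruction set.
static void add_scalar(const std::int32_t* a, const std::int32_t* b,
                       std::int32_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[i] + b[i];
}

// Picks the AVX2 version only when the running CPU reports AVX2 support.
void add_dispatch(const std::int32_t* a, const std::int32_t* b,
                  std::int32_t* out, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx2")) {
    add_avx2(a, b, out, n);
    return;
  }
#endif
  add_scalar(a, b, out, n);
}
```

With this layout the file itself does not need -mavx2, so ordinary functions in it are compiled for the baseline instruction set.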
Fixed in version 0.18.0. Thanks!
Offtopic:
$ readelf -d /usr/local/lib/libopenjph.so | grep SONAME
0x000000000000000e SONAME Library soname: [libopenjph.so.0.18]
SONAME "must be" libopenjph.so.0.
Patch:
--- src/core/CMakeLists.txt.orig 2024-11-10 02:36:26 UTC
+++ src/core/CMakeLists.txt
@@ -133,9 +133,9 @@ else()
PROPERTIES
OUTPUT_NAME "openjph.${OPENJPH_VERSION_MAJOR}.${OPENJPH_VERSION_MINOR}")
else()
- set(OJPH_LIB_NAME_STRING "openjph.${OPENJPH_VERSION_MAJOR}.${OPENJPH_VERSION_MINOR}")
+ set(OJPH_LIB_NAME_STRING "openjph.${OPENJPH_VERSION_MAJOR}")
set_target_properties(openjph
PROPERTIES
- SOVERSION "${OPENJPH_VERSION_MAJOR}.${OPENJPH_VERSION_MINOR}"
+ SOVERSION "${OPENJPH_VERSION_MAJOR}"
VERSION "${OPENJPH_VERSION}")
endif()
Thank you for the suggestion regarding the SONAME name.
I am aware that this is not what is expected -- as in issue #155. I am still not sure what the best course of action is. This library is my first contribution to open source development.
Kind regards, Aous.
Thanks.
OS: FreeBSD 14.1 amd64. CPU: Core 2 Quad Q6600. I built OpenJPH 0.17.0 from ports both without -march and with -march=core2, with the same result. Dependency chain: smplayer => qt5 => kf5-kimageformats => libheif => OpenJPH.
AFAIU, 0xC5 0xF8 0x77 = vzeroupper from AVX:
https://fuchsia.googlesource.com/third_party/llvm-project/+/refs/tags/llvmorg-13.0.0-rc1/llvm/test/CodeGen/X86/fma.ll?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F#665
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#New_instructions
Runtime detection of the supported SIMD level in init_cpu_ext_level() works well, but OpenJPH still uses instructions from SIMD extensions that the current CPU does not support.
There are build options:
If I build with
    -DOJPH_DISABLE_SSE4=ON -DOJPH_DISABLE_AVX=ON -DOJPH_DISABLE_AVX2=ON -DOJPH_DISABLE_AVX512=ON
then smplayer runs and works fine.
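To make the byte sequence quoted above (0xC5 0xF8 0x77) concrete, here is a naive sketch that scans a buffer of machine-code bytes for that encoding of vzeroupper. The function name is hypothetical, and because it does not decode instructions it can report false positives when those three bytes happen to occur inside another instruction or in data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offsets at which the byte sequence C5 F8 77 starts.
std::vector<std::size_t> find_vzeroupper(const std::uint8_t* code,
                                         std::size_t len) {
  std::vector<std::size_t> hits;
  for (std::size_t i = 0; i + 3 <= len; ++i) {
    if (code[i] == 0xC5 && code[i + 1] == 0xF8 && code[i + 2] == 0x77)
      hits.push_back(i);
  }
  return hits;
}
```

For a reliable answer, a real disassembler is the better tool; this only illustrates what the three bytes look like in a byte stream.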