Closed gitoss closed 5 months ago
Out of interest, I ran media-autobuild_suite myself using -march=znver4 in custom_profile ... and it's the same result.
Svt[info]: SVT [version]: SVT-AV1-PSY Encoder Lib v2.1.0-A-1-g37a5609
Svt[info]: SVT [build] : Clang 18.1.6 64 bit
Svt[info]: LIB Build date: Jun 11 2024 00:17:04
Svt[info]: -------------------------------------------
Svt[info]: [asm level on system : up to avx512]
Svt[info]: [asm level selected : up to avx2]
I'm really not used to compiling things myself anymore, so I have no idea whether enabling AVX512 works with just CFLAGS or needs something in configure or cmake (which seems to have some logic to auto-detect for -DEN_AVX512_SUPPORT=1).
The ab-suite.cmake.log shows that CMake at least detects compiler support for the AVX512 flags, but I don't know if this is sufficient.
-- Checking C flag support for: [-mavx512f] - Yes
-- Checking C flag support for: [-mavx512bw] - Yes
-- Checking C flag support for: [-mavx512dq] - Yes
-- Checking C flag support for: [-mavx512vl] - Yes
-- Checking CXX flag support for: [-mavx512f] - Yes
-- Checking CXX flag support for: [-mavx512bw] - Yes
-- Checking CXX flag support for: [-mavx512dq] - Yes
-- Checking CXX flag support for: [-mavx512vl] - Yes
Sorry to be a bother, I know this is tricky if you cannot test it on an AVX512-capable CPU yourself. The original repo seems to have managed it for Linux: https://github.com/gianni-rosato/svt-av1-psy/releases/tag/v2.1.0-A
The avx512 support must be specified when compiling the binary. I will publish an update at the weekend.
-march=znver4 does not activate avx512 support. To do this, the option -DENABLE_AVX512=ON must be added to the CMake call on line 1542 of media-suite_compile.sh.
Thanks. It would be nice to know how you do it, i.e. which configure/cmake options to use. I've asked the original psy dev, too: https://github.com/gianni-rosato/svt-av1-psy/discussions/57
Thanks, that did it.
Svt[info]: SVT [version]: SVT-AV1-PSY Encoder Lib v2.1.0-A-1-g37a5609
Svt[info]: SVT [build] : Clang 18.1.6 64 bit
Svt[info]: LIB Build date: Jun 12 2024 00:08:50
Svt[info]: -------------------------------------------
Svt[info]: [asm level on system : up to avx512]
Svt[info]: [asm level selected : up to avx512]
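For anyone who wants to check this on their own build without eyeballing the banner, here is a minimal sketch that compares the two "asm level" lines. It assumes the log format is exactly as pasted in this thread; the function names are my own and not part of SVT-AV1.

```python
import re

# Matches lines such as "Svt[info]: [asm level selected : up to avx2]".
ASM_LINE = re.compile(r"\[asm level (on system|selected)\s*:\s*up to (\w+)\]")

def asm_levels(log_text):
    """Return a dict like {'on system': 'avx512', 'selected': 'avx2'}."""
    return {kind: level for kind, level in ASM_LINE.findall(log_text)}

def uses_full_asm(log_text):
    """True when the encoder selected the same asm level the CPU reports."""
    levels = asm_levels(log_text)
    return "selected" in levels and levels.get("on system") == levels["selected"]

# The two banners from this thread, before and after the rebuild.
before = """Svt[info]: [asm level on system : up to avx512]
Svt[info]: [asm level selected : up to avx2]"""
after = """Svt[info]: [asm level on system : up to avx512]
Svt[info]: [asm level selected : up to avx512]"""

print(uses_full_asm(before))
print(uses_full_asm(after))
```

In practice you would feed it the captured console output of SvtAv1EncApp.exe instead of the hard-coded strings.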
Btw, this is different from libjxl, which picks it up from CFLAGS.
JPEG XL encoder v0.10.2 c158d65 [AVX3_DL]
Re: "the command -DENABLE_AVX512=ON must be added to line 1542 of media-suite_compile.sh"
... or you could create a file like "svt-av1-psy-git_options" in the build directory, containing "-DENABLE_AVX512=ON", which is the more official-ish way to add cflags as far as I understand it.
Not really. The main folder of the repository contains a CMakeLists.txt where all options are defined, and every build process reads it. The AVX512 option is already defined there, and it is activated via CMake.
In addition, the CMake procedure is called in the media-suite_compile.sh. Cflags are not necessary there.
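To make the point concrete, here is a rough sketch of what passing the option at configure time looks like. Only -DENABLE_AVX512=ON comes from this thread; the helper name, the paths, and the Release build type are placeholders for illustration, and this is not the actual call in media-suite_compile.sh.

```python
import shlex

def cmake_configure_args(source_dir, build_dir, enable_avx512=True):
    """Build a CMake configure command line (hypothetical helper)."""
    args = ["cmake", "-S", source_dir, "-B", build_dir,
            "-DCMAKE_BUILD_TYPE=Release"]
    if enable_avx512:
        # Option name as given in this thread; it is a CMake cache option,
        # so it goes on the cmake command line rather than into CFLAGS.
        args.append("-DENABLE_AVX512=ON")
    return args

# Placeholder paths; adjust to wherever the sources are checked out.
cmd = cmake_configure_args("svt-av1-psy", "svt-av1-psy/build")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run.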
Right, thanks - I didn't realize the _options.txt files are only for configure CFLAGS, not for CMake -Dsomething.
Updated my releases
Thanks. Can you tell what the actual difference is between the gcc & msvc builds, or does it depend on systems & circumstances?
The MSVC builds were compiled with Visual Studio and the GCC builds with MSYS (cross-compile). They are different compilers.
That I know :-) ... I was wondering if you figured out what the actual difference / effect is, like encoding speed. For example, your vstudio binary is 1 fps faster than my own llvm mediasuite build (w/o lto). I didn't check your gcc build yet, and it wasn't a real benchmark anyway.
Anyway, thanks for the binaries, in the fullness of time I'll figure out which compiler (vstudio, gcc, llvm) works best for my zen4 system.
In most cases, the MSVC versions are faster than the cross-compiled versions, possibly because of better compatibility with Windows. The same holds for x265. For this reason I would also like to compile x264 with MSVC, and maybe I can do the same with rav1e.
For what it's worth, you can cross-compile with LTO (either gcc or clang), which results in a significant speedup. https://www.reddit.com/r/AV1/comments/jmwepw/how_to_build_libaomav1_to_be_as_fast_as_possible/
... instructions for llvm https://github.com/m-ab-s/media-autobuild_suite/issues/2669 https://clang.llvm.org/docs/ThinLTO.html
I guess vstudio will still be faster because Microsoft had a lot of time to optimize for the Windows platform. But because I haven't used Visual Studio for ages, I'm stuck with media autobuild suite for now to optimize for my specific -march (in my case, znver4).
vstudio is faster simply because UCRT's libm is faster, and mingw-w64 overrides the UCRT libm implementation with its own slow x87 FPU implementation. You might consider building svtav1 with clang-cl -Xclang -O3 -flto=thin, which would combine UCRT's high-performance libm with clang's advanced autovectorization, and can also override NT malloc.
Edit, again: Ok, I now understand your instructions are for using the Visual Studio front-end and the llvm compiler - and with mingw (media autobuild suite) this isn't possible.
Forwarding the issue from Staxrip to the suitable repo: https://github.com/staxrip/staxrip/issues/1387
Describe the bug: SvtAv1EncApp.exe seems to be built only for AVX2 and doesn't enable AVX512.
Expected behavior: The 'asm level selected' should be avx512 if the 'asm level on system' is avx512.
How to reproduce the issue: Use a CPU supporting AVX512 (like AMD Zen4 or Intel Tiger Lake) and run SvtAv1EncApp.exe with either --asm avx512 or --asm max.
Provide information: The svt encoder has --asm max and should auto-select the best asm level for each system, so as far as I understand it, it's not necessary to build a binary limited to AVX2?
Additional context: Other SvtAv1EncApp Windows binaries available on github seem to be limited to AVX2, too, but there are binaries supporting AVX512 for Linux: https://github.com/gianni-rosato/svt-av1-psy/releases