Closed georges-arm closed 1 month ago
I'm not really sure what the current intended interaction with is between enabling SIMD since we have both VVENC_ENABLE_X86_SIMD
and VVENC_ENABLE_ARM_SIMD
which can technically be enabled independently of each other. I don't think all combinations of this currently compile but also not sure if it is something we are interested in supporting. This seems slightly orthogonal to the existing 32-bit compilation issues so I've not tried to solve it here, but interested in hearing what you think is the best approach to take here?
Thanks for incorporating my suggestions. From your code around enabling SVE/SVE2 I assume these extensions are not available for ARMv7 ? Then I think the split between ARM
and AARCH64
for VVENC_TARGET_ARCH
really makes sense.
Concerning the interaction between VVENC_ENABLE_X86_SIMD
and VVENC_ENABLE_ARM_SIMD
, my intention was that all combinations are possible. I know it worked at some point in time, but I am not sure about the current status. I will look into it, when your changes are merged.
Thanks for incorporating my suggestions. From your code around enabling SVE/SVE2 I assume these extensions are not available for ARMv7 ?
Your interpretation is correct. Both SVE and SVE2 are only available in the 64-bit architecture (available from Armv8.2-A and Armv9.0-A respectively).
Thanks for the MR @georges-arm and for the review @K-os
Concerning the interaction between
VVENC_ENABLE_X86_SIMD
andVVENC_ENABLE_ARM_SIMD
, my intention was that all combinations are possible. I know it worked at some point in time, but I am not sure about the current status. I will look into it, when your changes are merged.
This should be fixed by #455. Thanks for confirming the intent here.
The existing Neon code makes use of some intrinsics which are only available on AArch64, so adjust these implementations to also work on Armv7 platforms.
Also introduce a new
REAL_TARGET_AARCH64
macro to distinguish between 32-bit and 64-bit in new and existing Arm code.Since these implementations require that Neon is available, also add
-mfpu=neon
to the compile line for these files. This necessitates removing the SIMD implementation files from LTO in the same way that is already done for the x86 kernels.