Closed danoli3 closed 2 months ago
Fix ideas:
#include <stdint.h> // uint32_t
#ifdef _M_X64 // or another macro specific to x86/64
    #include <immintrin.h> // Include necessary headers for x86/64 intrinsics
    // Use _tzcnt_u32 for x86/64
    #define TZS_INTRINSIC _tzcnt_u32
#elif defined(__aarch64__) // Macro for ARM64 architecture
    static inline uint32_t TZS_INTRINSIC(uint32_t x) {
        return __builtin_ctz(x); // GCC/Clang intrinsic for count trailing zeros
    }
#else
    #error Unsupported architecture
#endif
uint32_t result = TZS_INTRINSIC(value);
Current workaround:
-DBUILD_opencv_rgbd=OFF \
Hello, I'm not familiar with ARM64EC, so I apologize if I've made a mistake. I also have no environment for testing ARM64EC builds. If my patch works, could you please let me know? I'll then create a pull request.
In the OpenCV code, tzcnt is used only in zlib-ng and the HAL.
kmtr@kmtr-VMware-Virtual-Platform:~/work/opencv4/modules$ git reset --hard
HEAD is now at 15783d6598 Merge pull request #25792 from asmorkalov:as/HAL_fast_GaussianBlur
kmtr@kmtr-VMware-Virtual-Platform:~/work/opencv4/modules$ git grep tzcnt
core/include/opencv2/core/hal/intrin.hpp: // clang-cl doesn't export _tzcnt_u32 for non BMI systems
core/include/opencv2/core/hal/intrin.hpp: return _tzcnt_u32(value);
I think we have to add a condition for ARM64EC to stop using _tzcnt_u32().
The Microsoft documentation is at https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-170
For an ARM64EC target, _M_ARM and _M_ARM64 are both undefined, so OpenCV tries to use _tzcnt_u32() as it does on x86-64.
- _M_ARM Defined as the integer literal value 7 for compilations that target ARM processors. Undefined for ARM64, ARM64EC, and other targets.
- _M_ARM64 Defined as 1 for compilations that target ARM64. Otherwise, undefined.
- _M_ARM64EC Defined as 1 for compilations that target ARM64EC. Otherwise, undefined.
So I think we should add a condition for _M_ARM64EC.
- #if (_MSC_VER < 1700) || defined(_M_ARM) || defined(_M_ARM64)
+ #if (_MSC_VER < 1700) || defined(_M_ARM) || defined(_M_ARM64) || defined(_M_ARM64EC)
unsigned long index = 0;
_BitScanForward(&index, value);
return (unsigned int)index;
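For reference, the same idea can be written as a standalone function. This is a minimal sketch, not the actual code in intrin.hpp: the name trailing_zeros32 is hypothetical, and returning 32 for a zero input is a choice made for this sketch to match what the TZCNT instruction returns.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>  // _BitScanForward
#endif

// Hypothetical standalone helper (not OpenCV's implementation): counts
// trailing zero bits without relying on the x86-only _tzcnt_u32 intrinsic.
static inline unsigned int trailing_zeros32(std::uint32_t value)
{
    if (value == 0)
        return 32;  // sketch-level choice: same result TZCNT gives for zero
#if defined(_MSC_VER)
    // Available on MSVC for x86, x64, ARM and ARM64 targets.
    unsigned long index = 0;
    _BitScanForward(&index, value);
    return (unsigned int)index;
#elif defined(__GNUC__) || defined(__clang__)
    return (unsigned int)__builtin_ctz(value);
#else
    // Portable fallback: shift until the lowest set bit reaches bit 0.
    unsigned int n = 0;
    while ((value & 1u) == 0) { value >>= 1; ++n; }
    return n;
#endif
}
```

For example, trailing_zeros32(8) gives 3 because 8 is binary 1000.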
Yeah that should fix it!
Thank you for your report! I'll take the pull request out of draft status.
System information (version)
Detailed description
Compile for VS2022 - MSVC - CMakeList - ARM64 / ARM64EC
Steps to reproduce
Error:
https://github.com/openframeworks/apothecary/actions/runs/9874992463/job/27270618800?pr=390
buildarm64ec.log
Issue submission checklist
[x] I report the issue, it's not a question
[x] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
[x] I updated to the latest OpenCV version and the issue is still there
[x] There is reproducer code and related data files: videos, images, onnx, etc