Open zhanweiw opened 1 month ago
Hi, thanks for the report.
When I gave it a quick try with blaze, which is like bazel, I was able to build the abs bench:
blaze build --config=lexan_x86_64 -c opt //third_party/XNNPACK/bench:abs_bench
The microkernel is checked in, declared and used:
grep xnn_f16_vabs_ukernel__neonfp16arith_u16 . -r
./src/amalgam/gen/neonfp16arith.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/xnnpack/vunary.h:DECLARE_F16_VABS_UKERNEL_FUNCTION(xnn_f16_vabs_ukernel__neonfp16arith_u16)
./src/f16-vunary/gen/f16-vabs-neonfp16arith-u16.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./bench/f16-vabs.cc: xnn_f16_vabs_ukernel__neonfp16arith_u16,
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.yaml:- name: xnn_f16_vabs_ukernel__neonfp16arith_u16
The important one for linking is ./src/amalgam/gen/neonfp16arith.c, which contains the kernel and gets built and linked on Arm systems unless fp16 is disabled.
In CMakeLists.txt, around line 509, this block appends the fp16 microkernels:
IF(XNNPACK_ENABLE_ARM_FP16_VECTOR)
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_MICROKERNEL_SRCS})
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_AARCH64_MICROKERNEL_SRCS})
ENDIF()
It's possible our CMake is missing something for Windows. In scripts/build-windows-arm64.cmd there are CMake parameters for a Visual Studio 2022 build:
mkdir build\windows
mkdir build\windows\arm64
set CMAKE_ARGS=-DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ASSEMBLY=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_BF16=OFF
set CMAKE_ARGS=%CMAKE_ARGS% -G="Visual Studio 17 2022" -A=ARM64
rem User-specified CMake arguments go last to allow overriding defaults
set CMAKE_ARGS=%CMAKE_ARGS% %*
echo %CMAKE_ARGS%
cd build\windows\arm64 && cmake ..\..\.. %CMAKE_ARGS%
cmake --build . -j %NUMBER_OF_PROCESSORS% --config Release
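Because the script appends `%*` after its own defaults, any `-D` option given on the script's command line reaches CMake last and takes precedence over the defaults. A minimal sketch of such an invocation, assuming it is run from the repository root. `XNNPACK_ENABLE_ARM_FP16_VECTOR` (the option from the CMakeLists.txt block quoted above) is used purely as an illustration; whether turning it off is the right fix for this issue is not established here.

```bat
rem Illustrative only: pass an extra option through to CMake via %*.
rem The trailing -D is appended after the script's defaults, so it wins.
scripts\build-windows-arm64.cmd -DXNNPACK_ENABLE_ARM_FP16_VECTOR=OFF
```

The same pattern applies to any other option the script sets, such as `XNNPACK_ENABLE_ASSEMBLY`.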
Thanks for your support! I tried disabling 'XNNPACK_ENABLE_ASSEMBLY' and it works. But disabling this feature will hurt performance, right? Is it possible to enable 'XNNPACK_ENABLE_ASSEMBLY' on ARM64 Windows?
The Arm assembly is in .S files meant to be compiled with gcc or clang. As far as I know, there's no way to assemble them with Visual Studio.
The best solution is compiling with clang or clang-cl.
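One way to try clang-cl is to keep the Visual Studio generator and select its ClangCL toolset with CMake's `-T` option. The following is an untested sketch, not a confirmed recipe. It assumes that the Clang tools component of Visual Studio 2022 is installed and that the commands are run from the repository root. The `build\windows-clang\arm64` directory name is hypothetical. Whether the .S files actually assemble in this configuration is not confirmed in this thread, and a different generator may be needed.

```bat
rem Untested sketch: same layout as scripts\build-windows-arm64.cmd,
rem but with the ClangCL toolset and assembly left enabled.
rem The build directory name below is hypothetical.
mkdir build\windows-clang\arm64
cd build\windows-clang\arm64
cmake ..\..\.. -G "Visual Studio 17 2022" -A ARM64 -T ClangCL -DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ASSEMBLY=ON
cmake --build . -j %NUMBER_OF_PROCESSORS% --config Release
```

If configuration fails on the assembly sources, setting `-DXNNPACK_ENABLE_ASSEMBLY=OFF` falls back to the configuration already reported to work above.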
It seems part of the code hasn't been compiled. Any idea how to fix it? Thanks in advance!