google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

Failed to compile XNNPACK on WoA(Windows on ARM) device. #6558

Open zhanweiw opened 1 month ago

zhanweiw commented 1 month ago

It seems part of the code hasn't been compiled. Any idea how to fix it? Thanks in advance!

FAILED: subgraph-size-test.exe
C:\windows\system32\cmd.exe /C "cd . && C:\Programs\Python\Python311-arm64\Lib\site-packages\cmake\data\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\subgraph-size-test.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\arm64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\arm64\mt.exe --manifests  -- C:\Programs\LLVM\bin\lld-link.exe /nologo CMakeFiles\subgraph-size-test.dir\test\subgraph-size.c.obj  /out:subgraph-size-test.exe /implib:subgraph-size-test.lib /pdb:subgraph-size-test.pdb /version:0.0 /machine:ARM64 /debug /INCREMENTAL /subsystem:console  XNNPACK.lib  cpuinfo\cpuinfo.lib  pthreadpool\pthreadpool.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
LINK Pass 1: command "C:\Programs\LLVM\bin\lld-link.exe /nologo CMakeFiles\subgraph-size-test.dir\test\subgraph-size.c.obj /out:subgraph-size-test.exe /implib:subgraph-size-test.lib /pdb:subgraph-size-test.pdb /version:0.0 /machine:ARM64 /debug /INCREMENTAL /subsystem:console XNNPACK.lib cpuinfo\cpuinfo.lib pthreadpool\pthreadpool.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:CMakeFiles\subgraph-size-test.dir/intermediate.manifest CMakeFiles\subgraph-size-test.dir/manifest.res" failed (exit code 1) with the following output:
lld-link: error: undefined symbol: xnn_f16_vabs_ukernel__neonfp16arith_u16
>>> referenced by C:\zhanweiw\tf_lite\XNNPACK\src\configs\unary-elementwise-config.c:181
>>>               XNNPACK.lib(unary-elementwise-config.c.obj):(init_f16_abs_config)
>>> referenced by C:\zhanweiw\tf_lite\XNNPACK\src\configs\unary-elementwise-config.c:181
>>>               XNNPACK.lib(unary-elementwise-config.c.obj):(init_f16_abs_config)
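
A log like the one above can be summarized with a short script when many kernels are missing. This is a minimal sketch, assuming the lld-link diagnostic format shown in this log ("undefined symbol:" followed by ">>>" reference lines). The function name `undefined_symbols` is hypothetical and is not part of XNNPACK or LLVM.

```python
import re

# Matches the diagnostic line shown in the log above:
#   lld-link: error: undefined symbol: <name>
UNDEFINED_RE = re.compile(r"^lld-link: error: undefined symbol: (\S+)")
# Matches the continuation line of the form:
#   >>>      LIBRARY(OBJECT):(function)
REFERENCE_RE = re.compile(r"^>>>\s+([^\s(]+)\(([^\s)]+)\):\((\w+)\)")


def undefined_symbols(log_text):
    """Map each undefined symbol to the sorted (object, function) pairs that reference it."""
    result = {}
    current = None
    for line in log_text.splitlines():
        undefined = UNDEFINED_RE.match(line)
        if undefined:
            current = undefined.group(1)
            result.setdefault(current, set())
            continue
        reference = REFERENCE_RE.match(line)
        if reference and current is not None:
            # A set collapses the repeated reference lines lld-link prints.
            result[current].add((reference.group(2), reference.group(3)))
    return {symbol: sorted(refs) for symbol, refs in result.items()}
```

Feeding it the saved build output gives one entry per missing symbol, which makes it easier to see whether only the fp16 kernels are affected.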
fbarchard commented 3 weeks ago

Hi, thanks for the report.

A quick try with blaze (which is like bazel) builds the abs bench for me:

blaze build --config=lexan_x86_64 -c opt //third_party/XNNPACK/bench:abs_bench

The microkernel is checked in, declared and used:

grep xnn_f16_vabs_ukernel__neonfp16arith_u16 . -r
./src/amalgam/gen/neonfp16arith.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/xnnpack/vunary.h:DECLARE_F16_VABS_UKERNEL_FUNCTION(xnn_f16_vabs_ukernel__neonfp16arith_u16)
./src/f16-vunary/gen/f16-vabs-neonfp16arith-u16.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./bench/f16-vabs.cc: xnn_f16_vabs_ukernel__neonfp16arith_u16,
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.yaml:- name: xnn_f16_vabs_ukernel__neonfp16arith_u16

The one that matters for linking is the definition in ./src/amalgam/gen/neonfp16arith.c, which gets built and linked on Arm systems unless fp16 is disabled.

In CMakeLists.txt, around line 509, this block appends the fp16 microkernels:

IF(XNNPACK_ENABLE_ARM_FP16_VECTOR)
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_MICROKERNEL_SRCS})
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_AARCH64_MICROKERNEL_SRCS})
ENDIF()
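
One way to check whether the fp16 option named in that IF() was actually on for a given build is to read the generated CMakeCache.txt, whose entries have the form NAME:TYPE=VALUE. The sketch below assumes XNNPACK_ENABLE_ARM_FP16_VECTOR is declared as a cache option and therefore shows up in that file. The helper name `read_cache_options` is hypothetical.

```python
def read_cache_options(cache_text, prefix="XNNPACK_ENABLE_"):
    """Return {name: value} for CMake cache entries whose name starts with prefix."""
    options = {}
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#" / "//" comment lines CMake writes.
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Cache keys look like NAME:TYPE; keep only NAME.
        name = key.split(":", 1)[0]
        if name.startswith(prefix):
            options[name] = value
    return options


if __name__ == "__main__":
    import sys

    # Usage: python check_cache.py path/to/CMakeCache.txt
    with open(sys.argv[1], encoding="utf-8") as cache_file:
        for name, value in sorted(read_cache_options(cache_file.read()).items()):
            print(f"{name} = {value}")
```

If XNNPACK_ENABLE_ARM_FP16_VECTOR is missing or OFF in the output, the neonfp16arith sources would not be appended, which would be consistent with the undefined symbol reported above.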

It's possible our CMake is missing something for Windows. In scripts/build-windows-arm64.cmd there are CMake parameters for a Visual Studio 2022 (version 17) build:

mkdir build\windows
mkdir build\windows\arm64

set CMAKE_ARGS=-DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ASSEMBLY=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_BF16=OFF
set CMAKE_ARGS=%CMAKE_ARGS% -G="Visual Studio 17 2022" -A=ARM64

rem User-specified CMake arguments go last to allow overriding defaults
set CMAKE_ARGS=%CMAKE_ARGS% %*

echo %CMAKE_ARGS%

cd build\windows\arm64 && cmake ..\..\.. %CMAKE_ARGS%
cmake --build . -j %NUMBER_OF_PROCESSORS% --config Release
zhanweiw commented 3 weeks ago

Thanks for your support! I tried disabling 'XNNPACK_ENABLE_ASSEMBLY' and the build works. But disabling this feature will hurt performance, right? Is it possible to enable 'XNNPACK_ENABLE_ASSEMBLY' on ARM64 Windows?

fbarchard commented 2 weeks ago

The Arm assembly is in .S files meant to be compiled with gcc or clang. As far as I know, there's no way to assemble them with Visual Studio.

The best solution is to compile with clang or clang-cl.
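
As a rough illustration of that suggestion, the sketch below assembles a CMake configure command that selects clang-cl through the standard CMAKE_C_COMPILER and CMAKE_CXX_COMPILER variables. The Ninja generator, the helper name `clang_cl_configure_args`, and the assumption that clang-cl is on PATH are all illustrative choices. Whether the .S files then assemble and link on Windows on ARM is not confirmed in this thread.

```python
def clang_cl_configure_args(source_dir, extra_args=()):
    """Build a CMake configure command line that selects clang-cl (hypothetical helper)."""
    args = [
        "cmake", source_dir,
        "-G", "Ninja",                     # assumption: Ninja is installed
        "-DCMAKE_C_COMPILER=clang-cl",     # assumption: clang-cl is on PATH
        "-DCMAKE_CXX_COMPILER=clang-cl",
        "-DXNNPACK_LIBRARY_TYPE=static",   # same default as build-windows-arm64.cmd
        "-DXNNPACK_ENABLE_ASSEMBLY=ON",    # the option discussed in this thread
    ]
    # User-specified arguments go last so they can override the defaults,
    # mirroring the existing .cmd script.
    args.extend(extra_args)
    return args


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(..., check=True) from a build directory to configure.
    print(" ".join(clang_cl_configure_args("../../..")))
```

Because later -D definitions of the same variable win, passing "-DXNNPACK_ENABLE_ASSEMBLY=OFF" in extra_args falls back to the configuration that is already reported to build.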