google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

ARMv7 (with NEON) is not supported on Linux, only on Android #6210

Open cv-on-device opened 6 months ago

cv-on-device commented 6 months ago

Hi, I find that XNNPACK does not support ARMv7 (with NEON) on Linux, but only supports ARMv7 (with NEON) on Android. Is that correct?

Partial error log:

/tensorflow/build_arm/xnnpack/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x2c4-minmax-fp32-armsimd32.c:80:15: error: unknown type name ‘int16x2_t’
    const int16x2_t va1c02 = __sxtb16(va1);

fbarchard commented 6 months ago

armsimd32 is ARMv6-style SIMD: it operates on 4 bytes (one 32-bit register) at a time. It provides optimized kernels for CPUs without NEON.
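To illustrate what "4 bytes at a time" means for the intrinsic in the error log, here is a minimal portable sketch of what `__sxtb16` computes: it sign-extends bytes 0 and 2 of a 32-bit word into two 16-bit lanes. The name `emulated_sxtb16` is hypothetical and is not part of XNNPACK or ACLE; the real kernels call the intrinsic from `arm_acle.h`.

```c
#include <stdint.h>

/* Hypothetical helper (not part of XNNPACK or ACLE): portable emulation of
 * the value the ACLE intrinsic __sxtb16 produces. Byte 0 and byte 2 of the
 * input are each sign-extended to 16 bits and packed into one 32-bit word. */
static uint32_t emulated_sxtb16(uint32_t x) {
  /* (v ^ 0x80) - 0x80 sign-extends an 8-bit value held in an int. */
  const int lo = (int)((x & 0xFFu) ^ 0x80u) - 0x80;          /* byte 0 */
  const int hi = (int)(((x >> 16) & 0xFFu) ^ 0x80u) - 0x80;  /* byte 2 */
  return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}
```

For example, `emulated_sxtb16(0xFF7F0180u)` picks byte 0 (`0x80`, i.e. -128) and byte 2 (`0x7F`, i.e. 127) and returns `0x007FFF80`.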

In bazel there is a section with the build options applied:

xnnpack_cc_library(
    name = "armsimd32_bench_microkernels",
    aarch32_copts = [
        "-marm",
        "-march=armv6",
        "-mfpu=vfp",
        "-munaligned-access",
    ],
    aarch32_srcs = ALL_ARMSIMD32_MICROKERNEL_SRCS,
    gcc_copts = xnnpack_gcc_std_copts() + [
        "-fno-fast-math",
        "-fno-math-errno",
    ],
    msvc_copts = xnnpack_msvc_std_copts(),
    deps = [
        ":common",
        ":math",
        ":microkernels_h",
        ":microparams",
        ":prefetch",
        ":tables",
        ":unaligned",
    ],
)

CMakeLists.txt has similar options:

  SET_PROPERTY(SOURCE ${ALL_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -marm ")
  SET_PROPERTY(SOURCE ${PROD_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -marm ")
  SET_PROPERTY(SOURCE ${ALL_ARMSIMD32_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -march=armv6 -mfpu=vfp -munaligned-access ")
  SET_PROPERTY(SOURCE ${PROD_ARMSIMD32_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -march=armv6 -mfpu=vfp -munaligned-access ")

The types come from the ARM ACLE header, included with #include <arm_acle.h>:

/* 8.5.5 Packing and unpacking */
#if defined(__ARM_FEATURE_SIMD32) && __ARM_FEATURE_SIMD32
typedef int32_t int8x4_t;
typedef int32_t int16x2_t;
typedef uint32_t uint8x4_t;
typedef uint32_t uint16x2_t;
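
Since the typedefs above are guarded by `__ARM_FEATURE_SIMD32`, one way to narrow down the error is to check whether the compiler defines that macro for the flags actually in use. The sketch below is only a diagnostic aid; `HAS_ARM_SIMD32` and `simd32_status` are hypothetical names, not XNNPACK symbols, and a missing macro is only one possible cause of the `unknown type name` error.

```c
#include <stdio.h>

/* Hypothetical diagnostic macro: mirrors the guard used around the
 * int8x4_t / int16x2_t typedefs in the arm_acle.h excerpt. */
#if defined(__ARM_FEATURE_SIMD32) && __ARM_FEATURE_SIMD32
#define HAS_ARM_SIMD32 1
#else
#define HAS_ARM_SIMD32 0
#endif

/* Returns a human-readable status for the current compiler flags. */
static const char* simd32_status(void) {
  return HAS_ARM_SIMD32
      ? "__ARM_FEATURE_SIMD32 is set: the guard around the typedefs is satisfied"
      : "__ARM_FEATURE_SIMD32 is not set: the guard around the typedefs is not satisfied";
}
```

Compiling a small file containing this with the same compiler and the same `-march`/`-mfpu` flags as the failing source, then printing `simd32_status()`, shows which branch of the guard is taken.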

Can you confirm which build system and compiler you used, and that -march=armv6 is applied to qs8-qc8w-igemm-2x2c4-minmax-fp32-armsimd32.c?

These kernels are meant for the Cortex-M series, which doesn't have NEON. For aarch32 builds that do have NEON, such as Raspberry Pi, the Linux builds should work, but no script is provided.

cv-on-device commented 6 months ago

@fbarchard Thank you for your help. Why does this new version not provide a script for ARMv7 (with NEON) on Linux?

fbarchard commented 6 months ago

There is an ARMv7 script for Android. When I tried it with NDK 21, it hit a build error with I8MM because an old version of clang was being used, so I made CMake check the clang version and disable I8MM.

Are you using scripts/build-local.sh?

alankelly commented 3 months ago

Is this issue still relevant? Can we close?

cv-on-device commented 3 months ago

> Is this issue still relevant? Can we close?

Sure, thank you