ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License

Asm int8 gemm result isn't correct #985

Closed: daoxian closed this issue 1 year ago

daoxian commented 2 years ago

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v22.02
Build options: {'Werror': '1', 'debug': '1', 'neon': '1', 'opencl': '0', 'os': 'linux', 'arch': 'arm64-v8.2-a-sve'}
Git hash=unknown
arm_compute_version.embed

Platform:
AArch64, Armv9, SVE supported

Operating System: Ubuntu 20.04.3 LTS

Problem description: I compiled "examples/neon_gemm_qasymm8.cpp" as a test and ran:
cd build
./neon_gemm_qasymm8 16 16 12
The result was correct, but the GEMM kernel used the C++ code path instead of the asm-optimised one.

Then I modified the code (screenshot attached):

and changed validate_arguments() in "src/cpu/kernels/CpuGemmLowpOffsetContributionOutputStageKernel.cpp" to:
ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(output, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::S32);

to force the kernel to use the asm-optimised algorithms. After that, the result isn't correct (screenshot attached).

Could anyone give me some hints on how to use the optimised int8 GEMM algorithms? Any tips are appreciated!

morgolock commented 1 year ago

Hi @daoxian

ACL detects the hardware capabilities (number of CPU cores and CPU features) and chooses the best kernel possible for that particular configuration. All of this is done when calling configure(), as in https://github.com/ARM-software/ComputeLibrary/blob/main/examples/neon_gemm_qasymm8.cpp#L220
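
As a rough sketch of that pattern (following the example linked above; the shapes and QuantizationInfo values below are illustrative placeholders, not taken from this issue):

```cpp
// Minimal sketch: set up a QASYMM8 low-precision GEMM and let configure()
// pick the best kernel for the running CPU. Shapes and quantization
// parameters are illustrative placeholders.
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    const unsigned int M = 16, N = 16, K = 12;

    Tensor a, b, dst;

    // Quantized inputs carry a QuantizationInfo (scale, offset).
    a.allocator()->init(TensorInfo(TensorShape(K, M), 1, DataType::QASYMM8, QuantizationInfo(0.5f, 10)));
    b.allocator()->init(TensorInfo(TensorShape(N, K), 1, DataType::QASYMM8, QuantizationInfo(0.25f, 5)));

    // Accumulate into S32; an output stage can requantize to QASYMM8 afterwards.
    dst.allocator()->init(TensorInfo(TensorShape(N, M), 1, DataType::S32));

    // configure() queries the CPU features and instantiates the best available
    // kernel (assembly where supported) for this data-type/shape combination.
    NEGEMMLowpMatrixMultiplyCore gemm;
    gemm.configure(&a, &b, nullptr, &dst);

    a.allocator()->allocate();
    b.allocator()->allocate();
    dst.allocator()->allocate();

    // ... fill a and b with quantized values here ...

    gemm.run();
    return 0;
}
```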

During configure() the operator instantiates the correct assembly kernel: https://github.com/ARM-software/ComputeLibrary/blob/main/src/cpu/operators/CpuGemmLowpMatrixMultiplyCore.cpp#L128
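
Rather than patching validate_arguments() inside the library to force a path, you can also check a configuration against the operator's public validate() first. A minimal sketch, assuming QASYMM8 inputs with an S32 output (shapes and quantization values are illustrative placeholders):

```cpp
// Minimal sketch: ask the operator whether a given tensor configuration is
// supported before modifying library internals. Values are illustrative.
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"

#include <iostream>

using namespace arm_compute;

int main()
{
    const TensorInfo a(TensorShape(12U, 16U), 1, DataType::QASYMM8, QuantizationInfo(0.5f, 10)); // M=16, K=12
    const TensorInfo b(TensorShape(16U, 12U), 1, DataType::QASYMM8, QuantizationInfo(0.25f, 5)); // K=12, N=16
    const TensorInfo dst(TensorShape(16U, 16U), 1, DataType::S32);                               // M=16, N=16

    const Status status = NEGEMMLowpMatrixMultiplyCore::validate(&a, &b, nullptr, &dst);
    if(bool(status))
    {
        std::cout << "Configuration supported" << std::endl;
    }
    else
    {
        std::cout << "Not supported: " << status.error_description() << std::endl;
    }
    return 0;
}
```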

Another example you can look at is the fixture in the validation suite: https://github.com/ARM-software/ComputeLibrary/blob/main/tests/validation/fixtures/GEMMLowpFixture.h#L129

Hope this helps