ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License
2.82k stars 774 forks source link

CpuGemmConv2d operator has incorrect setting for INT8_NHWC Data. #1039

Closed GGGGxxxxxxxxr closed 2 days ago

GGGGxxxxxxxxr commented 1 year ago

Hi!

I have just configured the NEConvolutionLayer with NHWC_QASYMM8_SIGNED Tensor, and I have found there are two issues here:

  1. All the INT8 NHWC Convolution Operator would not skip reshaping kernel after GEMM, which does not make any sense, as GEMMOUTPUT equals to the output tensor of layer in terms of continuous data within memory under the layout of NHWC.
  2. 1x1 ConvolutionKernel would still call Im2Col kernel, which does not make sense. 1x1 Kernel under NHWC layout should be simple GEMM after weight reshaping.

The previous two problems are caused by skip_im_col_info() function in CpuGemmConv2d.cpp, and the actual problem is in the lower part of CpuGemmLowpMatrixMultiplyCore::Validate().

3*.The reshaping kernel itself has been poorly implemented under NHWC layout in kernel level, as the value assignment is implemented value by value, which could be easily optimized with MEMCPY() row by row, just simply setting WindowDim::x as (0,1,1) and set the corresponding iterator.

I have modified this problem myself in CpuGemmConv2d.cpp and I have tested with random numbers, the result is mathematically correct which I have verified, I suppose this issue should be fixed, otherwise the performance for this setting would be greatly affected.

morgolock commented 2 days ago

Closing this due to the lack of activity. If this is still a problem please reopen.

It would be helpful if you could provide a standalone test reproducing the problem.