ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License

Unsigned int overflow in PoolingDepthfirstGeneric #1037

Closed alvoron closed 4 months ago

alvoron commented 1 year ago

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v23.02
Build options: {'neon': '1', 'opencl': '0', 'openmp': '0', 'cppthreads': '1', 'examples': '0', 'Werror': '0', 'gemm_tuner': '0', 'reference_openmp': '0', 'validation_tests': '0', 'benchmark_tests': '0', 'data_layout_support': 'all', 'build_dir': '<project_dir>/thirdparty/ComputeLibrary', 'install_dir': '<project_dir>/thirdparty/ComputeLibrary/install', 'arch': 'armv8.2-a', 'debug': '1', 'asserts': '1', 'logging': '1', 'os': 'macos', 'build': 'native', 'compiler_prefix': '/usr/bin/', 'extra_cxx_flags': '-fPIC -fsigned-char -ffunction-sections -fdata-sections -fdiagnostics-show-option -Wundef -Wreturn-type -Wunused-variable -Wswitch -Wno-macro-redefined -Wno-undef -Wno-missing-declarations -fvisibility-inlines-hidden -Wall -Wno-unknown-pragmas -fvisibility=internal -mcpu=native -Wno-undef -Wno-error=return-stack-address'}
Git hash=b'f8f7ede7a01eb5cd9d06060b4d2f2d1404d93f29'

Platform: Apple M1

Operating System: macOS 12.6

Problem description: I hit an EXC_BAD_ACCESS crash in PoolingDepthfirstGeneric::compute_tile_padded() while using NEPoolingLayer with the NHWC layout. The valid_rows and valid_cols variables in PoolingDepthfirstGeneric::compute_tile_padded() are unsigned, so they wrap around when the padding sum is greater than the pool window's rows or cols:

const auto valid_rows = this->m_args.pool_window.rows - (pad_top + pad_bottom);
const auto valid_cols = this->m_args.pool_window.cols - (pad_left + pad_right); // 2 - (0 + 4) = 4294967294

Before running the NEPoolingLayer kernel, validate() was called to check the configuration.
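
The wraparound described above can be reproduced in isolation. The following is a minimal sketch, not the library's code: it assumes the window and padding values are 32-bit unsigned integers (consistent with the 4294967294 result quoted above), and both helper names are hypothetical. The clamped variant is only one possible guard and is not necessarily what an upstream fix would do.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the subtraction from the snippet above. With unsigned operands the
// result wraps modulo 2^32 when the padding sum exceeds the window size.
constexpr std::uint32_t valid_extent_wrapping(std::uint32_t window,
                                              std::uint32_t pad_before,
                                              std::uint32_t pad_after)
{
    return window - (pad_before + pad_after);
}

// Hypothetical guard: saturate at zero instead of wrapping.
// (Assumes pad_before + pad_after itself does not overflow.)
constexpr std::uint32_t valid_extent_clamped(std::uint32_t window,
                                             std::uint32_t pad_before,
                                             std::uint32_t pad_after)
{
    return (pad_before + pad_after >= window) ? 0u
                                              : window - (pad_before + pad_after);
}

// The case from the report: window of 2 columns, padding 0 + 4.
static_assert(valid_extent_wrapping(2u, 0u, 4u) == 4294967294u, "wraps around");
static_assert(valid_extent_clamped(2u, 0u, 4u) == 0u, "saturates at zero");
```

With the values from the report, valid_extent_wrapping(2, 0, 4) yields 4294967294, while valid_extent_clamped(2, 0, 4) yields 0. A huge "valid" extent used as a loop bound or pointer offset would be consistent with the out-of-bounds access seen as EXC_BAD_ACCESS.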

alvoron commented 8 months ago

The issue is not reproducible on Raspberry Pi.

morgolock commented 8 months ago

Hi @alvoron

In order to help I'll need more details.

Could you please share more information about the workload configuration that caused the problem on macOS? If you build ACL with logging=1, the library will print the arguments passed to ::configure().

alvoron commented 8 months ago

[ComputeLibrary][31-01-2024 11:17:58][INFO]  arm_compute::cpu::CpuPool2d::configure() : 
 src: Shape=112,112,64,1,DataLayout=NHWC,DataType=F32
 dst: Shape=56,56,64,1,DataLayout=NHWC,DataType=F32
 pool_info: {Type=MAX,DataLayout=NHWC,IsGlobalPooling=0,PoolSize=3,3,PadStride=2,2;1,1,1,1}
 indices: nullptr
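
As a sanity check on the logged configuration, the 112 -> 56 change between the src and dst shapes matches the usual pooling output-size formula. The sketch below is an illustration under stated assumptions: it reads PadStride=2,2;1,1,1,1 as a stride of 2 with 1 element of padding on each side, assumes floor rounding, and uses a hypothetical helper name that is not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Output extent along one pooled dimension with floor rounding:
//   out = floor((in + pad_before + pad_after - window) / stride) + 1
// Assumes in + pad_before + pad_after >= window and stride > 0.
constexpr std::uint32_t pooled_extent(std::uint32_t in,
                                      std::uint32_t window,
                                      std::uint32_t stride,
                                      std::uint32_t pad_before,
                                      std::uint32_t pad_after)
{
    return (in + pad_before + pad_after - window) / stride + 1;
}

// Values read from the log: in = 112, PoolSize = 3, stride = 2, pad = 1 + 1.
static_assert(pooled_extent(112u, 3u, 2u, 1u, 1u) == 56u, "matches dst shape");
```

Under these assumptions, (112 + 1 + 1 - 3) / 2 + 1 = 55 + 1 = 56, which agrees with the dst shape in the log.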

morgolock commented 6 months ago

Hi @alvoron

Thanks, we managed to reproduce and we are working to fix the problem.

morgolock commented 6 months ago

Hi @alvoron

This is the patch fixing the problem: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11290

The fix will be included in 24.05.

Hope this helps

alvoron commented 6 months ago

Thank you for the patch. I'll test it as soon as we upgrade ACL to 24.05.

morgolock commented 4 months ago

Closing as this was already delivered in the last release. Please reopen if you still require support.