ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License

Result of Deconvolution not correct #1017

Closed by daoxian 1 year ago

daoxian commented 1 year ago

Output of 'strings libarm_compute.so | grep arm_compute_version':

        arm_compute_version=v22.08 Build options: {'compiler_prefix': '/usr/bin/', 'toolchain_prefix': '/usr/bin/', 'Werror': '0', 'debug': '1', 'neon': '1', 'opencl': '0', 'os': 'linux', 'arch': 'armv8.2-a', 'benchmark_tests': '0', 'validation_tests': '0', 'examples': '1', 'extra_cxx_flags': '-fPIC'} Git hash=unknown

Platform: Arm v8.2

Operating System: CentOS 7

Problem description: I cannot get the correct result when using NEDeconvolutionLayer with stride=2 and pad=1:

        int stride=2, pad=1;
        npy0.init_tensor(src0, DataType::F32);
        src0.allocator()->allocate();
        npy0.fill_tensor(src0);

        npy1.init_tensor(src1, DataType::F32);
        src1.allocator()->allocate();
        npy1.fill_tensor(src1);

        const arm_compute::PadStrideInfo padstride_info(stride, stride, pad, pad, pad, pad, arm_compute::DimensionRoundingType::FLOOR);
        auto out_dim = arm_compute::deconvolution_output_dimensions(src0.info()->tensor_shape().x(), src0.info()->tensor_shape().y(), src1.info()->tensor_shape().x(), src1.info()->tensor_shape().y(), padstride_info);
        TensorShape output_shape = arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(out_dim, *src0.info(), *src1.info());
        dst.allocator()->init(TensorInfo(output_shape, 1, DataType::F32));
        dst.allocator()->allocate();
        deconv.configure(&src0, &src1, nullptr, &dst, padstride_info);

The input and output data are as follows:

data.zip

All data layouts are NCHW. Any clues will be appreciated! Thanks!

daoxian commented 1 year ago

I've found the cause: the weights (src1) should be in the [width, height, IFM, OFM] layout.