apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[Bug] dense.py in matmul tensor_b.shape mismatch after autoscheduler tuning in ARM CPU #10309

Closed: xueshengke closed this issue 2 months ago

xueshengke commented 2 years ago
File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
            rv = local_pyfunc(*pyargs)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/relay/backend/te_compiler.py", line 311, in lower_call
            best_impl, outputs = select_implementation(op, call.attrs, inputs, ret_type, target)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/relay/backend/te_compiler.py", line 189, in select_implementation
            outs = best_plevel_impl.compute(attrs, inputs, out_type)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/relay/op/op.py", line 126, in compute
            return _OpImplementationCompute(self, attrs, inputs, out_type)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
            raise get_last_ffi_error()
          3: TVMFuncCall
          2: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::relay::__mk_TVM6::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
          1: tvm::relay::OpImplementation::Compute(tvm::Attrs const&, tvm::runtime::Array<tvm::te::Tensor, void> const&, tvm::Type const&)
          0: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), TVMFuncCreateFromCFunc::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#2}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
            rv = local_pyfunc(*pyargs)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/relay/op/strategy/generic.py", line 833, in _compute_dense
            return [topi_compute(*args)]
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/topi/nn/dense.py", line 184, in dense
            return matmul(data, weight, bias, out_dtype, False, True, auto_scheduler_rewritten_layout)
          File "/host/mlperf_environment_xsk/tvm/tvm_build_armv9/tvm/python/tvm/topi/nn/dense.py", line 81, in matmul
            out_dim, red_dim = tensor_b.shape
ValueError: too many values to unpack (expected 2)

Expected behavior

Is tensor_b.shape expected to change after auto-scheduler tuning? The dense compute assumes tensor_b stays 2-D so that the unpack out_dim, red_dim = tensor_b.shape succeeds.

Actual behavior

After tuning, tensor_b.shape no longer has exactly two dimensions, so the unpack out_dim, red_dim = tensor_b.shape in topi/nn/dense.py raises ValueError: too many values to unpack (expected 2).
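A minimal, TVM-free sketch of the failure mode: when the auto-scheduler rewrites the weight layout, the weight tensor can gain extra packing dimensions (the 3-D shape below is a hypothetical example), and the 2-tuple unpack then raises exactly the error in the traceback.

```python
def unpack_weight_shape(shape):
    # Mirrors the unpack in topi/nn/dense.py: valid only for a 2-D weight.
    out_dim, red_dim = shape
    return out_dim, red_dim

# Before tuning: a plain 2-D weight unpacks fine.
print(unpack_weight_shape((1000, 1280)))

# After a layout rewrite: a packed (hypothetical) 3-D weight breaks the unpack.
try:
    unpack_weight_shape((125, 1280, 8))
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)
```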

Environment

Operating System: Ubuntu 21.04 (Linux)
TVM version: 0.9 dev
Hardware: ARM CPU
Model: mobilenet-v3.tflite

Steps to reproduce

This error occurs on ARM CPU only after auto-scheduler tuning; before tuning, the same model compiles without the error.

cgerum commented 2 years ago

This can be solved by changing: https://github.com/apache/tvm/blob/8f6fa8f2c41406cb54d01647ba8731e4ceb8f4ab/python/tvm/relay/op/strategy/arm_cpu.py#L466

to:

     wrap_compute_dense(topi.nn.dense, need_auto_scheduler_layout=True),

The need_auto_scheduler_layout=True flag seems to be missing from all dense implementations registered for arm_cpu.
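For context, here is a simplified, TVM-free sketch of what the flag changes (names mirror tvm/relay/op/strategy/generic.py, but this is not TVM source, and the layout string is hypothetical): when need_auto_scheduler_layout is True, the wrapper forwards the auto-scheduler's rewritten-layout string to the compute, which can then recover the original (out_dim, red_dim) from a packed weight instead of unpacking a no-longer-2-D shape.

```python
def wrap_compute_dense(topi_compute, need_auto_scheduler_layout=False):
    # Simplified analogue of the strategy wrapper: optionally pass the
    # rewritten layout through to the underlying compute function.
    def _compute_dense(inputs, rewritten_layout):
        args = list(inputs)
        if need_auto_scheduler_layout:
            args.append(rewritten_layout)  # compute sees the rewritten layout
        return topi_compute(*args)
    return _compute_dense

# Without the flag, the layout string never reaches the compute:
legacy = wrap_compute_dense(lambda a, b: (a, b))
fixed = wrap_compute_dense(lambda a, b, layout: layout,
                           need_auto_scheduler_layout=True)
print(legacy(("data", "weight"), "ignored"))   # layout dropped
print(fixed(("data", "weight"), "NK16n"))      # layout forwarded
```

This is why adding need_auto_scheduler_layout=True to the arm_cpu registration fixes the unpack error: dense.py receives auto_scheduler_rewritten_layout and can take the layout-aware path instead of the plain 2-D unpack.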