Xilinx / mlir-aie

An MLIR-based toolchain for AMD AI Engine-enabled devices.

I8 vector contract lowering issue #1600

Closed erwei-xilinx closed 2 months ago

erwei-xilinx commented 2 months ago

When IREE-AMD-AIE lowers an i8 conv2d example through the vectorization pass, it fails with the error message below. The code snippet that triggers the failure follows it.

Error message:

/proj/xcohdstaff3/erweiw/iree-ipu/workspace/debug_numerical/elemwise/module_conv_2d_nhwc_hwcf_dispatch_0_amdaie_xclbin_fb/conv_2d_nhwc_hwcf_dispatch_0_conv_2d_nhwc__0.aiecc.mlir:995:11: error: failed to legalize operation 'vector.contract' that was explicitly marked illegal
    %18 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d3, d2)>, affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %16, %17, %14 : vector<1x4x8xi32>, vector<8x8xi32> into vector<1x4x8xi32>
          ^
/proj/xcohdstaff3/erweiw/iree-ipu/workspace/debug_numerical/elemwise/module_conv_2d_nhwc_hwcf_dispatch_0_amdaie_xclbin_fb/conv_2d_nhwc_hwcf_dispatch_0_conv_2d_nhwc__0.aiecc.mlir:995:11: note: see current operation: %47 = "vector.contract"(%44, %46, %41) <{indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d3, d2)>, affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)>], iterator_types = [#vector.iterator_type<parallel>, #vector.iterator_type<parallel>, #vector.iterator_type<parallel>, #vector.iterator_type<reduction>], kind = #vector.kind<add>}> : (vector<1x4x8xi32>, vector<8x8xi32>, vector<1x4x8xi32>) -> vector<1x4x8xi32>
/proj/xcohdstaff3/erweiw/iree-ipu/workspace/debug_numerical/elemwise/module_conv_2d_nhwc_hwcf_dispatch_0_amdaie_xclbin_fb/conv_2d_nhwc_hwcf_dispatch_0_conv_2d_nhwc__0.aiecc.mlir:0:0: error: 'builtin.module' op Failed to lower to LLVM

Code snippet:

    %6 = vector.transfer_read %alloc_23[%c0_13, %c0_13, %c0_13, %c0_13], %c0_i8_14 {in_bounds = [true, true, true]} : memref<1x1x4x8xi8, 2 : i32>, vector<1x4x8xi8>
    %7 = vector.transfer_read %alloc_24[%c0_13, %c0_13, %c0_13, %c0_13], %c0_i8_14 {in_bounds = [true, true, true]} : memref<1x1x8x8xi8, 2 : i32>, vector<1x8x8xi8>
    %8 = vector.transfer_read %alloc_20[%c0_13, %c0_13, %c0_13, %c0_13], %c0_i32_12 {in_bounds = [true, true, true]} : memref<1x1x4x8xi32, 2 : i32>, vector<1x4x8xi32>
    %9 = vector.extract %7[0] : vector<8x8xi8> from vector<1x8x8xi8>
    %10 = arith.extsi %6 : vector<1x4x8xi8> to vector<1x4x8xi32>
    %11 = arith.extsi %9 : vector<8x8xi8> to vector<8x8xi32>
    %12 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d3, d2)>, affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %10, %11, %8 : vector<1x4x8xi32>, vector<8x8xi32> into vector<1x4x8xi32>

Full IR dump: https://gist.github.com/erwei-xilinx/29c5529e166ecfd35a4525f87495e7d0
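
For readers less familiar with the indexing maps, the following is a rough plain-Python sketch of what the vector.contract in the snippet computes: the i8 operands are first sign-extended (as arith.extsi does), then result[d0][d1][d2] = acc[d0][d1][d2] + sum over d3 of lhs[d0][d1][d3] * rhs[d3][d2]. The helper names (sext_i8, wrap_i32, contract) are hypothetical and exist only for illustration; the 32-bit two's-complement wraparound is an assumption about the integer arithmetic, and none of this reflects how mlir-aie actually lowers the op.

```python
# Reference semantics only; this is NOT the mlir-aie lowering.
# Shapes in the issue: lhs 1x4x8, rhs 8x8, acc 1x4x8.

def sext_i8(x):
    """Interpret the low 8 bits of x as a signed value (mirrors arith.extsi from i8)."""
    x &= 0xFF
    return x - 256 if x >= 128 else x

def wrap_i32(x):
    """Wrap a Python int to signed 32-bit two's complement (assumed overflow behaviour)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

def contract(lhs, rhs, acc):
    """out[d0][d1][d2] = acc[d0][d1][d2] + sum_d3 lhs[d0][d1][d3] * rhs[d3][d2]."""
    n0, n1, n3 = len(lhs), len(lhs[0]), len(lhs[0][0])
    n2 = len(rhs[0])
    out = [[[0] * n2 for _ in range(n1)] for _ in range(n0)]
    for d0 in range(n0):          # parallel
        for d1 in range(n1):      # parallel
            for d2 in range(n2):  # parallel
                total = acc[d0][d1][d2]
                for d3 in range(n3):  # reduction
                    total = wrap_i32(total + lhs[d0][d1][d3] * rhs[d3][d2])
                out[d0][d1][d2] = total
    return out

# Example with the shapes from the issue: i8 inputs sign-extended before the contract.
lhs = [[[sext_i8(0xFF)] * 8 for _ in range(4)]]  # 1x4x8, every element is -1
rhs = [[2] * 8 for _ in range(8)]                # 8x8
acc = [[[10] * 8 for _ in range(4)]]             # 1x4x8 i32 accumulator
result = contract(lhs, rhs, acc)
```

A scalar reference like this can be handy for checking numerical results once the lowering succeeds, since the reduction runs only over d3 while d0, d1 and d2 index the output directly.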

erwei-xilinx commented 2 months ago

Cc: @jsetoain

erwei-xilinx commented 2 months ago

Thanks for working on this fix, @jsetoain! The issue is fixed.