cornell-zhang / hcl-dialect

HeteroCL-MLIR dialect for accelerator design
https://cornell-zhang.github.io/heterocl/index.html
Other
38 stars 17 forks source link

[Frontend] Loop naming collision causes incorrect results #55

Closed chhzh123 closed 1 year ago

chhzh123 commented 2 years ago

In this case, there are two y and two x loops here causing the naming collision. B.axis[1] then refer to the inner x loop since we directly use the name of the loop_handle to refer to some loops, so if two loops have the same name, it may cause problems.

def test_compute_at_with_reuse_2D_complex():
    hcl.init()
    A = hcl.compute((10, 10), lambda y, x: x + y, "A")
    B = hcl.compute((8, 8), lambda y, x: A[y, x] + A[y+1, x+1] + A[y+2, x+2], "B")
    s = hcl.create_schedule([A])
    s[A].compute_at(s[B], B.axis[1])
    s[B].split(B.axis[1], 4)
    ir = hcl.lower(s)
module {
  func @top() -> memref<8x8xi32> attributes {extra_itypes = "", extra_otypes = "s"} {
    %0 = memref.alloc() {name = "A"} : memref<10x10xi32>
    %1 = memref.alloc() {name = "B"} : memref<8x8xi32>
    affine.for %arg0 = 0 to 8 {
      affine.for %arg1 = 0 to 8 {
        affine.for %arg2 = #map0(%arg0) to #map1(%arg0) {
          affine.for %arg3 = #map0(%arg1) to #map1(%arg1) {
            %7 = arith.addi %arg3, %arg2 : index
            %8 = arith.index_cast %7 : index to i32
            affine.store %8, %0[%arg2, %arg3] {to = "A"} : memref<10x10xi32>
          } {loop_name = "x"}
        } {loop_name = "y", stage_name = "A"}
        %2 = affine.load %0[%arg0, %arg1] {from = "A"} : memref<10x10xi32>
        %3 = affine.load %0[%arg0 + 1, %arg1 + 1] {from = "A"} : memref<10x10xi32>
        %4 = arith.addi %2, %3 : i32
        %5 = affine.load %0[%arg0 + 2, %arg1 + 2] {from = "A"} : memref<10x10xi32>
        %6 = arith.addi %4, %5 : i32
        affine.store %6, %1[%arg0, %arg1] {to = "B"} : memref<8x8xi32>
      } {loop_name = "x"}
    } {loop_name = "y", stage_name = "B"}
    return %1 : memref<8x8xi32>
  }
}

We need to ensure the loop names are different in the frontend, or use a unique pointer reference pointing to specific loops. (The latter one is probably not workable in MLIR.)

chhzh123 commented 2 years ago

Should throw out an error! Otherwise, it is hard to know where is the problem for large designs.

zzzDavid commented 1 year ago

test_compute_at_with_reuse_2D_complex is the relevant test case, after compute_at is applied, there will two loops named x and two loops named y, hence the name collision

zzzDavid commented 1 year ago

Fixed this issue by uniquely naming loop axes, this is done by the UniqueName class in context.py in the frontend: cornell-zhang/heterocl@ceab59c44bcfde8ee4e7874a99d4204379098b16

The IR after adding loop unique naming:

#map0 = affine_map<(d0, d1) -> (d0 + d1 * 4)>
#map1 = affine_map<(d0, d1) -> (d0 + d1)>
#map2 = affine_map<(d0, d1, d2) -> (d0 + d1 + d2 * 4)>
module {
  func.func @top() -> memref<8x8xi32> attributes {itypes = "", otypes = "s"} {
    %0 = memref.alloc() {name = "A"} : memref<10x10xi32>
    %1 = memref.alloc() {name = "B"} : memref<8x8xi32>
    affine.for %arg0 = 0 to 8 {
      affine.for %arg1 = 0 to 2 {
        affine.for %arg2 = 0 to 4 {
          %3 = affine.apply #map0(%arg2, %arg1)
          affine.for %arg3 = 0 to 3 {
            %14 = affine.apply #map1(%arg0, %arg3)
            affine.for %arg4 = 0 to 3 {
              %15 = affine.apply #map2(%arg4, %arg2, %arg1)
              %16 = arith.addi %15, %14 {unsigned} : index
              %17 = arith.index_cast %16 : index to i32
              affine.store %17, %0[%14, %15] {to = "A"} : memref<10x10xi32>
            } {loop_name = "x"}
          } {loop_name = "y", op_name = "A"}
          %4 = affine.load %0[%arg0, %3] {from = "A"} : memref<10x10xi32>
          %5 = affine.load %0[%arg0 + 1, %3 + 1] {from = "A"} : memref<10x10xi32>
          %6 = arith.extsi %4 : i32 to i33
          %7 = arith.extsi %5 : i32 to i33
          %8 = arith.addi %6, %7 : i33
          %9 = affine.load %0[%arg0 + 2, %3 + 2] {from = "A"} : memref<10x10xi32>
          %10 = arith.extsi %8 : i33 to i34
          %11 = arith.extsi %9 : i32 to i34
          %12 = arith.addi %10, %11 : i34
          %13 = arith.trunci %12 : i34 to i32
          affine.store %13, %1[%arg0, %3] {to = "B"} : memref<8x8xi32>
        } {loop_name = "x_0.inner"}
      } {loop_name = "x_0.outer"}
    } {loop_name = "y_0", op_name = "B"}
    %2 = hcl.create_op_handle "B"
    return %1 : memref<8x8xi32>
  }
}

Note that the second pair of x and y are automatically named to x_0 and y_0