cornell-zhang / hcl-dialect

HeteroCL-MLIR dialect for accelerator design
https://cornell-zhang.github.io/heterocl/index.html

[Frontend][Quantization] Incorrect type hint from `hcl.create_scheme` #32

Closed zzzDavid closed 2 years ago

zzzDavid commented 2 years ago

Description

Example

HeteroCL Program:

import heterocl as hcl
import numpy as np

def test_resize():
    hcl.init()

    def algorithm(A):
        return hcl.compute(A.shape, lambda x: A[x] + 1, "B")

    A = hcl.placeholder((10,), dtype=hcl.UInt(32))

    # Downsize the output stage B to a 2-bit unsigned integer.
    scheme = hcl.create_scheme([A], algorithm)
    scheme.downsize(algorithm.B, hcl.UInt(2))
    s = hcl.create_schedule_from_scheme(scheme)
    f = hcl.build(s)

    a = np.random.randint(100, size=(10,))
    _A = hcl.asarray(a, dtype=hcl.UInt(32))
    _B = hcl.asarray(np.zeros(10), dtype=hcl.UInt(2))

    f(_A, _B)

    _A = _A.asnumpy()
    _B = _B.asnumpy()

    print(hcl.lower(s))

    # A 2-bit unsigned result keeps only the low two bits of a[i] + 1.
    for i in range(10):
        assert _B[i] == (a[i] + 1) % 4

IR:

module {
  func @top(%arg0: memref<10xi32>) -> memref<10xi2> attributes {extra_itypes = "u", extra_otypes = "s", llvm.emit_c_interface, top} {
    %0 = memref.alloc() {name = "B", unsigned} : memref<10xi2>
    affine.for %arg1 = 0 to 10 {
      %1 = affine.load %arg0[%arg1] {from = "compute_0", unsigned} : memref<10xi32>
      %c1_i32 = arith.constant 1 : i32
      %2 = arith.addi %1, %c1_i32 {unsigned} : i32
      %3 = arith.trunci %2 : i32 to i2
      affine.store %3, %0[%arg1] {to = "B"} : memref<10xi2>
    } {loop_name = "x", stage_name = "B"}
    return %0 : memref<10xi2>
  }
}

The output type hint (extra_otypes) should be "u" instead of "s", because B is downsized to hcl.UInt(2).
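To make the impact of the wrong hint concrete, here is a minimal NumPy sketch (not HeteroCL code) of how the same 2-bit pattern produced by arith.trunci reads back under an unsigned versus a signed interpretation. The helper names truncate_to_bits and as_signed are hypothetical and exist only for this illustration.

```python
import numpy as np

def truncate_to_bits(values, bits):
    """Keep only the low `bits` bits of each value, as an integer truncation does."""
    return np.asarray(values, dtype=np.int64) & ((1 << bits) - 1)

def as_signed(raw, bits):
    """Reinterpret raw `bits`-wide patterns as two's-complement signed values."""
    sign_bit = 1 << (bits - 1)
    return np.where(raw >= sign_bit, raw - (1 << bits), raw)

a = np.array([0, 1, 2, 3, 98])
raw = truncate_to_bits(a + 1, 2)   # low two bits of a + 1

unsigned_view = raw                # interpretation under hint "u"
signed_view = as_signed(raw, 2)    # interpretation under hint "s"

print(unsigned_view.tolist())      # [1, 2, 3, 0, 3], equal to (a + 1) % 4
print(signed_view.tolist())        # [1, -2, -1, 0, -1]
```

Only the unsigned reading matches the (a[i] + 1) % 4 check in the test above, so a consumer that trusts an "s" hint can decode different values from identical bits.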

zzzDavid commented 2 years ago

Some observations:

B's data type is assigned by hcl.quantize through algorithm.B, while the extra type hint attributes (extra_itypes and extra_otypes) are assigned by tracing tensor uses from input to output. For the type hints to be correct, algorithm.B and A.uses[0] must therefore refer to the same tensor.
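The dependency on object identity can be sketched in plain Python. Everything below is a hypothetical stand-in, not the HeteroCL implementation: the Tensor class, the string dtypes, the "int32" default, and the output_type_hint function are all assumptions chosen to mirror the description above.

```python
class Tensor:
    """Hypothetical stand-in for a frontend tensor (not the HeteroCL class)."""

    def __init__(self, name, dtype):
        self.name = name
        self.dtype = dtype   # e.g. "uint32", "int32", "uint2"
        self.uses = []       # tensors computed from this one


def output_type_hint(inputs):
    """Follow the first use of the first input until a tensor has no uses."""
    tensor = inputs[0]
    while tensor.uses:
        tensor = tensor.uses[0]
    return "u" if tensor.dtype.startswith("uint") else "s"


A = Tensor("A", "uint32")
B_traced = Tensor("B", "int32")   # the object reachable through A.uses
A.uses.append(B_traced)
B_handle = Tensor("B", "int32")   # a separate object, playing the role of algorithm.B

B_handle.dtype = "uint2"          # the downsize lands on the handle only
stale_hint = output_type_hint([A])

B_traced.dtype = "uint2"          # what happens when both names share one object
correct_hint = output_type_hint([A])

print(stale_hint, correct_hint)   # s u
```

In this sketch the traversal never visits B_handle, so the dtype change made through it cannot influence the hint.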

Currently, these two tensors appear to be different objects (the original comment showed this in a screenshot).

zzzDavid commented 2 years ago

This problem is caused by this line: https://github.com/chhzh123/heterocl/blob/30138f1eca6cdb81701a7ad3a70b7b2425d6c7cd/python/heterocl/mlir/tensor.py#L141

When hcl.compute is called, it first builds a Tensor instance, which in turn constructs a ComputeOp with None as the output keyword argument. Because ComputeOp is initialized with output=None, it builds another Tensor instance as its output, so a single stage ends up with two different Tensor instances.
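The construction pattern described above can be reproduced with two small classes. This is a simplified sketch under stated assumptions, not the code in tensor.py: the constructor signatures and the share_output flag are hypothetical, and passing the existing tensor as output is only one plausible way to avoid the duplicate; the linked commit may do it differently.

```python
class ComputeOp:
    """Hypothetical operation wrapper that owns an output tensor."""

    def __init__(self, name, output=None):
        self.name = name
        # With output=None, a second Tensor is created for the same stage.
        self.output = output if output is not None else Tensor(name, op=self)


class Tensor:
    """Hypothetical tensor that builds its own ComputeOp when none is given."""

    def __init__(self, name, op=None, share_output=False):
        self.name = name
        if op is None:
            # share_output=False mimics the reported behavior (output=None).
            op = ComputeOp(name, output=self if share_output else None)
        self.op = op


duplicated = Tensor("B")
print(duplicated.op.output is duplicated)   # False: two instances for one stage

shared = Tensor("B", share_output=True)
print(shared.op.output is shared)           # True: one instance for the stage
```

When the operation reuses the tensor that created it, a data type assigned through the user-facing handle is visible to any pass that reaches the stage through the operation's output.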

zzzDavid commented 2 years ago

Fixed by commit: https://github.com/chhzh123/heterocl/commit/1e6c3b0345aedf28b8602d49e5fdbf9801b986fc