cornell-zhang / heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing
https://cornell-zhang.github.io/heterocl/
Apache License 2.0

Loop fusion after split fails to work #223

Open chhzh123 opened 4 years ago

chhzh123 commented 4 years ago

In the following example, I split the loop of B so that it has the same shape as the loop nest of C, and then try to fuse the two stages with compute_at.

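import heterocl as hcl

A = hcl.placeholder((64,), "A")

def kernel(A):
    # B is a 1-D copy of A; C reads both A and B through a flattened index.
    B = hcl.compute((64,), lambda x: A[x], "B")
    C = hcl.compute((8, 8), lambda x, y: A[x * 8 + y] + B[x * 8 + y], "C")
    return C

s = hcl.create_schedule([A], kernel)
kernel_B = kernel.B
kernel_C = kernel.C
# Split B's single 64-iteration loop into an outer and inner loop (factor 8).
x_out, x_in = s[kernel_B].split(kernel_B.axis[0], 8)
# Compute B inside C's inner loop.
s[kernel_B].compute_at(s[kernel_C], kernel_C.axis[1])
print(hcl.lower(s))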

However, this causes a segmentation fault.

Other variants work: using a compute function such as A[x*8+y]+A[x*8+y] for C, or attaching B at kernel_C.axis[0] instead of kernel_C.axis[1]. So the problem may come from incorrect axis access when scheduling.
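
To make the intended transformation concrete, here is a plain-Python sketch (no HeteroCL involved) of the loop nest one would expect after the split and compute_at. It is only an assumption about the desired result, not output produced by hcl.lower; the function names reference and fused are made up for illustration.

```python
def reference(A):
    """Unfused version: compute all of B first, then all of C."""
    B = [A[x] for x in range(64)]
    C = [[A[x * 8 + y] + B[x * 8 + y] for y in range(8)] for x in range(8)]
    return C


def fused(A):
    """Assumed fused version: B's split loops (x_out, x_in) share C's (x, y) nest."""
    B = [0] * 64
    C = [[0] * 8 for _ in range(8)]
    for x in range(8):        # plays the role of C.axis[0] and B's outer loop
        for y in range(8):    # plays the role of C.axis[1] and B's inner loop
            i = x * 8 + y     # flattened index recovered from the split loops
            B[i] = A[i]       # B is produced right before C consumes it
            C[x][y] = A[i] + B[i]
    return C


A = list(range(64))
assert fused(A) == reference(A)
```

Because the split factor (8) matches the inner extent of C, each (x, y) iteration of C needs exactly one element of B, which is why fusing at the inner axis looks like it should be legal.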