cornell-zhang / heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing
https://cornell-zhang.github.io/heterocl/
Apache License 2.0

Incorrect code generation when two schedules share the same input placeholder objects #388

Open paldebjit opened 3 years ago

paldebjit commented 3 years ago

Issue statement

When two or more schedules are derived (via hcl.create_schedule()) from the same algorithmic specification and share the same set of inputs, the contents of each earlier schedule are prepended to the schedules created after it. As a result, every backend generates incorrect code from the second schedule onward.

Root cause

An initial investigation shows that the problem comes from using the same placeholder objects as inputs to more than one schedule. A quick workaround is to create a separate set of placeholder objects for each schedule, giving corresponding placeholders the same name. This puts extra work on the user, so a more robust fix is needed.

How to reproduce?

$ python test_issue_1_1.py

import heterocl as hcl

def top_2mm(P=16, Q=22, R=18, S=24, alpha=0.1, beta=0.1, dtype=hcl.Float(), target=None):

    hcl.init(dtype)
    A = hcl.placeholder((P, Q), "A")
    B = hcl.placeholder((Q, R), "B")
    C = hcl.placeholder((R, S), "C")
    D = hcl.placeholder((P, S), "D")

    def kernel_2mm(A, B, C, D):

        r = hcl.reduce_axis(0, Q, "r")
        out_AB = hcl.compute((P, R), 
                         lambda x, y: hcl.sum(A[x, r] * B[r, y], 
                         axis=r, 
                         dtype=dtype
                         ), 
                         name="out_AB"
                         )

        k = hcl.reduce_axis(0, R, "k")
        out_ABC = hcl.compute((P, S), 
                         lambda x, y: hcl.sum(out_AB[x, k] * C[k, y], 
                         axis=k, 
                         dtype=dtype
                         ), 
                         name="out_ABC"
                         )
        hcl.update(D,
                   lambda x, y: (alpha * out_ABC[x, y] + beta * D[x, y]),
                   name="D"
                   )

    # Both schedules are created from the same placeholder objects A, B, C, D.
    s_orig = hcl.create_schedule([A, B, C, D], kernel_2mm)
    s_transfo = hcl.create_schedule([A, B, C, D], kernel_2mm)

    print(hcl.build(s_orig, target=target))
    print(hcl.build(s_transfo, target=target))

f = top_2mm(target="vhls")

This builds two schedules, s_orig and s_transfo, from the same kernel. Because s_transfo is created second and shares the placeholder objects A, B, C, D with s_orig, the contents of s_orig are prepended to s_transfo. The generated VHLS code (printed to the terminal) confirms this.
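The symptom can also be checked mechanically instead of by reading the terminal output. The sketch below assumes that hcl.build(..., target="vhls") returns the generated code as a string (the script above prints it) and that two untransformed schedules of the same kernel should produce the same text. diagnose_builds is a hypothetical helper name, not part of HeteroCL.

```python
def diagnose_builds(first_code, second_code):
    """Compare the code generated for two schedules of the same kernel.

    Hypothetical helper, not part of HeteroCL. Both arguments are assumed
    to be the strings returned by hcl.build() for s_orig and s_transfo.
    """
    first_lines = first_code.strip().splitlines()
    second_lines = second_code.strip().splitlines()
    return {
        # True when both builds produced the same text, line for line.
        "identical": first_lines == second_lines,
        # Positive when the second build is longer than the first.
        "extra_lines": len(second_lines) - len(first_lines),
    }
```

Inside top_2mm, the two print calls could be replaced by print(diagnose_builds(hcl.build(s_orig, target=target), hcl.build(s_transfo, target=target))). If s_orig really is prepended to s_transfo, the report should show the builds as not identical with a positive extra_lines count, though the exact number depends on the backend.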

$ python test_issue_1_2.py

import heterocl as hcl

def top_2mm(P=16, Q=22, R=18, S=24, alpha=0.1, beta=0.1, dtype=hcl.Float(), target=None):

    hcl.init(dtype)
    A1 = hcl.placeholder((P, Q), "A")
    B1 = hcl.placeholder((Q, R), "B")
    C1 = hcl.placeholder((R, S), "C")
    D1 = hcl.placeholder((P, S), "D")

    A2 = hcl.placeholder((P, Q), "A")
    B2 = hcl.placeholder((Q, R), "B")
    C2 = hcl.placeholder((R, S), "C")
    D2 = hcl.placeholder((P, S), "D")

    def kernel_2mm(A, B, C, D):

        r = hcl.reduce_axis(0, Q, "r")
        out_AB = hcl.compute((P, R), 
                         lambda x, y: hcl.sum(A[x, r] * B[r, y], 
                         axis=r, 
                         dtype=dtype
                         ), 
                         name="out_AB"
                         )

        k = hcl.reduce_axis(0, R, "k")
        out_ABC = hcl.compute((P, S), 
                         lambda x, y: hcl.sum(out_AB[x, k] * C[k, y], 
                         axis=k, 
                         dtype=dtype
                         ), 
                         name="out_ABC"
                         )
        hcl.update(D,
                   lambda x, y: (alpha * out_ABC[x, y] + beta * D[x, y]),
                   name="D"
                   )

    # Each schedule gets its own set of placeholder objects (same names, distinct objects).
    s_orig = hcl.create_schedule([A1, B1, C1, D1], kernel_2mm)
    s_transfo = hcl.create_schedule([A2, B2, C2, D2], kernel_2mm)

    print(hcl.build(s_orig, target=target))
    print(hcl.build(s_transfo, target=target))

f = top_2mm(target="vhls")

This version does not show the problem, because s_orig and s_transfo no longer share placeholder objects.
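To reduce the extra work the workaround requires, the duplicated placeholder declarations can be folded into a small factory that is called once per schedule. This is only a sketch of the workaround, not a fix for the underlying issue. fresh_inputs and specs are hypothetical names, and the helper assumes only the (shape, name) calling convention of hcl.placeholder used in the scripts above.

```python
def fresh_inputs(make_placeholder, specs):
    """Return a new list of input objects, one per (shape, name) pair.

    Hypothetical helper, not part of HeteroCL. make_placeholder is any
    callable taking (shape, name), such as hcl.placeholder as used above.
    Each call creates new objects, so no two schedules share inputs.
    """
    return [make_placeholder(shape, name) for shape, name in specs]


# Intended use inside top_2mm (requires heterocl, so left as comments):
#   specs = [((P, Q), "A"), ((Q, R), "B"), ((R, S), "C"), ((P, S), "D")]
#   s_orig = hcl.create_schedule(fresh_inputs(hcl.placeholder, specs), kernel_2mm)
#   s_transfo = hcl.create_schedule(fresh_inputs(hcl.placeholder, specs), kernel_2mm)
```

Passing the factory as an argument keeps the helper independent of HeteroCL itself, so it can be exercised with any stand-in callable.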