Open ZihengJiang opened 3 years ago
@Hzfengsy It is a bit weird that an error is thrown in the lowered TIR. Could you take a look at whether it is related to our buffer flatten work?
@ZihengJiang would you like to provide the IR being scheduled? Thanks a lot!
Here it is:
@tvm.script.tir
def dyn_mm(a: ty.handle, b: ty.handle, c: ty.handle, M: ty.int32, N: ty.int32) -> None:
    A = tir.match_buffer(a, (M, 1024), "float32")
    B = tir.match_buffer(b, (1024, N), "float32")
    C = tir.match_buffer(c, (M, N), "float32")
    with tir.block([M, N, tir.reduce_axis(0, 1024)], "matmul") as [vi, vj, vk]:
        with tir.init():
            C[vi, vj] = 0.0
        C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vk, vj]
It is a bug about the predicate. Looking at the scheduled TIR, i0_outer_inner is used in the first block_realize but defined after it. I admit that we did not carefully consider predicates during scheduling. It would be great if you could find the exact primitive that causes the problem. Sorry for the trouble.
Looking again, I think this is due to the reduction split. We need to detect the predicates related to the loops of the init and remove the predicates that touch the reduction var (which is not in the init). If a predicate touches both, we cannot do the reduction split.
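To make the rule above concrete, here is a minimal sketch in plain Python. It is not TVM code and does not reflect how the scheduler is actually implemented: the function name `init_predicates`, the predicate representation (a text plus the set of loop variables it references), and the loop names in the example are all assumptions chosen for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: (predicate text, loop vars it references).
Predicate = Tuple[str, FrozenSet[str]]


def init_predicates(
    predicates: List[Predicate], reduction_vars: FrozenSet[str]
) -> List[str]:
    """Return the predicates that may guard the init block.

    Predicates that only touch reduction loops are dropped, because those
    loops do not enclose the init. A predicate that mixes reduction and
    non-reduction loops cannot be separated, so the split is rejected.
    """
    kept = []
    for text, used in predicates:
        touches_reduce = bool(used & reduction_vars)
        touches_other = bool(used - reduction_vars)
        if touches_reduce and touches_other:
            raise ValueError(f"cannot split reduction: mixed predicate {text!r}")
        if not touches_reduce:
            kept.append(text)
    return kept


# Illustrative loop names and split factors (assumed, not from the issue).
reduce_loops = frozenset({"k_outer", "k_inner"})
preds = [
    ("i0_outer * 32 + i0_inner < M", frozenset({"i0_outer", "i0_inner"})),
    ("k_outer * 16 + k_inner < 1024", frozenset({"k_outer", "k_inner"})),
]
kept = init_predicates(preds, reduce_loops)
```

In this example only the predicate over the i0 loops survives for the init block, while a predicate such as one referencing both i0_inner and k_inner would raise instead of being silently kept.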
Is this bug fixed on mainline?
I found that the lowering procedure fails when some itervars' extent is 1.
The IR is:
Schedule is:
Error message: