Closed newling closed 2 months ago
The logic I implemented before was not what we wanted: we don't mind having 3+ tile sizes greater than 1; it's when there are 3+ loop counts greater than 1 that we are in trouble when targeting the AIE array. For example,
scf.forall ... (0, 0, 0, 0) to (1, 64, 12, 12) step (1, 32, 12, 12)
is fine at the thread level, as there is only 1 dimension with a loop count greater than 1 (the second, with 64 / 32 = 2 iterations). On the other hand,
scf.forall ... (0, 0, 0, 0) to (1, 64, 12, 12) step (1, 1, 1, 1)
lands us in trouble: there are 3 dimensions with loop count greater than 1, even though all the tiles are size 1.
This PR fixes the logic.
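To make the distinction concrete, here is a minimal Python sketch of the check described above, applied to the two `scf.forall` examples. It is not the code in this PR: the function names and the `max_dims=2` default are hypothetical, with the limit inferred from the "3+" wording, and the loop count is assumed to be the ceiling of `(upper - lower) / step`.

```python
def loop_counts(lower, upper, step):
    """Number of iterations per dimension: ceil((upper - lower) / step)."""
    # -(-a // b) is integer ceiling division for positive b.
    return [-(-(ub - lb) // s) for lb, ub, s in zip(lower, upper, step)]


def num_nontrivial_dims(lower, upper, step):
    """Count the dimensions whose loop count is greater than 1."""
    return sum(1 for count in loop_counts(lower, upper, step) if count > 1)


def exceeds_limit(lower, upper, step, max_dims=2):
    """True if more than max_dims dimensions have a loop count > 1.

    max_dims=2 is an assumption taken from the "3+" wording in the text.
    """
    return num_nontrivial_dims(lower, upper, step) > max_dims


lower = (0, 0, 0, 0)
upper = (1, 64, 12, 12)

# Large tiles: only the second dimension iterates more than once.
print(loop_counts(lower, upper, (1, 32, 12, 12)))
print(exceeds_limit(lower, upper, (1, 32, 12, 12)))

# Unit tiles: three dimensions iterate more than once.
print(loop_counts(lower, upper, (1, 1, 1, 1)))
print(exceeds_limit(lower, upper, (1, 1, 1, 1)))
```

The check depends only on the loop counts, never on the tile sizes themselves, which is the point of the fix.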