nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0

[AMDAIETileAndFuse] Fix logic choosing thread/block dimensions and failure mode #758

Closed (newling closed 2 months ago)

newling commented 2 months ago

The logic I implemented before was not what we wanted. We don't mind having 3+ tile sizes greater than 1; it is when 3+ loop counts are greater than 1 that we are in trouble when targeting the AIE array. For example,

scf.forall ... (0, 0, 0, 0) to (1, 64, 12, 12) step (1, 32, 12, 12)

is fine at the thread level, as there is only 1 dimension with a loop count greater than 1 (the second dimension, with 64 / 32 = 2 iterations). On the other hand,

scf.forall ... (0, 0, 0, 0) to (1, 64, 12, 12) step (1, 1, 1, 1)

lands us in trouble: there are 3 dimensions with a loop count greater than 1 (64, 12, and 12 iterations), even though all the tile sizes are 1.
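The distinction between tile sizes and loop counts in the two examples above can be sketched in a few lines of Python. This is only an illustration, not the code in the pass: the helper names `loop_counts` and `num_nontrivial_loops` are hypothetical, and the limit of 2 is an assumption read off the "3+ loop counts" wording above.

```python
# Illustrative sketch only; names and the limit below are assumptions,
# not taken from the AMDAIETileAndFuse implementation.

MAX_NONTRIVIAL_LOOPS = 2  # assumed: "3+ loop counts greater than 1" is the failure case


def loop_counts(lower, upper, step):
    """Per-dimension iteration count of a forall: ceil((upper - lower) / step)."""
    return [max(0, -(-(ub - lb) // s)) for lb, ub, s in zip(lower, upper, step)]


def num_nontrivial_loops(lower, upper, step):
    """Number of dimensions that iterate more than once."""
    return sum(1 for count in loop_counts(lower, upper, step) if count > 1)


lower = (0, 0, 0, 0)
upper = (1, 64, 12, 12)

# Large tiles, but only one dimension actually loops.
coarse = num_nontrivial_loops(lower, upper, (1, 32, 12, 12))

# All tile sizes are 1, yet three dimensions loop.
fine = num_nontrivial_loops(lower, upper, (1, 1, 1, 1))

print(coarse, coarse <= MAX_NONTRIVIAL_LOOPS)
print(fine, fine <= MAX_NONTRIVIAL_LOOPS)
```

The check depends on the step only through the resulting iteration counts, which is why counting tile sizes greater than 1 gives the wrong answer for both examples.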

This PR fixes the logic.