Closed: jtuyls closed this 2 months ago
Added initial support with https://github.com/nod-ai/iree-amd-aie/pull/512. That PR does not yet add the new DMA loop subsumption pass to the objectFifo lowering pipeline, because making it work end to end (E2E) requires changes to how BD IDs are assigned to amdaie.npu.dma_memcpy_nd operations. Previously, all operations used id == 0, which is only valid if the control code is executed synchronously, operation by operation (every DMA operation directly followed by a wait). As this is no longer the case, new logic is needed to assign IDs to the amdaie.npu.dma_memcpy_nd operations. This will be addressed in a follow-up PR.
Done with:
DMA loop iteration subsumption tries to move `scf.for` loops inside the DMA operations by updating the DMA access patterns and hoisting the DMA operations out of the loop. There are a couple of reasons for needing this:

Example input IR for this transformation:
Expected output IR:
This results in a single programming of the DMAs needed to implement this `amdaie.circular_dma_cpy_nd` operation, and no control code is needed on the uController side to update the DMAs.
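
To make the idea concrete, here is a minimal Python sketch of the address arithmetic behind loop subsumption. It is not the pass implementation. It assumes the DMA offset depends linearly on the loop induction variable (`offset = base + coeff * iv`) and models an access pattern as a base offset plus per-dimension sizes and strides. All function names are hypothetical, and hardware limits on the number of dimensions or on the maximum size and stride, which a real pass would have to check, are ignored.

```python
from itertools import product


def addresses(base, sizes, strides):
    """Linear addresses touched by an N-d access pattern (outermost dim slowest)."""
    return [
        base + sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(n) for n in sizes))
    ]


def looped_addresses(lb, ub, step, coeff, base, sizes, strides):
    """Reference behaviour: one DMA per loop iteration, offset = base + coeff * iv."""
    out = []
    for iv in range(lb, ub, step):
        out.extend(addresses(base + coeff * iv, sizes, strides))
    return out


def subsume_loop(lb, ub, step, coeff, base, sizes, strides):
    """Fold the loop into a new outermost dimension of the access pattern.

    Assumption: the offset is linear in the induction variable, so the loop
    becomes a dimension with size = trip count and stride = coeff * step.
    """
    trip_count = len(range(lb, ub, step))
    new_base = base + coeff * lb
    return new_base, [trip_count] + list(sizes), [coeff * step] + list(strides)


# A loop `for iv in range(0, 6)` around a 4x8 transfer at offset iv * 32
# becomes a single 6x4x8 transfer with strides [32, 8, 1].
new_base, new_sizes, new_strides = subsume_loop(0, 6, 1, 32, 0, [4, 8], [8, 1])
same = addresses(new_base, new_sizes, new_strides) == looped_addresses(
    0, 6, 1, 32, 0, [4, 8], [8, 1]
)
```

The single subsumed pattern touches the same addresses in the same order as the looped version, which is why the DMA only needs to be programmed once.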