Xilinx / mlir-aie

An MLIR-based toolchain for AMD AI Engine-enabled devices.

Make aiex.npu.dma_memcpy_nd d0 stride explicit #1584

Closed fifield closed 3 months ago

fifield commented 3 months ago

The stride of the innermost dimension (d0) is implicitly assumed to be one for npu.dma_memcpy_nd. This PR makes it an explicit operand.

fifield commented 3 months ago

This does not seem to get along with #1580. It produces errors like: error: unknown: 'aiex.npu.dma_memcpy_nd' op Stride 0 is 1 elements * 1 bytes = 1 bytes, which is not divisible by 4.

resolved.
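
For readers puzzling over that message, the arithmetic it reports can be reproduced with a minimal sketch. The function name `stride_error` and the `granularity` parameter are hypothetical, and the 4-byte requirement is read off the error text only. This is not the verifier's actual code.

```python
def stride_error(dim, stride_elems, elem_bytes, granularity=4):
    """Return an error string if the stride in bytes is not a multiple of
    `granularity`, else None. The 4-byte default is an assumption taken
    from the error message quoted above."""
    stride_bytes = stride_elems * elem_bytes
    if stride_bytes % granularity == 0:
        return None
    return (
        f"Stride {dim} is {stride_elems} elements * {elem_bytes} bytes "
        f"= {stride_bytes} bytes, which is not divisible by {granularity}."
    )


# An explicit d0 stride of 1 on a 1-byte element type trips the check.
message = stride_error(dim=0, stride_elems=1, elem_bytes=1)
# The same stride on a 4-byte element type passes.
ok = stride_error(dim=0, stride_elems=1, elem_bytes=4)
```

Under this reading, making the inner stride an explicit 1 exposes it to a byte-granularity check that the implicit stride never reached, which would explain the conflict on narrow element types.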

fifield commented 3 months ago

Would it be hard to make this so the lowest stride is not fixed to be 1? There would be use cases for that as well, e.g. to transpose an MxN matrix: sizes=[N, M], strides=[1, N].

Yes, this is still WIP. Step 1 was to turn the implicit 1 into an explicit 1 without breaking anything. Step 2 is to clean up and make sure other strides work.
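
To illustrate the transpose use case raised above, here is a minimal pure-Python sketch of the offsets such an access pattern would visit. It assumes a row-major MxN buffer and that sizes and strides are listed outermost dimension first, so the innermost stride is N rather than 1. The helper names are hypothetical and this models only the addressing, not the actual op or hardware.

```python
from itertools import product


def access_offsets(sizes, strides):
    """Element offsets visited by a strided access pattern.

    Assumption: sizes and strides are listed outermost dimension first,
    so the last entry is the innermost (fastest-varying) dimension.
    """
    return [
        sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(n) for n in sizes))
    ]


def transpose_via_strides(flat, m, n):
    """Read a row-major m x n buffer in transposed order using
    sizes=[n, m], strides=[1, n]."""
    return [flat[o] for o in access_offsets([n, m], [1, n])]


M, N = 2, 3
matrix = [0, 1, 2,
          3, 4, 5]  # row-major 2x3
transposed = transpose_via_strides(matrix, M, N)
# transposed == [0, 3, 1, 4, 2, 5], i.e. rows [0, 3], [1, 4], [2, 5]
```

The inner loop steps by N elements (down a column) and the outer loop steps by 1 (across columns), which is why this pattern cannot be expressed while the lowest stride is fixed at 1.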

fifield commented 3 months ago

The actions seem to be stuck, closing.

andrej commented 3 months ago

I believe there was a GitHub outage yesterday. It seems fixed now according to githubstatus.com. I guess we run the actions locally on our server, so it is unlikely that the outage was why the actions got stuck. But it may still be worth a retry.