Xilinx / mlir-air

Single Core DMA/Channel Matrix Scalar Add Examples Broken #623

Closed. hunhoffe closed this issue 3 months ago

hunhoffe commented 3 months ago

I'm working on my draft PR https://github.com/Xilinx/mlir-air/pull/621

I rebased my branch onto main after https://github.com/Xilinx/mlir-air/pull/620 was merged. Since that rebase, the two examples that previously worked (single_core_dma and single_core_channel) no longer work.

You can reproduce the working versions on branch debugging-matrix-scalar-add (which diverges from main at bcbfed5c rather than at the current HEAD, 45592176) with:

cd programming_examples/matrix_scalar_add/single_core_dma
make

and

cd programming_examples/matrix_scalar_add/single_core_channel
make

You can reproduce the failures on branch minimal-matrix-scalar-add with the same commands.

For the single_core_dma example, the files aie.air.mlir and placed.air.mlir are identical between the passing and failing builds. The file npu.air.mlir differs as follows (lines marked < come from the broken build, lines marked > from the working build):

$ diff broken_single_core_dma_build/air_project/npu.air.mlir working_single_core_dma_build/air_project/npu.air.mlir 
63,64c63,64
<       aiex.npu.dma_memcpy_nd(0, 0, %arg0[0, 0, 0, 512][1, 1, 8, 16][0, 0, 32]) {id = 2 : i64, metadata = @airMemcpyId3} : memref<32x16xi32>
<       aiex.npu.dma_memcpy_nd(0, 0, %arg0[0, 0, 0, 528][1, 1, 8, 16][0, 0, 32]) {id = 3 : i64, metadata = @airMemcpyId3} : memref<32x16xi32>
---
>       aiex.npu.dma_memcpy_nd(0, 0, %arg0[0, 0, 16, 0][1, 1, 8, 16][0, 0, 32]) {id = 2 : i64, metadata = @airMemcpyId3} : memref<32x16xi32>
>       aiex.npu.dma_memcpy_nd(0, 0, %arg0[0, 0, 16, 16][1, 1, 8, 16][0, 0, 32]) {id = 3 : i64, metadata = @airMemcpyId3} : memref<32x16xi32>
67,68c67,68
<       aiex.npu.dma_memcpy_nd(0, 0, %arg1[0, 0, 0, 512][1, 1, 8, 16][0, 0, 32]) {id = 6 : i64, metadata = @airMemcpyId4} : memref<32x16xi32>
<       aiex.npu.dma_memcpy_nd(0, 0, %arg1[0, 0, 0, 528][1, 1, 8, 16][0, 0, 32]) {id = 7 : i64, metadata = @airMemcpyId4} : memref<32x16xi32>
---
>       aiex.npu.dma_memcpy_nd(0, 0, %arg1[0, 0, 16, 0][1, 1, 8, 16][0, 0, 32]) {id = 6 : i64, metadata = @airMemcpyId4} : memref<32x16xi32>
>       aiex.npu.dma_memcpy_nd(0, 0, %arg1[0, 0, 16, 16][1, 1, 8, 16][0, 0, 32]) {id = 7 : i64, metadata = @airMemcpyId4} : memref<32x16xi32>

The diff for the single_core_channel example is essentially the same.
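
One way to read the diff is to compare the flat element offset each offset list would address. The sketch below does that arithmetic under an assumed convention that the thread does not confirm: offsets are listed outermost-first, the three listed strides belong to the three outer dimensions, and the innermost dimension has an implicit stride of 1. The helper name `linear_offset` is hypothetical and is not part of mlir-air or mlir-aie.

```python
# Assumed convention (NOT confirmed by this thread): offsets are listed
# outermost-first, the three listed strides apply to the three outer
# dimensions, and the innermost dimension has an implicit stride of 1.


def linear_offset(offsets, outer_strides):
    """Return the flat element offset addressed by a 4-D offset list."""
    strides = list(outer_strides) + [1]  # append the implicit innermost stride
    if len(offsets) != len(strides):
        raise ValueError("expected exactly one more offset than listed strides")
    return sum(o * s for o, s in zip(offsets, strides))


STRIDES = [0, 0, 32]  # copied from the dma_memcpy_nd ops in the diff

# (broken build, working build) offset lists, copied from the diff
PAIRS = [
    ([0, 0, 0, 512], [0, 0, 16, 0]),   # ids 2 and 6
    ([0, 0, 0, 528], [0, 0, 16, 16]),  # ids 3 and 7
]

for broken, working in PAIRS:
    print(
        f"broken {broken} -> {linear_offset(broken, STRIDES)}, "
        f"working {working} -> {linear_offset(working, STRIDES)}"
    )
```

If that assumption holds, each broken/working pair resolves to the same flat offset, which would suggest the two builds differ in how the offset is split across dimensions rather than in which element it points at. This is only a reading of the diff, not a confirmed diagnosis.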

Let me know if more information is needed!

erwei-xilinx commented 3 months ago

Thanks for investigating the source of the issue. I opened a PR that reverts some of the changes causing it: https://github.com/Xilinx/mlir-air/pull/626

hunhoffe commented 3 months ago

Fixed, thank you!