Xilinx / mlir-aie

An MLIR-based toolchain for AMD AI Engine-enabled devices.
Other
288 stars 82 forks source link

npu_dma_memcpy `offset` is wrong #1578

Closed andrej closed 2 months ago

andrej commented 3 months ago

Assume a microkernel that just writes 11 repeatedly into its output buffer, and that output buffer is transferred via fifo_c to the shim. Assume the data type of the output buffer is i8 and its length is 8 bytes.

For a data transfer like the following (pay attention to the offset):

aiex.npu.dma_memcpy_nd(0, 0, %arg0[0, 0, 0, 1][1, 1, 1, 8][0, 0, 0]) {id = 0 : i64, metadata = @fifo_c} : memref<8xi8>
//                                          ^

With offset=1, we would expect output like the following (assuming the output buffer is zero-initialized):

   0   11   11   11   11   11   11   11

Instead, we get:

  11   11   11   11   11   11   11   11

The results are identical (i.e. the offset seems to have no effect) for offset=1, offset=2, ... all the way up to offset=15. Once we set offset=16, we get:

   0    0    0    0   11   11   11   11 

At this point, we would expect

   0    0    0    0    0    0    0    0

as this offset is larger than the buffer, so the buffer should not be touched by the transfer.

Similar issues, although with different numbers, arise with i16. Once we use i32, everything works as expected. I suspect this is a rounding error somewhere, where the offset is erroneously multiplied by data type size then divided by four. I think offsets may need to be set in bytes.

I have attached a minimal example that should be easy to reproduce. Play with offset_c on line 21 in aie2.py.

minimal.zip

cc @pvasireddy-amd

andrej commented 3 months ago

From my initial tests it seems like the offset is indeed expressed in bytes, but the value has to be 4-byte aligned.

That is, an output like

   0   11   11   11   11   11   11   11

is not possible since that offset is not 4-byte aligned, but if you want to get

   0    0    0    0   11   11   11   11 

you can, by setting offset=4.

I don't know if there's documentation somewhere to back this up. (@jgmelber ?)

I'll make a PR with the fix once I added a test case.

stephenneuendorffer commented 3 months ago

I thought we had a validator for this case?

andrej commented 3 months ago

This validator? It currently does not seem to check offsets.

Also, I believe after #1538, these lines were forgotten to be updated. (It might be good to have a test case for these error messages.)

stephenneuendorffer commented 3 months ago

I see... seems like an oversight to me.. There should definitely be test cases for the error messages. While you're poking in there, the llvm::report_fatal_error's should be replaced with emitOpErrors...

andrej commented 3 months ago

Sounds good, will do

andrej commented 2 months ago

1580 fixed this