nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0

Enable 4x4 AIE cores for matmul #920

Open yzhang93 opened 11 hours ago

Change the tiling strategy to use a 4x4 AIE core array for the pack-peel pipeline.
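
For context, here is a minimal sketch (not the code in this PR) of what the strategy change means, assuming the first tiling level splits the matmul M and N loops evenly across the core array; the struct and function names are illustrative:

```cpp
// Illustrative only: per-core first-level tile when M and N are distributed
// over a rows x cols AIE core array. The actual pack-peel tile-size logic in
// this repo is more involved.
#include <cstdio>

struct TileShape { long long m, n, k; };

TileShape perCoreTile(long long M, long long N, long long K,
                      int rows, int cols) {
  // K is not distributed across cores at this level in this sketch.
  return {M / rows, N / cols, K};
}

int main() {
  // The same 512x512x512 matmul on a 2x2 vs. a 4x4 array.
  TileShape t2 = perCoreTile(512, 512, 512, 2, 2);  //  4 cores, 256x256 each
  TileShape t4 = perCoreTile(512, 512, 512, 4, 4);  // 16 cores, 128x128 each
  std::printf("2x2: %lld x %lld x %lld per core\n", t2.m, t2.n, t2.k);
  std::printf("4x4: %lld x %lld x %lld per core\n", t4.m, t4.n, t4.k);
  return 0;
}
```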

Currently, some tests fail with these changes:

  1. The small matmul 8x32x16 failed with
    <unknown>:0: error: 'amdaie.logicalobjectfifo.from_memref' op should have at least one tile candidate
    <unknown>:0: note: see current operation: %18 = "amdaie.logicalobjectfifo.from_memref"(%17) : (memref<1x1x8x16xi32, 1 : i32>) -> !amdaie.logicalobjectfifo<memref<1x1x8x16xi32, 1 : i32>>

    Note: 16x32x16 works, so I temporarily changed one failing test to this shape; see the first sketch after this list for a possible explanation.

    The matmul_truncf test seems to have failed for the same reason, ref https://github.com/nod-ai/iree-amd-aie/actions/runs/11964433065/job/33356990462?pr=920

  2. All packet flow tests failed with

    <unknown>:0: error: 'amdaie.flow' op ran out of packet IDs to assign
    <unknown>:0: note: see current operation: %503 = "amdaie.flow"(%501, %502) <{is_packet_flow = true, operandSegmentSizes = array<i32: 1, 1>}> : (index, index) -> index

    These tests are temporarily disabled until a fix lands; see the second sketch after this list.

  3. The AIR pipeline failed all e2e tests, so I keep the 2x2 array setting for the AIR pipeline until @erwei-xilinx fixes the issue.

  4. Chess tests failed in CI, ref https://github.com/nod-ai/iree-amd-aie/actions/runs/11964433065/job/33356990681?pr=920.
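
A possible reading of failure 1 (my assumption, not a confirmed root cause): splitting M = 8 across the 4 core rows leaves each row only 2 output rows, which may be too small for any tile assignment to qualify as a candidate, whereas M = 16 gives 4 rows per core. A toy check, where the minimum L1 tile size is a made-up parameter:

```cpp
// Toy divisibility/size check. kMinTileM is hypothetical, not a value taken
// from this repo; the real tile-candidate logic lives in the
// logicalobjectfifo assignment pass and is more involved.
#include <cstdio>
#include <initializer_list>

// True if every core row receives at least minTileM rows of the output.
bool rowsHaveValidTiles(int M, int coreRows, int minTileM) {
  return M % coreRows == 0 && M / coreRows >= minTileM;
}

int main() {
  const int kCoreRows = 4;  // the new 4x4 array
  const int kMinTileM = 4;  // assumed minimum rows per L1 tile
  for (int M : {8, 16}) {
    std::printf("M=%d: %s\n", M,
                rowsHaveValidTiles(M, kCoreRows, kMinTileM)
                    ? "every core row gets a valid tile"
                    : "no tile candidate");
  }
  return 0;
}
```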
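
For failure 2: AIE packet IDs come from a small fixed pool (I am assuming a 5-bit ID space, i.e. 32 distinct IDs), and a 4x4 array creates roughly four times as many flows as a 2x2 array, so a naive one-ID-per-flow assignment can exhaust the pool. Below is a minimal allocator sketch that reproduces that failure mode; the real pass may reuse IDs across non-conflicting flows:

```cpp
// Minimal packet-ID allocator sketch. The pool size of 32 is my assumption
// about the hardware limit, and one-ID-per-flow is a deliberate
// simplification of what the amdaie.flow assignment actually does.
#include <cstdio>
#include <optional>

class PacketIdAllocator {
  int next_ = 0;
  static constexpr int kMaxIds = 32;  // assumed 5-bit packet ID space
public:
  std::optional<int> allocate() {
    if (next_ >= kMaxIds) return std::nullopt;  // "ran out of packet IDs"
    return next_++;
  }
};

int main() {
  PacketIdAllocator alloc;
  // 16 cores with, say, 3 packet flows each (two inputs, one output) = 48.
  const int kFlows = 4 * 4 * 3;
  for (int f = 0; f < kFlows; ++f) {
    if (!alloc.allocate()) {
      std::printf("flow %d: ran out of packet IDs to assign\n", f);
      return 1;
    }
  }
  std::printf("all %d flows got IDs\n", kFlows);
  return 0;
}
```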

In addition, the tiling strategy needs some refactoring. I would like to do that in a follow-up PR, together with a better way to select the L1 tile sizes.