Xilinx / mlir-aie

An MLIR-based toolchain for AMD AI Engine-enabled devices.

Mat mul whole array implementation using tiler helper tools #1924

Closed hunhoffe closed 6 days ago

hunhoffe commented 1 week ago

The changes to mat mul whole array in https://github.com/Xilinx/mlir-aie/pull/1870 did not change the implementation of the design at all; they only added some hooks to help with visualization.

The implementation changes in this PR actually use the TensorTiler2D to generate offsets/sizes/strides in a new copy of the mat mul whole array design.

This is useful for comparing how the code looks in each approach. In addition, the B tiles produced by the tiler are functionally equivalent to the B tiles produced by the original design, but their sizes/strides are different. I plan to use this branch to benchmark whether that difference results in any performance change between the two implementations.
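To illustrate how two different sizes/strides descriptions can be functionally equivalent, here is a minimal sketch in plain Python. It does not use `TensorTiler2D` or any mlir-aie API: `access_order`, `desc_a`, and `desc_b` are hypothetical names, and the 4-dimensional sizes/strides values are made up for illustration rather than taken from either design. It assumes dimensions are listed outermost first and strides are in elements.

```python
from itertools import product


def access_order(offset, sizes, strides):
    """Expand an (offset, sizes, strides) description into the list of
    linear element indices it visits, iterating the outermost dimension first."""
    return [
        offset + sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(n) for n in sizes))
    ]


# Hypothetical example: a 4x8 tile inside a row-major matrix that is 16 elements wide.
# Description A: 4 rows of 8 contiguous elements.
desc_a = access_order(0, [1, 1, 4, 8], [0, 0, 16, 1])
# Description B: the same rows, each split into 2 chunks of 4 contiguous elements.
desc_b = access_order(0, [1, 4, 2, 4], [0, 16, 4, 1])

# Different sizes/strides, identical sequence of accessed elements.
assert desc_a == desc_b
```

Both descriptions visit the same elements in the same order, so they are interchangeable functionally. Whether they perform the same on hardware is a separate question, which is what the benchmark is meant to answer.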

Update: the benchmark sweep does not show a performance difference, so I am going to try to merge.