Vectorized matmul performance regression - transfer reads

nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator

Apache License 2.0

Closed: jtuyls closed this issue 2 weeks ago

jtuyls commented 2 weeks ago

We're seeing a performance regression on vectorized matmul, likely caused by this PR: https://github.com/nod-ai/iree-amd-aie/pull/867. See the table below:

Matmul problem size: 512x512x4096 (MxKxN)
Array configuration: 2x2
Vectorization or ukernel or scalar: Vectorization

Commit	Latency (us)
2086718	42513
fded307	20101
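
To put the two numbers in perspective, here is a minimal sketch that computes the slowdown factor and an approximate throughput for each commit. The latencies and problem size are copied from this issue. The FLOP count assumes 2*M*K*N operations (one multiply and one add per multiply-accumulate), which is a common convention and not something stated in the thread. The helper names are hypothetical.

```python
# Latencies in microseconds, copied from the table above, keyed by commit hash.
latency_us = {"2086718": 42513, "fded307": 20101}

# Problem size from the issue: 512x512x4096 (MxKxN).
M, K, N = 512, 512, 4096

# Assumption: each multiply-accumulate counts as 2 floating-point operations.
flops = 2 * M * K * N


def slowdown(slow_us: float, fast_us: float) -> float:
    """Ratio of the slower latency to the faster one."""
    return slow_us / fast_us


def gflops_per_s(lat_us: float) -> float:
    """Approximate throughput in GFLOP/s for one matmul at the given latency."""
    seconds = lat_us * 1e-6
    return flops / seconds / 1e9


ratio = slowdown(max(latency_us.values()), min(latency_us.values()))
print(f"slowdown: {ratio:.2f}x")
for commit, us in latency_us.items():
    print(f"{commit}: {us} us, about {gflops_per_s(us):.1f} GFLOP/s")
```

The ratio works out to roughly 2.1x, meaning the slower commit takes a little more than twice as long on this problem size.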

@newling

newling commented 2 weeks ago

Ok, thanks for triaging @jtuyls. I can definitely believe that #867 introduced the regression. I'll take a look this morning.

newling commented 2 weeks ago

It's a bit concerning that the regression was so large; it suggests there'll be more work than I expected to get convolution performance up. Future problem.