nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0

[AIEVec] Allow unit extent dimensions in canonicalize-vector-for-aievec #807

Closed newling closed 1 month ago

newling commented 1 month ago

We run a pass, --canonicalize-vector-for-aievec, which (as the name suggests) massages vector dialect operations to make them lowerable to aievec. One of the things the pass does is 'flatten' vector.transfer_read operations. For example, it converts

%26 = vector.transfer_read %reinterpret_cast_1[%c0, %20, %24, %22, %c0], %cst {in_bounds = [true, true]} : memref<1x3x4x6x8xbf16>, vector<4x8xbf16>

which produces a rank-2 vector, to

...
%collapse_shape = memref.collapse_shape %reinterpret_cast_1 [[0], [1], [2], [3, 4]] : memref<1x3x4x6x8xbf16> into memref<1x3x4x48xbf16>
%27 = vector.transfer_read %collapse_shape[%c0, %20, %24, %26], %cst {in_bounds = [true]} : memref<1x3x4x48xbf16>, vector<32xbf16>
%28 = vector.shape_cast %27 : vector<32xbf16> to vector<4x8xbf16>
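To illustrate why the flattened form is equivalent, here is a small Python model of the index arithmetic (not compiler code). It assumes a row-major layout and that the elided lines compute the collapsed index as i3 * 8 + i4; the function names and the 6x8 test data are hypothetical.

```python
def collapsed_offset(i3, i4, inner_extent=8):
    """Offset in the collapsed [3, 4] dimension (extent 6 * 8 = 48)."""
    return i3 * inner_extent + i4


def rank2_read(buf, i3, rows=4, cols=8):
    """Model of the vector<4x8> read over a 6x8 slice, starting at (i3, 0)."""
    return [[buf[i3 + r][c] for c in range(cols)] for r in range(rows)]


def flat_read_then_cast(buf, i3, rows=4, cols=8):
    """Model of the vector<32> read on the collapsed slice plus the shape_cast."""
    flat = [x for row in buf for x in row]        # models memref.collapse_shape
    start = collapsed_offset(i3, 0, cols)
    vec = flat[start:start + rows * cols]         # models the rank-1 transfer_read
    return [vec[r * cols:(r + 1) * cols] for r in range(rows)]  # models shape_cast


# A 6x8 slice (the two innermost memref dims) filled with distinct values.
slice_6x8 = [[r * 8 + c for c in range(8)] for r in range(6)]
assert rank2_read(slice_6x8, 2) == flat_read_then_cast(slice_6x8, 2)
```

The two reads agree because the 4x8 vector spans the full innermost dimension, so its 32 elements are contiguous in memory.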

Subsequent compiler passes fail if any transfer_read (or transfer_write) has a vector type of rank > 1.

My current convolution pipeline uses the linalg-fold-unit-extent-dims pass with useRankReducingSlices = true (as suggested by @MaheshRavishankar, to simplify the compiler by avoiding collapse_shape/expand_shape). With this pipeline we get a transfer_read like

%61 = vector.transfer_read %reinterpret_cast_3[%c0, %c0, %c0, %c0, %c0], %cst_0 {in_bounds = [true, true], permutation_map = #map} : memref<1x1x4x1x4xf32>, vector<4x4xf32> 

where

#map = affine_map<(d0, d1, d2, d3, d4) -> (d2, d4)>

Before this PR, the pass --canonicalize-vector-for-aievec failed to convert this into a transfer_read with a rank-1 vector. This is because #map is not a "minor identity" affine map, which the pattern previously required: a minor identity map here would select the two innermost dimensions (d3, d4), whereas #map selects (d2, d4) and skips d3, which has extent 1. So this PR relaxes the constraint that the map must be a minor identity map, allowing unit dimensions to slip through.
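The difference between the strict and the relaxed condition can be sketched in Python. This is an illustrative model, not the pass's implementation: the function names are hypothetical, a map is represented simply as the list of dimension indices it selects, and the exact relaxed rule (skipped dimensions must have extent 1) is an assumption based on the description above.

```python
def is_minor_identity(num_dims, results):
    """True if `results` selects exactly the innermost dims, in order."""
    return results == list(range(num_dims - len(results), num_dims))


def is_minor_identity_modulo_unit_dims(shape, results):
    """Relaxed check: dims skipped at or after the first result must have extent 1."""
    # Results must be non-empty and strictly increasing (no permutation).
    if not results or any(a >= b for a, b in zip(results, results[1:])):
        return False
    skipped = [d for d in range(results[0], len(shape)) if d not in results]
    return all(shape[d] == 1 for d in skipped)


shape = [1, 1, 4, 1, 4]   # memref<1x1x4x1x4xf32>
results = [2, 4]          # (d0, d1, d2, d3, d4) -> (d2, d4)
print(is_minor_identity(len(shape), results))              # False
print(is_minor_identity_modulo_unit_dims(shape, results))  # True
```

Under this model the read from the example fails the strict check only because of d3, and passes once unit-extent dimensions are ignored.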

It uses the upstream pass to flatten reads and writes, and adds a new pattern to unflatten (unsqueeze) reads and writes whose permutation maps are not minor identity maps.
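One plausible reading of "unsqueeze" is sketched below in Python: unit dimensions are inserted into the vector type so that the permutation map becomes a minor identity, after which the usual flattening applies. This is an assumption about the approach, not the PR's actual pattern, and the function name and return format are hypothetical.

```python
import math


def unsqueeze_for_minor_identity(shape, results, vector_shape):
    """Return (unsqueezed vector shape, minor identity results, flat length).

    `shape` is the memref shape, `results` the dims selected by the
    permutation map, and `vector_shape` the shape of the vector read.
    """
    sizes = dict(zip(results, vector_shape))
    new_results = list(range(results[0], len(shape)))
    new_shape = []
    for d in new_results:
        if d in sizes:
            new_shape.append(sizes[d])   # dim already present in the vector
        elif shape[d] == 1:
            new_shape.append(1)          # insert the skipped unit dim
        else:
            raise ValueError(f"d{d} is skipped but has extent {shape[d]}")
    return new_shape, new_results, math.prod(new_shape)


# vector<4x4xf32> read from memref<1x1x4x1x4xf32> with map (d2, d4)
print(unsqueeze_for_minor_identity([1, 1, 4, 1, 4], [2, 4], [4, 4]))
# ([4, 1, 4], [2, 3, 4], 16)
```

In this model the vector<4x4xf32> read would become a vector<4x1x4xf32> read with a minor identity map over (d2, d3, d4), which can then be flattened to 16 elements.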

Unfortunately @jsetoain is going to be away for a while, so @makslevental would you mind reviewing this?