Question: I hit cases where matmul N dimension is too narrow:
[cpu-materialize-encoding]: tile (16, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (8, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (4, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (2, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (1, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
In such cases, do we assume the input is illegal? because we cannot find legal tiles
In SetEncoding for matmuls, when the M or N dimension is small, we may set the matmul_narrow_{M,N} attribute on the EncodingAttr, which has the effect of limiting padding in that dimension. Then, later, in MaterializeEncoding, we may transpose the matmul to reduce the matmul_narrow_ case to transpose_narrow_M, and we select an appropriate tile size for the small M value. Then that allows picking up a corresponding narrow-M code path in the mmt4d ukernel. That tile size is only narrow in the M dimension, it is generic large in the N dimension.
But what happens if both matmul_narrow_{M,N} are set? Then, we have limited padding in both dimensions. The situation is still the same after transposing, and so we now can't use a narrow-M tile, because that tile is for generic large-N and now we also have narrow-N.
By the time we reach MaterializeEncoding, it's too late to fix that up, since we have already limited padding (and buffer allocations) in both M and N dimensions. So this must be dealt with at SetEncoding time.
=> Solution: in SetEncoding, we should not set both matmul_narrow_{M,N} encoding attributes. If both M and N are small static values for which we would normally set the attribute, we should compare them and only set matmul_narrow_{M,N} for the smaller of the two. For example, if we would normally set matmul_narrow_M=4 and matmul_narrow_N=1 then in fact we should only set matmul_narrow_N=1 .
If there is a tie, that is if we would set both matmul_narrow_{M,N} to the same value, then we should only set matmul_narrow_M. Avoiding the transposition business in that case.
@lialan reports:
In
SetEncoding
for matmuls, when the M or N dimension is small, we may set thematmul_narrow_{M,N}
attribute on theEncodingAttr
, which has the effect of limiting padding in that dimension. Then, later, in MaterializeEncoding, we may transpose the matmul to reduce thematmul_narrow_
case totranspose_narrow_M
, and we select an appropriate tile size for the small M value. Then that allows picking up a corresponding narrow-M code path in themmt4d
ukernel. That tile size is only narrow in the M dimension, it is generic large in the N dimension.But what happens if both
matmul_narrow_{M,N}
are set? Then, we have limited padding in both dimensions. The situation is still the same after transposing, and so we now can't use a narrow-M tile, because that tile is for generic large-N and now we also have narrow-N.By the time we reach MaterializeEncoding, it's too late to fix that up, since we have already limited padding (and buffer allocations) in both M and N dimensions. So this must be dealt with at SetEncoding time.
=> Solution: in SetEncoding, we should not set both
matmul_narrow_{M,N}
encoding attributes. If both M and N are small static values for which we would normally set the attribute, we should compare them and only setmatmul_narrow_{M,N}
for the smaller of the two. For example, if we would normally setmatmul_narrow_M=4
andmatmul_narrow_N=1
then in fact we should only setmatmul_narrow_N=1
.If there is a tie, that is if we would set both
matmul_narrow_{M,N}
to the same value, then we should only setmatmul_narrow_M
. Avoiding the transposition business in that case.