migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

Transpose -> GEMM issue #183

Closed mirza-halilcevic closed 5 months ago

mirza-halilcevic commented 6 months ago

Transposing the last two axes of a shape whose last two strides are both 1 leaves the strides unchanged. Because gemm checks only the strides, it fails to deduce that the input with that shape is transposed. This causes the rocBLAS implementation of gemm to fail. The issue doesn't appear when using MLIR.

@4 = hip::copy_to_gpu(x2,@2) -> float_type, {2, 2, 1}, {2, 1, 1}, target_id=0
@9 = transpose[permutation={0, 2, 1}](@4) -> float_type, {2, 1, 2}, {2, 1, 1}, target_id=0
@10 = gpu::gemm[alpha=1,beta=0,compute_fp32=0,trans_batch=0,solution_idx=0](@7,@9,@8) -> float_type, {2, 2, 2}, {4, 2, 1}, target_id=0
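
To make the failure mode concrete, here is a minimal Python sketch of why a stride-only check misses this transpose. The helper names `permute` and `looks_transposed_by_strides` are hypothetical and are not MIGraphX functions; the check shown (strides not in non-increasing order) is an assumed stand-in for whatever gemm actually does, used only to illustrate the reported behaviour.

```python
def permute(values, permutation):
    """Reorder a list of dims or strides according to a transpose permutation."""
    return [values[i] for i in permutation]


def looks_transposed_by_strides(strides):
    """Hypothetical stride-only check (an assumption, not the MIGraphX code):
    report a transpose when the strides are not in non-increasing order."""
    return any(a < b for a, b in zip(strides, strides[1:]))


perm = [0, 2, 1]  # swap the last two axes, as in instruction @9

# Shape from the issue: lens {2, 2, 1}, strides {2, 1, 1}
lens, strides = [2, 2, 1], [2, 1, 1]
t_lens = permute(lens, perm)
t_strides = permute(strides, perm)
# The lens change, but the strides are identical before and after,
# so a check that looks only at strides cannot see the transpose.
missed = not looks_transposed_by_strides(t_strides)

# Contrast: a shape whose last two strides differ is detected.
ref_strides = permute([12, 4, 1], perm)  # strides of a packed {2, 3, 4} tensor
detected = looks_transposed_by_strides(ref_strides)

print(t_lens, t_strides, missed, detected)
```

In this sketch the transposed lens differ from the originals while the strides do not, which suggests that a check able to catch this case would have to consider more than the strides alone.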