migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

Multibroadcast -> GEMM issue #184

Closed mirza-halilcevic closed 5 months ago

mirza-halilcevic commented 6 months ago

When one of the last two dimensions of a shape is broadcast, eliminate_contiguous removes a contiguous that should remain before gemm, because gemm's compute_shape does not handle zeros in the last two strides. This causes the rocBLAS implementation of gemm to fail. The MLIR dot implementation is not affected. In the dump below, the multibroadcast output @9 has strides {2, 0, 1} and is passed directly to gpu::gemm:

@9 = multibroadcast[out_lens={2, 2, 2},out_dyn_dims={}](@7) -> float_type, {2, 2, 2}, {2, 0, 1}, target_id=0
@10 = gpu::gemm[alpha=1,beta=0,compute_fp32=0,trans_batch=0,solution_idx=0](@4,@9,@8) -> float_type, {2, 2, 2}, {4, 2, 1}, target_id=0
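
To illustrate the condition described above, here is a minimal Python sketch of the check that would flag @9 as needing a contiguous copy. It is not the MIGraphX code: the helper names packed_strides and has_broadcast_matrix_dim are hypothetical, and the rule that only a zero in the last two strides matters is an assumption taken from the description in this issue.

```python
def packed_strides(lens):
    """Row-major (packed) strides for the given dimension lengths."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def has_broadcast_matrix_dim(strides):
    """True if either of the last two strides is 0.

    Hypothetical check: a zero stride in a matrix dimension means that
    dimension is broadcast, which is the case this issue reports as
    failing in the rocBLAS gemm path.
    """
    return any(s == 0 for s in strides[-2:])


lens = [2, 2, 2]
broadcast_strides = [2, 0, 1]  # strides of @9 in the dump above

print(packed_strides(lens))                         # strides after a contiguous copy
print(has_broadcast_matrix_dim(broadcast_strides))  # @9 as emitted by multibroadcast
print(has_broadcast_matrix_dim(packed_strides(lens)))
```

Under this assumption, a zero stride in a batch dimension (for example {0, 2, 1}) is not flagged, since only the two matrix dimensions are inspected.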