zoranjovanovic-ns opened 10 months ago
this could be related: https://github.com/openai/triton/issues/2156
@zoranjovanovic-ns Do not bother with the investigation. I've checked and found that upstream also has this issue with the fp16xfp16 -> fp32 FMA dot implementation, so I assume this issue is not related.
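For context, here is a minimal NumPy sketch of what an fp16xfp16 -> fp32 FMA dot means: the inputs are stored in fp16, but each multiply-accumulate step is carried out in fp32. The function name and structure below are hypothetical, for illustration only, and do not reflect Triton's actual implementation.

```python
import numpy as np

def fma_dot_fp16_to_fp32(a, b):
    # Hypothetical sketch: fp16 inputs, fp32 accumulation (FMA-style).
    a = np.asarray(a, dtype=np.float16)
    b = np.asarray(b, dtype=np.float16)
    acc = np.float32(0.0)
    for x, y in zip(a, b):
        # Promote each fp16 operand to fp32 before the multiply-add,
        # so rounding happens only once per step, in fp32.
        acc = acc + np.float32(x) * np.float32(y)
    return acc
```

Accumulating in fp32 rather than fp16 avoids the rapid precision loss (and possible overflow past ~65504) that pure-fp16 accumulation would suffer on longer dot products.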
I put up the same fix for @micmelesse some time ago in https://github.com/ROCmSoftwarePlatform/triton/tree/scxiao/added_processing_fp16_to_fp32_type_conversion.
This is just for code review purposes.
It should be used together with xla branch: https://github.com/ROCmSoftwarePlatform/xla/tree/rocm_triton_gemm_3_tmp