Closed zhanglx13 closed 6 months ago
This PR is actually doing more than I expected.
Therefore, using mfma16 now has better performance on MI300X.