ROCm / triton

Development repository for the Triton language and compiler
MIT License

Fix vecSize for fp8 and int8 on MI300 #466

Closed by zhanglx13 6 months ago

zhanglx13 commented 6 months ago

This PR actually does more than I expected.

Therefore, using mfma16 now gives better performance on MI300X.