ROCm / triton

Development repository for the Triton language and compiler

[MFMA] FP8 and BF8 support #355

binarman closed this 1 year ago

binarman commented 1 year ago

This PR adds support for fp8 and bf8 MFMA instructions.

scxiao commented 1 year ago

LGTM. Does anyone else have comments? Thanks

binarman commented 1 year ago

@zhanglx13

> Is this PR supposed to fix the different i8 mfma instructions on MI300 and non-MI300 GPUs?

No, that is fixed by @scxiao in a separate PR.

> About the mfma granularity check

I had a different idea in mind: do full instruction selection during the AccelerateMatmul pass, simplify the Python side (remove the checks and casts there), and simplify code generation (do not select the instruction there; simply take its type from the mfma encoding).

scxiao commented 1 year ago

Approved. @binarman's suggestions make sense to me:

  • Simplify the Python semantic checks by removing the mfma-specific checks there.
  • Do MFMA instruction selection in the AccelerateAMDMatmul pass and perform all checks there.

All these changes can be done in a future PR. @scxiao Do you think we should merge this one (#355) first before merging #368? Or should we merge #368 into this one (#355) and review it again?

Yes, this PR should be merged first, then #368. PR #357 has a PyTorch dependency, so it cannot be merged for now.

scxiao commented 1 year ago

@binarman, could you please take a look at the CI build error so we can get this merged?

alefimov-amd commented 1 year ago

@scxiao Yes, this is a problem with our CI infrastructure. I've restarted the tests; it should work now.