triton-lang / triton

Development repository for the Triton language and compiler
https://triton-lang.org/
MIT License

[LAYOUTS] [BE] Simplify Ampere/Hopper paths introduced in #5189 (#5200)

Closed by lezcano 4 hours ago

lezcano commented 2 days ago

We simplify the implementation of getElemsPerThread and strengthen the preconditions of getRepForOperand.

More generally, we should try to minimise the calls to isAmpere and isHopper throughout the codebase. I'll do a pass fixing many of these once we land linear layouts (LLs) for ldmatrix and Hopper.

lezcano commented 1 day ago

@Jokeren This one's ready for review. The formulas are now particularly clean, and there's no special-casing for Hopper or Ampere.

The cleaner formulas also show that the M=8 edge case for opIdx=0 was wrong, but all of this will be fixed by LLs.

lezcano commented 4 hours ago

I am hitting some issues with mma and dot Hopper layouts in a different PR. I'm going to merge this PR, and I'll add a note to use this logic in all DistributedLayouts at a later stage.