ROCm / triton

Development repository for the Triton language and compiler
MIT License

[tool] Added a script to print occupancy info #450

Closed zhanglx13 closed 7 months ago

zhanglx13 commented 8 months ago

This PR adds a script that prints occupancy-related information for a kernel.

Example usage:

./script/amd/occ.sh ./perf-kernels/06-fused-attention-fwd-transV.py

Output:

LDS:  32768, num_warps:  4
VGPRS: 256 (spill: 3)
occ: 2 waves/SIMD (occ_LDS: 2, occ_vgpr: 2)
perf: 127.4 tflops
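
For readers who want to sanity-check the `occ` line, here is a minimal Python sketch of how the two limits could combine. It is not taken from `occ.sh`. The function name `estimate_occupancy` and the hardware defaults (64 KiB of LDS per CU, 4 SIMDs per CU, 512 VGPRs per SIMD lane, at most 8 waves per SIMD) are assumptions chosen for an MI200-class GPU, and the sketch ignores register allocation granularity.

```python
def estimate_occupancy(lds_bytes, num_warps, vgprs,
                       lds_per_cu=64 * 1024,    # assumed LDS size per CU
                       simds_per_cu=4,          # assumed SIMD count per CU
                       vgprs_per_simd=512,      # assumed VGPR budget per lane
                       max_waves_per_simd=8):   # assumed hardware wave cap
    """Return (occ, occ_lds, occ_vgpr) in waves per SIMD (hypothetical helper)."""
    if lds_bytes > 0:
        # Workgroups that fit in LDS, converted to waves spread across the SIMDs.
        workgroups_per_cu = lds_per_cu // lds_bytes
        occ_lds = workgroups_per_cu * num_warps // simds_per_cu
    else:
        # A kernel that uses no LDS is not limited by LDS.
        occ_lds = max_waves_per_simd

    # Waves whose registers fit in one SIMD's VGPR budget.
    occ_vgpr = vgprs_per_simd // vgprs if vgprs > 0 else max_waves_per_simd

    occ_lds = min(occ_lds, max_waves_per_simd)
    occ_vgpr = min(occ_vgpr, max_waves_per_simd)
    # The overall occupancy is set by the tighter of the two limits.
    return min(occ_lds, occ_vgpr), occ_lds, occ_vgpr


# Values from the sample output: LDS 32768, num_warps 4, VGPRS 256.
occ, occ_lds, occ_vgpr = estimate_occupancy(32768, 4, 256)
print(f"occ: {occ} waves/SIMD (occ_LDS: {occ_lds}, occ_vgpr: {occ_vgpr})")
```

Under these assumed limits the sketch gives 2 waves/SIMD for both the LDS and VGPR bounds, which agrees with the sample output. Other GPUs have different limits, so the defaults should be overridden accordingly.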

How it works