ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

[Feature]: More sample benchmark config.yaml needed to support bf16 and I8II on RDNA arches #1955

Open likelovewant opened 1 month ago

likelovewant commented 1 month ago

Suggestion Description

Following the wiki guide, I was able to generate some files from the sample config.yaml provided here for Navi21. However, sample configs are missing for generating files like:

navi21_Cijk_Ailk_Bjlk_BBS_BH.yaml
navi21_Cijk_Ailk_Bjlk_BBS_BH_GB.yaml
navi21_Cijk_Alik_Bjlk_I8II_BH.yaml
navi21_Cijk_Alik_Bljk_I8II_BH.yaml

The config should be named something like rocblas_hpa_bfloat16_gemm_nn_inc1_asm_full.yaml. However, that file is too old to support newly released architectures such as RDNA 3. I was not able to use this sample config to generate a benchmark file: the run either stops partway or is reported as unsupported, because the config only supports the older architectures. Following the naming convention, the needed config files might be named like this:

rocblas_hpa_bfloat16_gemm_nn_asm_full.yaml
rocblas_hpa_bfloat16_gemm_nt_asm_full.yaml
rocblas_hpa_bfloat16_gemm_tn_asm_full.yaml
rocblas_hpa_bfloat16_gemm_tt_asm_full.yaml

To generate files Cijk_Ailk_Bjlk_BBS_BH_GB.yaml

and

rocblas_bfloat16_gemm_nn_asm_full.yaml
rocblas_bfloat16_gemm_nt_asm_full.yaml
rocblas_bfloat16_gemm_tn_asm_full.yaml
rocblas_bfloat16_gemm_tt_asm_full.yaml

To generate files for Cijk_Ailk_Bjlk_BBS_BH.yaml

also

rocblas_igemm_nn_asm_full.yaml
rocblas_igemm_nt_asm_full.yaml
rocblas_igemm_tn_asm_full.yaml
rocblas_igemm_tt_asm_full.yaml

To generate files Cijk_Ailk_Bjlk_I8II_BH.yaml

Lastly, the other four files, in nn, nt, tn, and tt variants, for rocblas_hpa_igemm_asm_full.yaml

To generate files Cijk_Ailk_Bjlk_I8II_BH_GB.yaml
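To make the request concrete, the sixteen config names described above can be enumerated with a short script. This is only a sketch of the naming rule as I understand it, not anything taken from Tensile itself: the `CONFIG_STEMS` table and the `requested_configs` helper are hypothetical names, the stem-to-suffix pairing is copied from the lists above, and the position of the nn/nt/tn/tt tag inside the rocblas_hpa_igemm names is an assumption based on the other three families.

```python
# Transpose variants requested for every config family.
TRANSPOSES = ("nn", "nt", "tn", "tt")

# Hypothetical table: config-name stem -> suffix of the logic file it should
# produce. The pairing is taken from this issue's text, not from Tensile.
CONFIG_STEMS = {
    "rocblas_hpa_bfloat16_gemm": "BBS_BH_GB",
    "rocblas_bfloat16_gemm": "BBS_BH",
    "rocblas_igemm": "I8II_BH",
    "rocblas_hpa_igemm": "I8II_BH_GB",  # tag placement here is an assumption
}


def requested_configs():
    """Return (config file name, logic-file suffix) pairs for all variants."""
    return [
        (f"{stem}_{transpose}_asm_full.yaml", suffix)
        for stem, suffix in CONFIG_STEMS.items()
        for transpose in TRANSPOSES
    ]


if __name__ == "__main__":
    for name, suffix in requested_configs():
        # The A/B index part of the logic-file name varies per transpose,
        # so it is left as a wildcard here.
        print(f"{name} -> <arch>_Cijk_*_{suffix}.yaml")
```

Running it prints one line per requested config, which could serve as a checklist of the sample files that are missing.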

I tried using the files available in the tensile/config directory to generate benchmark files, but none of them is supported on my architecture. Please correct me if I have misunderstood something.

Once those files are available, I think many people will be able to generate optimized Tensile logic, and many currently unsupported architectures will be able to run on ROCm through community contributions.

Operating System

Ubuntu and Windows

GPU

navi34, navi22, navi23, and many more

ROCm Component

rocBLAS