Open LuoYuanke opened 2 months ago
@LuoYuanke Not familiar with ptx, but I found that the warnings are gone with '-O2' instead of '-O0'.
Yes, it seems that when compiling with -O0, the compiler generates "WARPGROUP.DEPBAR.LE gsb0, 0x0" to wait for the wgmma to finish. I think that's why the compiler emits the warning that the wgmma instructions are serialized.
Hi, I got a warning when compiling PTX code with the command
ptxas --gpu-name sm_90a -O0 wgmma_rs.ptx
Could someone help explain what it means and how to improve the PTX code to eliminate the warning?
Steps to reproduce:
Run
ptxas --gpu-name sm_90a -O0 wgmma_rs.ptx
and get the following warning:
ptxas info : (C7515) Potential Performance Loss: wgmma.mma_async instructions are serialized due to non wgmma instructions defining accumulator registers of a wgmma between start and end of the pipeline stage in the function 'selp_b16'
The source code is as below.