Syntax error on mma.sync

Hi! I'm trying to simulate volta_tensorop_gemm.cu from CUTLASS.

I am using the Docker image provided here directly, and I have the TITAN V gpgpusim.config in the same directory.

The .cu file is compiled with the following command:

/usr/local/cuda-11.0/bin/nvcc -std=c++11 -x cu -gencode arch=compute_70,code=compute_70 -cudart shared volta_tensorop_gemm.cu -I/accel-sim/host/cutlass/include -I/accel-sim/host/cutlass/tools/util/include -I/accel-sim/host/cutlass -o wmma_gemm

Then I ran ./wmma_gemm and got the following syntax error:

wmma_gemm.1.sm_70.ptx:831 Syntax error:

mma.sync.aligned.m8n8k4.col.row.f32.f16.f16.f32 {%f779,%f780,%f781,%f782,%f783,%f784,%f785,%f786}, {%r2139,%r2140}, {%r2131,%r2132}, {%f4106,%f4105,%f4104,%f4103,%f4102,%f4101,%f4100,%f4099};
       ^

GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file wmma_gemm.1.sm_70.ptx
GPGPU-Sim PTX: loading globals with explicit initializers...
GPGPU-Sim PTX: finished loading globals (0 bytes total).
GPGPU-Sim PTX: loading constants with explicit initializers... done.
GPGPU-Sim PTX: Loading PTXInfo from wmma_gemm.1.sm_70.ptx
GPGPU-Sim PTX: Kernel '_ZN7cutlass9reference6device6kernel4GemmINS_9TensorRefINS_6half_tENS_6layout11ColumnMajorEEENS4_IS5_NS6_8RowMajorEEENS4_IfS9_EEffNS_11MatrixShapeILi4ELi4EEENS_12multiply_addIfffEENS_16NumericConverterIffLNS_15FloatRoundStyleE2EEEEEvNS_4gemm9GemmCoordET2_T_T0_SL_T1_SOT3' : regs=48, lmem=0, smem=0, cmem=444
wmma_gemm: cuda_api_object.h:82: void CUctx_st::add_ptxinfo(const char*, const gpgpu_ptx_sim_info&): Assertion `s != NULL' failed.
Aborted (core dumped)
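
In case it helps with triage, here is a small sketch for listing every mma.sync variant that appears in a PTX file, so it is easier to see which opcodes the parser would have to accept. It does not fix the error. The helper names (find_mma_sync, summarize) and the SAMPLE text are made up for illustration, and it assumes the PTX is available as plain text (for instance generated with nvcc's -ptx option). The operands in SAMPLE are shortened placeholders, not the real ones from the log above.

```python
import re
from collections import Counter

# Matches a PTX instruction line whose opcode starts with "mma.sync",
# with an optional guard predicate such as "@%p1" in front.
# Group 1 captures the full dotted opcode (shape, layouts, types).
MMA_SYNC = re.compile(r"^\s*(?:@!?%\w+\s+)?(mma\.sync(?:\.\w+)*)")


def find_mma_sync(ptx_text):
    """Return (line_number, opcode) pairs for each mma.sync instruction."""
    hits = []
    for number, line in enumerate(ptx_text.splitlines(), start=1):
        match = MMA_SYNC.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


def summarize(ptx_text):
    """Count how often each distinct mma.sync opcode occurs."""
    return Counter(opcode for _, opcode in find_mma_sync(ptx_text))


# Hypothetical two-line PTX excerpt; operands are placeholders.
SAMPLE = """\
ld.global.f32 %f1, [%rd1];
mma.sync.aligned.m8n8k4.col.row.f32.f16.f16.f32 {%f1,%f2}, {%r1,%r2}, {%r3,%r4}, {%f3,%f4};
"""

for opcode, count in summarize(SAMPLE).items():
    print(count, opcode)
```

To run it on a real file, replace SAMPLE with the contents of the PTX file, for example open(path).read() where path points at the generated PTX.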

By contrast, basic_gemm.cu, which does not use tensor cores, simulates successfully.

So is there any way to solve this syntax error? Thanks!

accel-sim / accel-sim-framework

Syntax error on mma.sync #4