accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

Syntax error on mma.sync #4

Closed apuaaChen closed 4 years ago

apuaaChen commented 4 years ago

Hi! I'm trying to simulate volta_tensorop_gemm.cu from CUTLASS.

I'm using the Docker image provided here directly, and the gpgpusim.config for TITAN V is in the same directory as the binary.

The .cu file is compiled with the following command:

/usr/local/cuda-11.0/bin/nvcc -std=c++11 -x cu -gencode arch=compute_70,code=compute_70 -cudart shared volta_tensorop_gemm.cu -I/accel-sim/host/cutlass/include -I/accel-sim/host/cutlass/tools/util/include -I/accel-sim/host/cutlass -o wmma_gemm

Then I ran ./wmma_gemm and got the following syntax error:

wmma_gemm.1.sm_70.ptx:831 Syntax error:

mma.sync.aligned.m8n8k4.col.row.f32.f16.f16.f32 {%f779,%f780,%f781,%f782,%f783,%f784,%f785,%f786}, {%r2139,%r2140}, {%r2131,%r2132}, {%f4106,%f4105,%f4104,%f4103,%f4102,%f4101,%f4100,%f4099};
       ^

GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file wmma_gemm.1.sm_70.ptx
GPGPU-Sim PTX: loading globals with explicit initializers...
GPGPU-Sim PTX: finished loading globals (0 bytes total).
GPGPU-Sim PTX: loading constants with explicit initializers... done.
GPGPU-Sim PTX: Loading PTXInfo from wmma_gemm.1.sm_70.ptx
GPGPU-Sim PTX: Kernel '_ZN7cutlass9reference6device6kernel4GemmINS_9TensorRefINS_6half_tENS_6layout11ColumnMajorEEENS4_IS5_NS6_8RowMajorEEENS4_IfS9_EEffNS_11MatrixShapeILi4ELi4EEENS_12multiply_addIfffEENS_16NumericConverterIffLNS_15FloatRoundStyleE2EEEEEvNS_4gemm9GemmCoordET2_T_T0_SL_T1_SOT3' : regs=48, lmem=0, smem=0, cmem=444
wmma_gemm: cuda_api_object.h:82: void CUctx_st::add_ptxinfo(const char*, const gpgpu_ptx_sim_info&): Assertion `s != NULL' failed.
Aborted (core dumped)
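When the simulator output is long, the parser errors can be easy to miss among the other log lines. The following is a minimal Python sketch that pulls the PTX file name and line number out of each error, assuming every error is reported in the same "<file>:<line> Syntax error:" shape seen above. `find_syntax_errors` is a hypothetical helper written for illustration; it is not part of GPGPU-Sim or Accel-Sim.

```python
import re

# Matches lines such as "wmma_gemm.1.sm_70.ptx:831 Syntax error:".
# Assumption: the PTX parser always uses this "<file>:<line> Syntax error:" shape.
_SYNTAX_ERROR = re.compile(r"^\s*(\S+):(\d+)\s+Syntax error:", re.MULTILINE)


def find_syntax_errors(log_text):
    """Return a list of (ptx_file, line_number) pairs found in the log text."""
    return [(m.group(1), int(m.group(2))) for m in _SYNTAX_ERROR.finditer(log_text)]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): ./wmma_gemm 2>&1 | python find_errors.py
    for ptx_file, line in find_syntax_errors(sys.stdin.read()):
        print(f"{ptx_file}: line {line}")
```

Each reported pair points at the PTX line that the parser rejected, which here is the mma.sync instruction on line 831.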

On the other hand, basic_gemm.cu, which doesn't use the tensor cores, simulates successfully.

So is there any way to solve this syntax error? Thanks!

mkhairy commented 4 years ago

Hello,

It seems you are using PTX execution. Are you sure you are using the most recent GPGPU-Sim 4.0?

We have tested CUTLASS v1.3 with GPGPU-Sim 4.0, and it worked. A compatible CUTLASS v1.3 comes with the accel-sim app collection here. If you want to run it, please follow the instructions for the accel-sim framework here. You are using the latest CUTLASS 2.2, which might use a new wmma instruction format that we do not handle. So please make sure to use GPGPU-Sim 4.0 and the compatible CUTLASS v1.3.
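One way to check whether a build uses a different tensor-core instruction format is to list the distinct `wmma.*` and `mma.*` opcodes in its PTX and compare them between CUTLASS versions. The following is a rough Python sketch, assuming the PTX text is available as a file (for example, the extracted file named in the error message). `tensor_core_opcodes` is a hypothetical helper, and the regular expression is a simplification of PTX syntax, not a full parser.

```python
import re
import sys
from collections import Counter

# An instruction line optionally starts with a predicate guard (e.g. "@%p1" or "@!%p1"),
# followed by the opcode. Only opcodes beginning with "wmma." or "mma." are captured.
_TENSOR_OP = re.compile(r"^\s*(?:@!?%\w+\s+)?((?:wmma|mma)\.[\w.]+)", re.MULTILINE)


def tensor_core_opcodes(ptx_text):
    """Count each distinct wmma.* / mma.* opcode that appears in the PTX text."""
    return Counter(m.group(1) for m in _TENSOR_OP.finditer(ptx_text))


if __name__ == "__main__":
    # Usage (hypothetical): python list_tensor_ops.py wmma_gemm.1.sm_70.ptx
    with open(sys.argv[1]) as f:
        for opcode, count in sorted(tensor_core_opcodes(f.read()).items()):
            print(f"{count:6d}  {opcode}")
```

If the output for the CUTLASS 2.2 build contains opcodes such as `mma.sync.aligned.m8n8k4...` that never appear in the CUTLASS v1.3 build, that would be consistent with the unhandled-format explanation above.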