accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

Syntax error shf.l.wrap.b32 #298

Closed gFrancoCamilo closed 2 months ago

gFrancoCamilo commented 2 months ago

Hello! I'm trying to run this AES implementation using the Accel-Sim framework in PTX mode. I followed the instructions described here to add it as an app.

However, when I run the application, I get the following error.

app.1.sm_52.ptx:695 Syntax error:

   shf.l.wrap.b32 %r107, %r106, %r106, 24;
      ^

GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file app.1.sm_52.ptx
GPGPU-Sim PTX: loading globals with explicit initializers...
GPGPU-Sim PTX:     initializing '$str' ...  wrote 35 bytes
GPGPU-Sim PTX:     initializing '$str$1' ...  wrote 33 bytes
GPGPU-Sim PTX:     initializing '$str$2' ...  wrote 18 bytes
GPGPU-Sim PTX:     initializing '$str$3' ...  wrote 35 bytes
GPGPU-Sim PTX:     initializing '$str$4' ...  wrote 35 bytes
GPGPU-Sim PTX:     initializing '$str$5' ...  wrote 21 bytes
GPGPU-Sim PTX:     initializing '$str$6' ...  wrote 33 bytes
GPGPU-Sim PTX:     initializing '$str$7' ...  wrote 34 bytes
GPGPU-Sim PTX:     initializing '$str$8' ...  wrote 45 bytes
GPGPU-Sim PTX:     initializing '$str$9' ...  wrote 55 bytes
GPGPU-Sim PTX: finished loading globals (344 bytes total).
GPGPU-Sim PTX: loading constants with explicit initializers...  done.
app: cuda_api_object.h:82: void CUctx_st::add_ptxinfo(const char*, const gpgpu_ptx_sim_info&): Assertion `s != NULL' failed.
./justrun.sh: line 1: 20089 Aborted                 (core dumped) /root/accel-sim-framework/gpu-app-collection/src/..//bin/11.7/release/app

I saw that a similar error was posted in #4, but I have verified that I am running the most recent GPGPU-Sim version (GPGPU-Sim Simulator Version 4.2.0 [build gpgpu-sim_git-commit-7dc99771_modified_0.0]). Do you have any suggestions for solving or circumventing this problem?

Some additional information:

Thank you!

JRPan commented 2 months ago

Interesting.

Looks like gpgpu-sim actually doesn't support shf at all. It is not found in ptx.l.

It needs to be implemented.
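
For anyone checking an implementation, here is a minimal C++ sketch of what shf.l.wrap.b32 computes, following the funnel-shift semantics in the PTX ISA documentation. The helper names are made up for illustration and are not part of GPGPU-Sim. The code in the linked pull request may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of PTX `shf.l.wrap.b32 d, a, b, c`.
// The helper name is hypothetical; it is not a GPGPU-Sim function.
// Conceptually the 64-bit value {b:a} (b is the high word) is shifted left
// by n = c mod 32 and the upper 32 bits are written to d.
inline std::uint32_t shf_l_wrap_b32(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    const std::uint32_t n = c & 0x1fu;  // .wrap: only the low 5 bits of c count
    if (n == 0) return b;               // avoids an undefined 32-bit shift by 32
    return (b << n) | (a >> (32u - n));
}

// Small self-check based on the failing line, where both sources are %r106.
inline void check_shf_model() {
    // With a == b the funnel shift is a plain rotate left.
    assert(shf_l_wrap_b32(0x12345678u, 0x12345678u, 24u) == 0x78123456u);
    // A shift amount of 32 wraps to 0 and returns the high word unchanged.
    assert(shf_l_wrap_b32(0xAAAAAAAAu, 0x55555555u, 32u) == 0x55555555u);
}
```

In the failing line, `shf.l.wrap.b32 %r107, %r106, %r106, 24;`, both source operands are the same register, so the instruction is a 32-bit rotate left by 24.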

JRPan commented 2 months ago

check https://github.com/accel-sim/gpgpu-sim_distribution/pull/70

gFrancoCamilo commented 2 months ago

I made the changes and it solved the shf instruction problem, but now I'm facing the following problem:

GPGPU-Sim PTX:  8 (potential) branch divergence @  PC=0x8918 (app.1.sm_52.ptx:5239) @%p7 bra $L__BB9_12;
GPGPU-Sim PTX:    immediate post dominator      @  PC=0x8a10 (app.1.sm_52.ptx:5317) ret;
GPGPU-Sim PTX: ... end of reconvergence points for _Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh
GPGPU-Sim PTX: ... done pre-decoding instructions for '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'.
GPGPU-Sim PTX: pushing kernel '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh' to stream 0, gridDim= (1024,1,1) blockDim = (1024,1,1)
GPGPU-Sim uArch: Shader 0 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: CTA/core = 2, limited by: threads shmem regs
GPGPU-Sim: Reconfigure L1 cache to 32KB
...
GPGPU-Sim uArch: Shader 64 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 65 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 66 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 67 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 68 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 69 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 70 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 71 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 72 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 73 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 74 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 75 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 76 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 77 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 78 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 79 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 7 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 3 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 11 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 15 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 19 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 23 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 27 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 31 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 35 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 39 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 43 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 47 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 51 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 68 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 55 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 72 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 76 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 59 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 63 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
GPGPU-Sim uArch: Shader 79 bind to kernel 1 '_Z73counterWithOneTableExtendedSharedMemoryBytePermPartlyExtendedSBoxCihangirPjS_S_S_PyPh'
./justrun.sh: line 1: 17008 Aborted                 (core dumped) /root/accel-sim-framework/gpu-app-collection/src/..//bin/11.7/release/app

Do you have an idea of what might be causing this or how to solve it? As I said, the code works when I use the libcudart.so in /usr/local/cuda/lib64.

cesar-avalos3 commented 2 months ago

When you changed cudaMallocManaged into cudaMalloc, did you also change all the times the CPU tried to directly change a device pointer?

gFrancoCamilo commented 2 months ago

I made a mistake in the original post. I replaced the cudaMallocManaged calls with cudaMallocHost, not cudaMalloc.

I used gdb to figure out what was causing the error from my previous comment. Apparently, the code was getting stuck in some printf calls and would abort.

GPGPU-Sim PTX: finding reconvergence points for 'vprintf'...
threadIndex : 1048575
GPGPU-Sim PTX: PDOM analysis already done for vprintf
Thread 2 "app" received signal SIGABRT, Aborted.
[Switching to Thread 0x7efd06fe7640 (LWP 34024)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=139625209165376) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139625209165376) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=139625209165376) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139625209165376, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007efd07234476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007efd0721a7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007efd07836fa7 in my_cuda_printf (fmtstr=fmtstr@entry=0x7efce3943a50 "Plaintext   : %08x %08x %08x %08x\n", arg_list=arg_list@entry=0x7efce91a5100 "\250\366C2\215\060Z\210") at cuda_device_printf.cc:56
#6  0x00007efd07837259 in gpgpusim_cuda_vprintf (pI=0x55968e84a790, thread=0x7efcfeaca9a0, target_func=<optimized out>) at cuda_device_printf.cc:115
#7  0x00007efd0783d076 in call_impl (pI=pI@entry=0x55968e84a790, thread=thread@entry=0x7efcfeaca9a0) at instructions.cc:2184
#8  0x00007efd07827bc7 in ptx_thread_info::ptx_exec_inst (this=0x7efcfeaca9a0, inst=..., lane_id=lane_id@entry=31) at /root/accel-sim-framework/gpu-simulator/gpgpu-sim/src/cuda-sim/opcodes.def:58
#9  0x00007efd0799d772 in core_t::execute_warp_inst_t (this=this@entry=0x55968005c5a0, inst=..., warpId=63, warpId@entry=4294967295) at abstract_hardware_model.cc:1192
#10 0x00007efd078cf59f in exec_shader_core_ctx::func_exec_inst (inst=..., this=0x55968005c5a0) at shader.cc:1023
#11 shader_core_ctx::issue_warp (this=0x55968005c5a0, pipe_reg_set=..., next_inst=0x55968e84a790, active_mask=std::bitset = {...}, warp_id=63, sch_id=<optimized out>) at shader.cc:1046
#12 0x00007efd078c9760 in scheduler_unit::cycle (this=0x5596800fba80) at shader.cc:1408
#13 0x00007efd078c9cf6 in shader_core_ctx::issue (this=0x55968005c5a0) at shader.cc:1118
#14 0x00007efd078d9f46 in shader_core_ctx::cycle (this=0x55968005c5a0) at shader.cc:3627
#15 0x00007efd078d9fc0 in simt_core_cluster::core_cycle (this=0x55968005c520) at shader.cc:4437
#16 0x00007efd0789bc23 in gpgpu_sim::cycle (this=0x55967dc8a280) at gpu-sim.cc:1957
#17 0x00007efd079a5615 in gpgpu_sim_thread_concurrent (ctx_ptr=0x55967dc71d70) at gpgpusim_entrypoint.cc:127
#18 0x00007efd07286ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#19 0x00007efd07317a04 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

The backtrace points to this if statement in the cuda_device_printf.cc file, which fails to run the line printf("Plaintext : %08x %08x %08x %08x\n", pt0Init, pt1Init, pt2Init, pt3Init);.

I commented out the printf statements, which resolved the warnings and fixed the problem.
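
In case it helps someone else, one alternative to commenting out each call by hand is a compile-time switch. This is only a sketch: the macro names AES_DEBUG_PRINTF and DBG_PRINTF are made up here and do not come from the AES code or from GPGPU-Sim.

```cpp
#include <cstdio>

// Hypothetical switch: build with -DAES_DEBUG_PRINTF=1 to keep the prints.
#ifndef AES_DEBUG_PRINTF
#define AES_DEBUG_PRINTF 0
#endif

#if AES_DEBUG_PRINTF
// Prints enabled: forward everything to printf unchanged.
#define DBG_PRINTF(...) printf(__VA_ARGS__)
#else
// Prints disabled: the call and its arguments vanish at preprocessing time.
#define DBG_PRINTF(...) ((void)0)
#endif
```

A call site would then read `DBG_PRINTF("Plaintext   : %08x %08x %08x %08x\n", pt0Init, pt1Init, pt2Init, pt3Init);`. With the switch left at 0, no printf call is compiled at that site, so the arguments are not evaluated either.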

Thank you for your help!