chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Prototype GPU support with -g debug flag generates invalid PTX #19773

Open milthorpe opened 2 years ago

milthorpe commented 2 years ago

Summary of Problem

Compiling with CHPL_LOCALE_MODEL=gpu chpl -g results in a ptxas error on invalid PTX.

Steps to Reproduce

Compiling the Jacobi GPU example with the debug flag -g results in an error when ptxas is run on the generated code:

(base) milthorpe@whale:~/chapel$ CHPL_LAUNCHER=none CHPL_COMM=none CHPL_LOCALE_MODEL=gpu chpl -g --gpu-arch sm_62 jacobi-gpu.chpl --ldflags -v --print-commands --savec tmp_jacobi
warning: The prototype GPU support implies --no-checks. This may impact debuggability. To suppress this warning, compile with --no-checks explicitly
<internal clang code generation> -I/home/milthorpe/chapel/modules/standard -I/home/milthorpe/chapel/modules/packages -I/home/milthorpe/chapel/runtime/include/localeModels/gpu -I/home/milthorpe/chapel/runtime/include/localeModels -I/home/milthorpe/chapel/runtime/include/comm/none -I/home/milthorpe/chapel/runtime/include/comm -I/home/milthorpe/chapel/runtime/include/tasks/qthreads -I/home/milthorpe/chapel/runtime/include -I/home/milthorpe/chapel/runtime/include/qio -I/home/milthorpe/chapel/runtime/include/atomics/cstdlib -I/home/milthorpe/chapel/runtime/include/mem/jemalloc -I/home/milthorpe/chapel/third-party/utf8-decoder -DHAS_GPU_LOCALE -I/home/milthorpe/chapel/runtime/include/gpu/cuda -DCHPL_JEMALLOC_PREFIX=chpl_je_ -I/home/milthorpe/chapel/third-party/gmp/install/linux64-x86_64-native-llvm-none/include -I/home/milthorpe/chapel/third-party/hwloc/install/linux64-x86_64-native-llvm-none-gpu/include -I/home/milthorpe/chapel/third-party/qthread/install/linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled/include -I/home/milthorpe/chapel/third-party/jemalloc/install/target/linux64-x86_64-native-llvm-none/include -I/home/milthorpe/chapel/third-party/re2/install/linux64-x86_64-native-llvm-none/include -I. -Itmp_jacobi -g -DCHPL_DEBUG -I/home/milthorpe/chapel/modules/internal -DCHPL_GEN_CODE -pthread -I/usr/local/cuda/include --std=c++11 -x cuda --cuda-gpu-arch=sm_62 -include sys_basic.h -include tmp_jacobi/command-line-includes.h -include llvm/chapel_libc_wrapper.h
warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11060. Assuming the latest supported version 10.1

# Check to see if ptxas command can be found
which ptxas > /dev/null 2>&1

# Check to see if fatbinary command can be found
which fatbinary > /dev/null 2>&1

# PTX to  object file
ptxas -m64 --gpu-name sm_62 --output-file tmp_jacobi/chpl__gpu_ptx.o tmp_jacobi/chpl__gpu_ptx.s
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61439; error   : Feature 'labels1 - labels2 expression in .section' requires PTX ISA .version 7.5 or later
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61606; error   : Feature 'labels1 - labels2 expression in .section' requires PTX ISA .version 7.5 or later
ptxas fatal   : Ptx assembly aborted due to errors
error: PTX to  object file

The generated PTX specifies version 7.2:

(base) milthorpe@whale:~/chapel$ grep .version tmp_jacobi/chpl__gpu_ptx.s
.version 7.2

However, it uses a debugging directive that defines a .section dwarf-line entry as a difference between two labels, a feature that is only available from PTX ISA version 7.5 onward:

(base) milthorpe@whale:~/chapel$ sed -n '61439p' tmp_jacobi/chpl__gpu_ptx.s
.b32 LpubNames_end0-LpubNames_start0
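
To find every such line rather than checking them one at a time with sed, the mismatch can be detected with a small script. This is only a sketch: the function name `check_ptx` and the regular expressions are my own assumptions, not part of Chapel or the CUDA toolkit, and the 7.5 threshold is taken from the ptxas error message above. For simplicity it flags any `.b8/.b16/.b32/.b64` directive whose operand is a label difference, without tracking whether the line sits inside a `.section` block.

```python
import re
import sys

# Data directive whose operand is "label - label",
# e.g. ".b32 LpubNames_end0-LpubNames_start0".
LABEL_DIFF = re.compile(
    r"^\s*\.b(?:8|16|32|64)\s+[A-Za-z_$][\w$]*\s*-\s*[A-Za-z_$][\w$]*\s*$"
)
# The ".version MAJOR.MINOR" directive at the top of a PTX file.
VERSION = re.compile(r"^\s*\.version\s+(\d+)\.(\d+)")

# Minimum PTX ISA version for label differences, per the ptxas error message.
MIN_VERSION = (7, 5)


def check_ptx(text):
    """Return (declared_version, offending_line_numbers) for PTX source text.

    offending_line_numbers is empty when the declared version is already
    at least MIN_VERSION.
    """
    version = None
    offending = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = VERSION.match(line)
        if m and version is None:
            version = (int(m.group(1)), int(m.group(2)))
        elif LABEL_DIFF.match(line):
            offending.append(lineno)
    if version is not None and version >= MIN_VERSION:
        offending = []
    return version, offending


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_ptx.py tmp_jacobi/chpl__gpu_ptx.s
    with open(sys.argv[1]) as f:
        declared, lines = check_ptx(f.read())
    print("declared .version:", declared)
    for n in lines:
        print("label difference needs PTX >= 7.5 at line", n)
```

Run against the saved `tmp_jacobi/chpl__gpu_ptx.s`, this should list the same lines that ptxas reports, provided the pattern covers every form the code generator emits.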

Configuration Information

milthorpe commented 2 years ago

With CUDA 11.4, the error message is different, but the reason is the same:

# PTX to  object file
ptxas -m64 --gpu-name sm_62 --output-file tmp_jacobi/chpl__gpu_ptx.o tmp_jacobi/chpl__gpu_ptx.s
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61406; fatal   : Parsing error near '-': syntax error
ptxas fatal   : Ptx assembly aborted due to errors
error: PTX to  object file

e-kayrakli commented 2 years ago

Thanks @milthorpe! I moved the related internal issue with my notes to https://github.com/chapel-lang/chapel/issues/19774.