Prototype GPU support with -g debug flag generates invalid PTX

milthorpe commented 2 years ago

Summary of Problem

Compiling with CHPL_LOCALE_MODEL=gpu chpl -g generates invalid PTX, which ptxas then rejects with an error.

Steps to Reproduce

Compiling the Jacobi GPU example with the debug flag -g results in an error when ptxas is run on the generated code:

(base) milthorpe@whale:~/chapel$ CHPL_LAUNCHER=none CHPL_COMM=none CHPL_LOCALE_MODEL=gpu chpl -g --gpu-arch sm_62 jacobi-gpu.chpl --ldflags -v --print-commands --savec tmp_jacobi
warning: The prototype GPU support implies --no-checks. This may impact debuggability. To suppress this warning, compile with --no-checks explicitly
<internal clang code generation> -I/home/milthorpe/chapel/modules/standard -I/home/milthorpe/chapel/modules/packages -I/home/milthorpe/chapel/runtime/include/localeModels/gpu -I/home/milthorpe/chapel/runtime/include/localeModels -I/home/milthorpe/chapel/runtime/include/comm/none -I/home/milthorpe/chapel/runtime/include/comm -I/home/milthorpe/chapel/runtime/include/tasks/qthreads -I/home/milthorpe/chapel/runtime/include -I/home/milthorpe/chapel/runtime/include/qio -I/home/milthorpe/chapel/runtime/include/atomics/cstdlib -I/home/milthorpe/chapel/runtime/include/mem/jemalloc -I/home/milthorpe/chapel/third-party/utf8-decoder -DHAS_GPU_LOCALE -I/home/milthorpe/chapel/runtime/include/gpu/cuda -DCHPL_JEMALLOC_PREFIX=chpl_je_ -I/home/milthorpe/chapel/third-party/gmp/install/linux64-x86_64-native-llvm-none/include -I/home/milthorpe/chapel/third-party/hwloc/install/linux64-x86_64-native-llvm-none-gpu/include -I/home/milthorpe/chapel/third-party/qthread/install/linux64-x86_64-native-llvm-none-gpu-jemalloc-bundled/include -I/home/milthorpe/chapel/third-party/jemalloc/install/target/linux64-x86_64-native-llvm-none/include -I/home/milthorpe/chapel/third-party/re2/install/linux64-x86_64-native-llvm-none/include -I. -Itmp_jacobi -g -DCHPL_DEBUG -I/home/milthorpe/chapel/modules/internal -DCHPL_GEN_CODE -pthread -I/usr/local/cuda/include --std=c++11 -x cuda --cuda-gpu-arch=sm_62 -include sys_basic.h -include tmp_jacobi/command-line-includes.h -include llvm/chapel_libc_wrapper.h
warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11060. Assuming the latest supported version 10.1

# Check to see if ptxas command can be found
which ptxas > /dev/null 2>&1

# Check to see if fatbinary command can be found
which fatbinary > /dev/null 2>&1

# PTX to  object file
ptxas -m64 --gpu-name sm_62 --output-file tmp_jacobi/chpl__gpu_ptx.o tmp_jacobi/chpl__gpu_ptx.s
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61439; error   : Feature 'labels1 - labels2 expression in .section' requires PTX ISA .version 7.5 or later
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61606; error   : Feature 'labels1 - labels2 expression in .section' requires PTX ISA .version 7.5 or later
ptxas fatal   : Ptx assembly aborted due to errors
error: PTX to  object file

The generated PTX specifies version 7.2:

(base) milthorpe@whale:~/chapel$ grep .version tmp_jacobi/chpl__gpu_ptx.s
.version 7.2

however, it uses a debugging directive to define a .section dwarf-line entry as a difference between two labels, a feature that is only available from PTX ISA version 7.5:

(base) milthorpe@whale:~/chapel$ sed -n '61439p' tmp_jacobi/chpl__gpu_ptx.s
.b32 LpubNames_end0-LpubNames_start0
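
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of Chapel or the CUDA toolkit: the function name check_ptx_label_diff is hypothetical, the pattern only recognizes the .b32/.b64 label-difference form shown in the line above, and the 7.5 threshold is taken from the ptxas error message.

```shell
# Hypothetical helper (bash): report when a PTX file declares a .version
# below 7.5 but contains "label1-label2" data expressions like the one above.
check_ptx_label_diff() {
    local ptx_file="$1"
    local version major minor count

    # Take the first ".version X.Y" directive in the file.
    version=$(awk '$1 == ".version" { print $2; exit }' "$ptx_file")
    if [ -z "$version" ]; then
        echo "no .version directive found"
        return 2
    fi
    major=${version%%.*}
    minor=${version#*.}

    # Count .b32/.b64 lines whose only operand is a difference of two labels.
    # grep exits non-zero when the count is 0, hence the "|| true".
    count=$(grep -Ec '^[[:space:]]*\.b(32|64)[[:space:]]+[A-Za-z_$][A-Za-z0-9_$]*[[:space:]]*-[[:space:]]*[A-Za-z_$][A-Za-z0-9_$]*[[:space:]]*$' "$ptx_file" || true)

    # The feature needs PTX ISA 7.5 or later according to the ptxas message.
    if [ "$count" -gt 0 ] && { [ "$major" -lt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -lt 5 ]; }; }; then
        echo "mismatch: .version $version but $count label-difference expression(s) need 7.5 or later"
        return 1
    fi
    echo "ok: .version $version, $count label-difference expression(s)"
    return 0
}
```

Assuming the --savec directory from the reproduction, it would be invoked as check_ptx_label_diff tmp_jacobi/chpl__gpu_ptx.s; a return status of 1 indicates the version/feature mismatch.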

Configuration Information

Output of chpl --version:

(base) milthorpe@whale:~/chapel$ chpl --version
chpl version 1.27.0 pre-release (fee983b5a4)
built with LLVM version 13.0.0
Copyright 2020-2022 Hewlett Packard Enterprise Development LP
Copyright 2004-2019 Cray Inc.
(See LICENSE file for more details)

Output of $CHPL_HOME/util/printchplenv --anonymize:

CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: gpu *
CHPL_COMM: none *
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none *
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: bundled +
CHPL_AUX_FILESYS: none

Back-end compiler and version, e.g. gcc --version or clang --version:

clang version 13.0.0 (git@github.com:milthorpe/chapel.git 298724cbcb3712d300a98d4734d6e233b87ad201)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/milthorpe/chapel/third-party/llvm/install/linux64-x86_64-gnu/bin

milthorpe commented 2 years ago

With CUDA 11.4, the error message is different, but the reason is the same:

# PTX to  object file
ptxas -m64 --gpu-name sm_62 --output-file tmp_jacobi/chpl__gpu_ptx.o tmp_jacobi/chpl__gpu_ptx.s
ptxas tmp_jacobi/chpl__gpu_ptx.s, line 61406; fatal   : Parsing error near '-': syntax error
ptxas fatal   : Ptx assembly aborted due to errors
error: PTX to  object file
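
Since the reported line number differs between the two CUDA versions (61439 and 61606 earlier, 61406 here), it can help to print the offending PTX lines directly from the ptxas output to confirm that the cause is the same. This is a rough sketch under stated assumptions: show_ptxas_error_lines is a hypothetical helper, it assumes diagnostics have the "ptxas <file>, line <N>; ..." shape seen in this issue, and it assumes the file path contains no spaces.

```shell
# Hypothetical helper (bash): for each ptxas diagnostic in a saved log,
# print the PTX source line it points at.
show_ptxas_error_lines() {
    local log_file="$1"
    # Extract "<file> <line>" pairs from lines shaped like
    # "ptxas <file>, line <N>; error   : ...". Lines without a
    # ", line <N>;" part (e.g. "ptxas fatal   : ...") are skipped.
    sed -nE 's/^ptxas (.+), line ([0-9]+);.*/\1 \2/p' "$log_file" |
    while read -r ptx_file line_no; do
        # Same lookup as the manual "sed -n '<N>p'" used in this issue.
        printf '%s:%s: %s\n' "$ptx_file" "$line_no" "$(sed -n "${line_no}p" "$ptx_file")"
    done
}
```

With the compiler output saved to a file (for example by appending 2>&1 | tee build.log to the chpl command, where build.log is an arbitrary name), show_ptxas_error_lines build.log prints one line per diagnostic.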

e-kayrakli commented 2 years ago

Thanks @milthorpe! I moved the related internal issue with my notes to https://github.com/chapel-lang/chapel/issues/19774.
