ParRes / Kernels

This is a set of simple programs that can be used to explore the features of a parallel platform.
https://groups.google.com/forum/#!forum/parallel-research-kernels

nvhpc offloading on RH8/p9+V100 system: supported? #575

Closed hattom closed 3 years ago

hattom commented 3 years ago

What type of issue is this?

If this is a bug report, please use the following template. Otherwise, please delete the rest of the template.

Where does this bug appear?

Check all that apply:

Operating system

What is the output of uname -a?

Linux login02 4.18.0-147.13.2.el8_1.ppc64le #1 SMP Wed May 13 15:23:36 UTC 2020 ppc64le ppc64le ppc64le GNU/Linux

Compiler

What is the output of ${COMPILER} -v or ${COMPILER} --version?

$ nvfortran -V

nvfortran 20.11-0 linuxpower target on Linuxpower 
NVIDIA Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

PRK build information

Please attach or inline make.defs.

cd common; cp make.defs{.nvhpc,}

Output showing problem

Shown for the first target only; make -k shows the same failure for every target.

$ cd FORTRAN; make target
nvfortran -DNVHPC -O2 -Wall  -DRADIUS=2 -DSTAR -DNVHPC -mp -mp -target=gpu -gpu=managed -Minfo=accel -DGPU_SCHEDULE="schedule(static,1)" stencil-openmp-target.F90 -o stencil-openmp-target
nvfortran-Fatal-/m100/prod/opt/compilers/hpc-sdk/2020/binary/Linux_ppc64le/20.11/compilers/bin/tools/fort2 TERMINATED by signal 11
Arguments to /m100/prod/opt/compilers/hpc-sdk/2020/binary/Linux_ppc64le/20.11/compilers/bin/tools/fort2
/m100/prod/opt/compilers/hpc-sdk/2020/binary/Linux_ppc64le/20.11/compilers/bin/tools/fort2 /tmp/nvfortran5clfRT7_C8RS.ilm -fn stencil-openmp-target.F90 -opt 2 -terse 1 -inform warn -x 51 0x20 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 119 0x40000000 -x 19 0x400000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -x 56 0x40 -vect 56 -y 34 16 -x 34 0x8 -y 19 8 -y 35 0 -x 42 0x30 -x 39 0x80 -x 34 0x400000 -x 149 1 -x 150 1 -x 120 0x1000 -x 124 0x1400 -y 15 2 -x 57 0x3b0000 -x 58 0x48000000 -x 49 0x100 -astype 0 -x 121 1 -x 183 4 -x 121 0x800 -x 8 0x40000000 -x 70 0x40000000 -x 54 0x10 -x 180 0x4000000 -x 233 0x1 -x 194 0x20000000 -x 233 0x100 -x 233 0x400 -x 180 0x4000000 -x 233 0x1 -x 194 0x20000000 -x 233 0x100 -x 233 0x400 -x 198 0x100 -x 233 0x10000 -x 249 100 -x 68 0x20 -x 8 0x40000000 -x 56 0x10 -x 39 4 -x 68 0x1 -x 49 0x40000000 -x 26 0x10 -x 26 1 -x 56 0x4000 -x 85 0x2000 -x 85 0x4000 -x 164 0x800000 -x 124 1 -accel tesla -accel tesla -x 180 0x4000400 -x 121 0xc00 -x 186 0x80 -x 163 0x1 -x 186 0x80000 -cudaver 11000 -x 194 0x40000 -x 176 0x100 -cudacap 70 -x 180 0x4000400 -x 121 0xc00 -x 186 0x80 -x 163 0x1 -x 186 0x80000 -cudaver 11000 -x 194 0x40000 -x 176 0x100 -cudacap 70 -cudaroot /m100/prod/opt/compilers/hpc-sdk/2020/binary/Linux_ppc64le/20.11/cuda/11.0 -x 176 0x100 -cudacap 70 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -cudaroot /m100/prod/opt/compilers/hpc-sdk/2020/binary/Linux_ppc64le/20.11/cuda/11.0 -x 9 1 -x 72 0x1 -x 136 0x11 -x 37 0x480000 -mp -x 69 0x200 -x 69 0x400 -x 69 2 -mp -x 69 0x200 -x 69 0x400 -x 69 2 -x 194 0x20000000 -x 198 0x100 -x 0 0x1000000 -x 2 0x100000 -x 0 0x2000000 -x 161 16384 -x 162 16384 -cci /tmp/nvfortranbclfdB5wBKuo.cci -cmdline '+nvfortran stencil-openmp-target.F90 -DNVHPC -O2 -Mvect=simd -Wall -DRADIUS=2 -DSTAR -DNVHPC -mp -mp -target=gpu -gpu=managed -Minfo=accel -DGPU_SCHEDULE=schedule(static,1) -o stencil-openmp-target' -stbfile /tmp/nvfortranjclfBZKYhVw1.stb -asm 
/tmp/nvfortranPclf7I7aDmQk.ll
make: *** [Makefile:127: stencil-openmp-target] Error 127

Is OpenMP target offloading known not to work with NVHPC 20.11 on POWER9?

If the output is short, please inline it here. Otherwise, please pipe it to a plain text file and attach that file. Note that you may need to use $command > $log 2>&1 to capture the error messages.

Please do not attach screenshots of your terminal.
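The capture step the template asks for can be sketched as below. This is only an illustration: `capture` is a hypothetical helper name and `build.log` a made-up file name, neither of which comes from the PRK repository.

```shell
# Hypothetical helper: run a command and send both stdout and stderr
# to one plain text file, so compiler crash messages are not lost.
capture() {
    # usage: capture LOGFILE COMMAND [ARGS...]
    capture_log=$1
    shift
    # Redirect stdout to the log first, then point stderr at the same place.
    "$@" > "$capture_log" 2>&1
}

# Example for this report (commented out so the sketch runs anywhere):
# capture build.log make -k
```

The order of the redirections matters: `2>&1` has to come after `> file`, otherwise stderr still goes to the terminal.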

hattom commented 3 years ago

Based on the documentation, I suspect this version of NVHPC is too old for OpenMP offloading. I didn't bisect, but support seems to have arrived somewhere between 20.11 and 21.3.

jeffhammond commented 3 years ago

I'm only using NVHPC 21.2 and later. If those work, I'm not inclined to look backwards.

hattom commented 3 years ago

No problem. At the time I opened the issue, I thought 20.11 was also supported. It seems 21.1 brought OpenMP GPU support.

hattom commented 3 years ago

For future reference: 20.11 and 21.2 do not seem to have good support for OpenMP target offloading on Power targets. 21.3 seems a lot better.
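For anyone who wants to guard a build against this, here is a rough sketch of a version check. The function names `nvhpc_version` and `version_at_least` are hypothetical, and the 21.3 threshold is only the observation from this thread on Power, not something taken from NVIDIA's documentation. The banner text is the one pasted in the report above.

```shell
# Hypothetical helper: read an `nvfortran -V` banner on stdin and
# print the release number, e.g. "20.11".
nvhpc_version() {
    sed -n 's/^nvfortran \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if release HAVE is at least release WANT.
# usage: version_at_least HAVE WANT   (both in YY.M form, e.g. 21.3)
version_at_least() {
    val_have_major=${1%%.*}; val_have_minor=${1#*.}
    val_want_major=${2%%.*}; val_want_minor=${2#*.}
    # Drop one leading zero so "03" compares as 3.
    val_have_minor=${val_have_minor#0}; val_have_minor=${val_have_minor:-0}
    val_want_minor=${val_want_minor#0}; val_want_minor=${val_want_minor:-0}
    if [ "$val_have_major" -ne "$val_want_major" ]; then
        [ "$val_have_major" -gt "$val_want_major" ]
    else
        [ "$val_have_minor" -ge "$val_want_minor" ]
    fi
}

# Banner copied from the report; on a real system use: nvfortran -V 2>&1
banner='
nvfortran 20.11-0 linuxpower target on Linuxpower
NVIDIA Compilers and Tools'

ver=$(printf '%s\n' "$banner" | nvhpc_version)
# 21.3 is an assumption based on this thread, not an official minimum.
if version_at_least "$ver" 21.3; then
    echo "NVHPC $ver: worth trying the OpenMP target builds"
else
    echo "NVHPC $ver: likely too old for OpenMP target offload on Power"
fi
```

A check like this only tells you which release is installed. It does not prove that offloading works, so a small test build is still the real answer.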

jeffhammond commented 3 years ago

Not a huge surprise. OpenMP target support is a new feature and I would expect continuous, positive change on this front.