tan2 / geoflac-old

Code for lithospheric scale geodynamics

Error while building for OpenACC support with pgf90 21.3 #5

Status: Closed (closed by echoi 3 years ago)

echoi commented 3 years ago

Hi, I'm reporting an issue I ran into while trying to build geoflac with OpenACC support. I am using pgf90 from NVIDIA HPC SDK 21.3. A working recipe for building geoflac with OpenACC would be much appreciated! The build output and compiler version follow.

(base) echoi@ptah:~/opt/geoflac/src$ make
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c myrandom_mod.f90
myrandom:
      9, Generating acc routine seq
         Generating Tesla code
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c params.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c arrays.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c phases.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c marker_data.f90
allocate_markers:
     26, Generating update device(max_markers)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$p (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$sd (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$p (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$sd (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$p (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$sd (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$p (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$sd (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$p (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$sd (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$p (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$sd (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$p (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$sd (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$p (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$sd (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 95)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 95)
add_marker:
     41, Generating acc routine seq
         Generating Tesla code
  0 inform,   0 warnings,  22 severes, 0 fatal for add_marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 129)
newphase2marker:
    115, Loop unrolled 2 times
    121, Memory zero idiom, loop replaced by call to __c_mzero8
  0 inform,   0 warnings,   1 severes, 0 fatal for newphase2marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$p (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$sd (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$p (marker_data.f90: 150)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 150)
count_phase_ratio:
    132, Generating acc routine seq
         Generating Tesla code
    142, Memory zero idiom, loop replaced by call to __c_mzero4
    143, Loop not vectorized: data dependency
    149, Generated vector simd code for the loop
  0 inform,   0 warnings,  10 severes, 0 fatal for count_phase_ratio
make: *** [Makefile:216: marker_data.o] Error 2

(base) echoi@ptah:~/opt/geoflac/src$ pgf90 --version

pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

tan2 commented 3 years ago

It seems that you are using different CFLAGS. The OpenACC code uses NVIDIA's "managed memory" and must be compiled with the flag "-ta=tesla:managed" or something equivalent. Can you add this flag?

echoi commented 3 years ago

The original error was fixed by adding the option -ta=tesla:cc60,managed. However, another error occurred while compiling bc_update.f90:

pgf90 -g -acc=gpu -Mcuda -Minfo=accel -ta=tesla:cc60,managed -O2 -Minfo=all -c bc_update.f90
bc_update:
     18, Generating implicit copyout(force(:,:,:)) [if not already present]
     19, Loop is parallelizable
         Generating Tesla code
         19,   ! blockidx%x threadidx%x auto-collapsed
             !$acc loop gang, vector(128) collapse(3) ! blockidx%x threadidx%x
     23, Accelerator serial kernel generated
         Generating Tesla code
         25, !$acc do seq
         57, !$acc do seq
     23, Generating implicit copyin(force(:,:,1:2)) [if not already present]
         Generating implicit copyout(force(:,nx,1:2)) [if not already present]
         Generating implicit copyin(iphase(:,:)) [if not already present]
         Generating implicit copy(j) [if not already present]
         Generating implicit copyin(cord(:,:,1:2)) [if not already present]
     26, Accelerator restriction: induction variable live-out from loop: j
     28, Accelerator restriction: induction variable live-out from loop: j
     29, Accelerator restriction: induction variable live-out from loop: j
     35, Accelerator restriction: induction variable live-out from loop: j
     36, Accelerator restriction: induction variable live-out from loop: j
     40, Accelerator restriction: induction variable live-out from loop: j
     41, Accelerator restriction: induction variable live-out from loop: j
     45, Accelerator restriction: induction variable live-out from loop: j
     46, Accelerator restriction: induction variable live-out from loop: j
     49, Accelerator restriction: induction variable live-out from loop: j
     58, Accelerator restriction: induction variable live-out from loop: j
     60, Accelerator restriction: induction variable live-out from loop: j
     61, Accelerator restriction: induction variable live-out from loop: j
     68, Accelerator restriction: induction variable live-out from loop: j
     69, Accelerator restriction: induction variable live-out from loop: j
     74, Accelerator restriction: induction variable live-out from loop: j
     75, Accelerator restriction: induction variable live-out from loop: j
     78, Accelerator restriction: induction variable live-out from loop: j
     79, Accelerator restriction: induction variable live-out from loop: j
     82, Accelerator restriction: induction variable live-out from loop: j
     90, Generating implicit copyin(cord(:,:,1:2),nopbou(1:nopbmax,1:4)) [if not already present]
         Generating implicit copy(force(:,:,:)) [if not already present]
         Generating implicit copyin(bcstress(1:nopbmax,1:2)) [if not already present]
     91, Complex loop carried dependence of force prevents parallelization
         Loop carried dependence due to exposed use of force(:,:,:) prevents parallelization
         Accelerator serial kernel generated
         Generating Tesla code
         91, !$acc loop seq
nvvmCompileProgram error 9: NVVM_ERROR_COMPILATION.
Error: /tmp/pgaccA26soObjQ4KE.gpu (1041, 4): parse multiple definition of local value named 'li1167_tca0'
ptxas /tmp/pgacc626sU5dvkTRA.ptx, line 1; fatal   : Missing .version directive at start of file '/tmp/pgacc626sU5dvkTRA.ptx'
ptxas fatal   : Ptx assembly aborted due to errors
NVFORTRAN-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (bc_update.f90: 90)
NVFORTRAN/x86-64 Linux 21.3-0: compilation aborted
make: *** [Makefile:219: bc_update.o] Error 2

According to https://forums.developer.nvidia.com/t/nv-21-3-fails-to-compile-my-openacc-code/176435, NVIDIA HPC SDK 21.3 has several known problems, including this one. The workaround suggested in that NVIDIA forum thread, adding -Mx,231,0x01 to the compile options, did fix the error.
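Since the workaround is specific to 21.3, one option is to add it only when the compiler reports that release. The following is a minimal sketch under stated assumptions, not part of geoflac's Makefile: `workaround_flags` is a hypothetical helper, and it takes the version banner as an argument so it can be tried without the compiler installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of geoflac): print the 21.3 workaround flag
# from the NVIDIA forum thread only when the version banner reports 21.3.
workaround_flags() {
    # $1 = a banner line such as the one printed by `pgf90 --version`
    case "$1" in
        *" 21.3-"*) printf '%s\n' '-Mx,231,0x01' ;;
        *)          printf '\n' ;;
    esac
}

# Banner copied from the compiler output quoted in this thread.
banner='pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell'
echo "extra flags: $(workaround_flags "$banner")"
```

In a real build the banner could presumably be captured with something like `pgf90 --version | grep nvfortran` and the result appended to the compile flags; that wiring is an assumption and would need to be adapted to the actual Makefile.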

The build now completes on my machine (see below for specs), but the executable crashes with a segmentation fault when run with examples/subduction.inp.

$ nvidia-smi
Thu May 20 09:53:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:04:00.0  On |                  N/A |
|  0%   38C    P8     9W / 215W |     78MiB /  8111MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

$ pgf90 -V

pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

echoi commented 3 years ago

The issue has been resolved: I tried NVIDIA HPC SDK 21.2 and it works. For my GTX 1080 GPU, I only had to change cc70 to cc60 in the compile and link options.
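The cc70-to-cc60 change can be sketched as a small text substitution. This is only an illustration: `retarget_cc` is a hypothetical helper, and the flags string is an assumption built from the compile line logged earlier in this thread with cc70 put back in place of cc60; the real Makefile variables may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of geoflac): rewrite the compute-capability
# token (e.g. cc70 -> cc60) in a string of compiler flags.
retarget_cc() {
    # $1 = flags, $2 = old capability (e.g. 70), $3 = new capability (e.g. 60)
    printf '%s\n' "$1" | sed "s/cc$2/cc$3/g"
}

# Assumed default flags, modelled on the compile line shown in this thread.
flags='-g -acc=gpu -Mcuda -Minfo=accel -ta=tesla:cc70,managed -O2 -Minfo=all'
retarget_cc "$flags" 70 60
```

The same substitution has to reach both the compile and the link options, since the thread reports changing cc70 in both places.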

Although this is a separate issue, I report here that the OpenMP version built with pgf90 crashes during remeshing on my machine. When built with gfortran, however, the OpenMP version works fine.