Closed: echoi closed this issue 3 years ago
It seems that you are using different CFLAGS. The OpenACC code uses NVIDIA's "managed memory" and must be compiled with the flag "-ta=manage" or something equivalent. Can you add this flag?
On Thu, May 20, 2021 at 9:18 PM Eunseo Choi @.***> wrote:
Hi, I'm reporting an issue I ran into while trying to build geoflac with OpenACC support. I got pgf90 as part of NVIDIA HPC SDK 21.3. A working recipe for building geoflac with OpenACC would be much appreciated!
(base) @.:~/opt/geoflac/src$ make
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c myrandom_mod.f90
myrandom:
9, Generating acc routine seq
Generating Tesla code
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c params.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c arrays.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c phases.f90
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all -c marker_data.f90
allocate_markers:
26, Generating update device(max_markers)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 72)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 85)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$p (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_x$sd (marker_data.f90: 87)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$p (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_y$sd (marker_data.f90: 88)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$p (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_dead$sd (marker_data.f90: 89)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$p (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id$sd (marker_data.f90: 90)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$p (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a1$sd (marker_data.f90: 91)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$p (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_a2$sd (marker_data.f90: 92)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$p (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_age$sd (marker_data.f90: 93)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$p (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_ntriag$sd (marker_data.f90: 94)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 95)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 95)
add_marker:
41, Generating acc routine seq
Generating Tesla code
0 inform, 0 warnings, 22 severes, 0 fatal for add_marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 129)
newphase2marker:
115, Loop unrolled 2 times
121, Memory zero idiom, loop replaced by call to __c_mzero8
0 inform, 0 warnings, 1 severes, 0 fatal for newphase2marker
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$p (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - nmark_elem$sd (marker_data.f90: 140)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$p (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - iphase$sd (marker_data.f90: 163)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$p (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_id_elem$sd (marker_data.f90: 144)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$p (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - mark_phase$sd (marker_data.f90: 145)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$p (marker_data.f90: 150)
NVFORTRAN-S-1054-Module variables used in acc routine need to be in acc declare create() - phase_ratio$sd (marker_data.f90: 150)
count_phase_ratio:
132, Generating acc routine seq
Generating Tesla code
142, Memory zero idiom, loop replaced by call to __c_mzero4
143, Loop not vectorized: data dependency
149, Generated vector simd code for the loop
0 inform, 0 warnings, 10 severes, 0 fatal for count_phase_ratio
make: [Makefile:216: marker_data.o] Error 2
(base) @.:~/opt/geoflac/src$ pgif90 --version
pgif90: command not found
(base) @.:~/opt/geoflac/src$ pgf90 --version
pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
The original error was fixed by adding the option -ta=tesla:cc60,managed. However, another error occurred while compiling bc_update.f90:
pgf90 -g -acc=gpu -Mcuda -Minfo=accel -ta=tesla:cc60,managed -O2 -Minfo=all -c bc_update.f90
bc_update:
18, Generating implicit copyout(force(:,:,:)) [if not already present]
19, Loop is parallelizable
Generating Tesla code
19, ! blockidx%x threadidx%x auto-collapsed
!$acc loop gang, vector(128) collapse(3) ! blockidx%x threadidx%x
23, Accelerator serial kernel generated
Generating Tesla code
25, !$acc do seq
57, !$acc do seq
23, Generating implicit copyin(force(:,:,1:2)) [if not already present]
Generating implicit copyout(force(:,nx,1:2)) [if not already present]
Generating implicit copyin(iphase(:,:)) [if not already present]
Generating implicit copy(j) [if not already present]
Generating implicit copyin(cord(:,:,1:2)) [if not already present]
26, Accelerator restriction: induction variable live-out from loop: j
28, Accelerator restriction: induction variable live-out from loop: j
29, Accelerator restriction: induction variable live-out from loop: j
35, Accelerator restriction: induction variable live-out from loop: j
36, Accelerator restriction: induction variable live-out from loop: j
40, Accelerator restriction: induction variable live-out from loop: j
41, Accelerator restriction: induction variable live-out from loop: j
45, Accelerator restriction: induction variable live-out from loop: j
46, Accelerator restriction: induction variable live-out from loop: j
49, Accelerator restriction: induction variable live-out from loop: j
58, Accelerator restriction: induction variable live-out from loop: j
60, Accelerator restriction: induction variable live-out from loop: j
61, Accelerator restriction: induction variable live-out from loop: j
68, Accelerator restriction: induction variable live-out from loop: j
69, Accelerator restriction: induction variable live-out from loop: j
74, Accelerator restriction: induction variable live-out from loop: j
75, Accelerator restriction: induction variable live-out from loop: j
78, Accelerator restriction: induction variable live-out from loop: j
79, Accelerator restriction: induction variable live-out from loop: j
82, Accelerator restriction: induction variable live-out from loop: j
90, Generating implicit copyin(cord(:,:,1:2),nopbou(1:nopbmax,1:4)) [if not already present]
Generating implicit copy(force(:,:,:)) [if not already present]
Generating implicit copyin(bcstress(1:nopbmax,1:2)) [if not already present]
91, Complex loop carried dependence of force prevents parallelization
Loop carried dependence due to exposed use of force(:,:,:) prevents parallelization
Accelerator serial kernel generated
Generating Tesla code
91, !$acc loop seq
nvvmCompileProgram error 9: NVVM_ERROR_COMPILATION.
Error: /tmp/pgaccA26soObjQ4KE.gpu (1041, 4): parse multiple definition of local value named 'li1167_tca0'
ptxas /tmp/pgacc626sU5dvkTRA.ptx, line 1; fatal : Missing .version directive at start of file '/tmp/pgacc626sU5dvkTRA.ptx'
ptxas fatal : Ptx assembly aborted due to errors
NVFORTRAN-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (bc_update.f90: 90)
NVFORTRAN/x86-64 Linux 21.3-0: compilation aborted
make: *** [Makefile:219: bc_update.o] Error 2
According to https://forums.developer.nvidia.com/t/nv-21-3-fails-to-compile-my-openacc-code/176435, NVIDIA HPC SDK 21.3 has several problems, including this one. The workaround suggested on the NVIDIA forum, adding the option -Mx,231,0x01, did fix the error.
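For anyone trying to reproduce this, here is a rough sketch of how the flags mentioned so far fit together on one compile line. The variable names are mine and are not taken from geoflac's Makefile, and the flag order is arbitrary. The script only prints the command, so it runs without pgf90 installed.

```shell
#!/bin/sh
# Sketch only: variable names are illustrative, not geoflac's actual Makefile variables.

# Flags that appear in the build log in this thread.
BASE_FLAGS="-g -acc=gpu -Mcuda -Minfo=accel -O2 -Minfo=all"

# Managed-memory option reported above; cc60 is the value used here for a GTX 1080.
MANAGED_FLAGS="-ta=tesla:cc60,managed"

# Workaround from the NVIDIA forum post for the SDK 21.3 device-compiler failure.
WORKAROUND_FLAGS="-Mx,231,0x01"

FFLAGS="$BASE_FLAGS $MANAGED_FLAGS $WORKAROUND_FLAGS"

# Print the resulting compile command rather than executing it.
echo "pgf90 $FFLAGS -c bc_update.f90"
```

Where exactly these flags belong in the real Makefile is something to check against the repository; the thread only says they were added to the compile options.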
The build now completes on my machine (see below for specs), but the executable crashes with a segmentation fault when run with examples/subduction.inp.
$ nvidia-smi
Thu May 20 09:53:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 00000000:04:00.0 On | N/A |
| 0% 38C P8 9W / 215W | 78MiB / 8111MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
$ pgf90 -V
pgf90 (aka nvfortran) 21.3-0 LLVM 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
The issue has been resolved.
NVIDIA HPC SDK 21.2 was tried and found to work. For my GTX 1080 GPU, I only had to change cc70 to cc60 in the compile and linking options.
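In case it helps others with a different GPU, the cc70 to cc60 change can be scripted with sed. The sketch below runs on a made-up two-line stand-in, not geoflac's real Makefile, so the FFLAGS and LDFLAGS lines are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: the file contents below are a hypothetical stand-in for the real Makefile.
tmp_makefile=$(mktemp)
printf 'FFLAGS = -acc=gpu -ta=tesla:cc70,managed\nLDFLAGS = -acc=gpu -ta=tesla:cc70,managed\n' > "$tmp_makefile"

# Replace every cc70 target with cc60, covering both compile and link options.
patched=$(sed 's/cc70/cc60/g' "$tmp_makefile")
rm -f "$tmp_makefile"

# Show the patched lines.
echo "$patched"
```

On the real file, the same substitution could be applied in place after checking which lines actually contain cc70.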
Although this is a separate issue, I report here that the OpenMP version built with pgf90 crashes during remeshing on my machine. When built with gfortran, however, the OpenMP version works fine.