Closed owainkenwayucl closed 1 year ago
I can't even replicate the problem with the 22.9 compiler - reinstalling my local install worked:
Myriad [login13] nvidia-hpc-sdk :) > less /home/uccaoke/Applications/nvhpc/2022_229/nvidia-2022-22.9/Linux_x86_64/22.9/compilers/bin/localrc
set LFC=-lgfortran;
set LDSO=/lib64/ld-linux-x86-64.so.2;
set GCCDIR=/usr/lib/gcc/x86_64-redhat-linux/4.8.5;
set G77DIR=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/;
set OEM_INFO=64-bit target on x86-64 Linux $INFOTPVAL;
set GNUATOMIC=-latomic;
set GCCINC= /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/compilers/extras/qd/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/cuda/11.7/extras/CUPTI/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/nvshmem/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/nccl/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/mpi/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/math_libs/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/compilers/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/cuda/include /shared/ucl/apps/emacs/28.1/include /shared/ucl/apps/giflib/5.1.1/gnu-4.9.2/include /shared/ucl/apps/apr-util/1.6.1/include /shared/ucl/apps/apr/1.7.0/include /shared/ucl/apps/flex/2.5.39/gnu-4.9.2/include /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include /usr/local/include /usr/include;
set GPPDIR= /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/compilers/extras/qd/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/cuda/11.7/extras/CUPTI/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/nvshmem/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/nccl/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/comm_libs/mpi/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/math_libs/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/compilers/include /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/cuda/include /shared/ucl/apps/emacs/28.1/include /shared/ucl/apps/giflib/5.1.1/gnu-4.9.2/include /shared/ucl/apps/apr-util/1.6.1/include /shared/ucl/apps/apr/1.7.0/include /shared/ucl/apps/flex/2.5.39/gnu-4.9.2/include /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5 /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/x86_64-redhat-linux /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/backward /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include /usr/local/include /usr/include;
set NUMALIBNAME=-lnuma ;
set LOCALRC=YES;
set EXTENSION=__extension__=;
set LC=-lgcc -lc $if(-Bstatic,-lgcc_eh, -lgcc_s);
set DEFCUDAVERSION=10.2;
set DEFSTDPARCOMPUTECAP=;
# GLIBC version 2.17
# GCC version 4.8.5
set GCCVERSION=40805;
set LIBNCURSES=YES;
export PGI=$COMPBASE;
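For comparing a working localrc like the one above against a suspect one, a small parser can help. The following is a minimal sketch, not part of the original thread: it assumes the file only uses the `set NAME=value;` form shown above, and the function names (`parse_localrc`, `include_dirs`, `missing_dirs`) are hypothetical helpers invented for illustration.

```python
import os
import re

# Matches lines of the form `set NAME=value;` as in the localrc dump above.
# Comment lines (`# ...`) and `export ...;` lines are deliberately ignored.
SET_LINE = re.compile(r"^\s*set\s+(\w+)=(.*?);\s*$")


def parse_localrc(text):
    """Return a dict of NAME -> raw value for every `set NAME=value;` line."""
    settings = {}
    for line in text.splitlines():
        match = SET_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def include_dirs(settings, key):
    """Split a whitespace-separated directory list such as GCCINC or GPPDIR."""
    return settings.get(key, "").split()


def missing_dirs(settings, key):
    """Return the entries under `key` that are not existing directories."""
    return [d for d in include_dirs(settings, key) if not os.path.isdir(d)]
```

Running `missing_dirs(parse_localrc(open(path).read()), "GPPDIR")` on each install (with `path` pointing at the relevant localrc) would flag entries that are not real directories, which may be a quick hint that the include list was generated incorrectly.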
Maybe it's something that changed in ccspapp's environment?
Re-running the script as ccspapp fixes it.
Computers.
I am re-running the install on all clusters (already fixed Myriad + Kathleen).
Fixed on all clusters.
The C++ compiler nvc++ is somehow misconfigured in 22.9 (it's fine in previous versions) and as a result it cannot create binaries.
The root cause seems to be it setting:
in /shared/ucl/apps/nvhpc/2022_229/Linux_x86_64/22.9/compilers/bin/localrc rather than a list of directories for includes like earlier versions do:
Steps
1: Work out if this can be fixed when we trigger the installer and fix the scripts. If it can't, we have to patch localrc ourselves.
2: Replicate with the newest Nvidia HPC toolkit to see if this is a bug that has been fixed.
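To replicate the problem against a given install, a small smoke test that tries to build a trivial C++ program may be enough. This is a sketch under assumptions: `can_build_binary` is a hypothetical helper, it assumes the compiler under test is on PATH (for example after loading the relevant module), and it only checks that compiling and linking succeed, not that the binary runs.

```python
import os
import shutil
import subprocess
import tempfile

# A tiny program that pulls in a standard C++ header, so a broken
# C++ include path should make the build fail.
HELLO_CPP = '#include <iostream>\nint main() { std::cout << "ok\\n"; return 0; }\n'


def can_build_binary(compiler="nvc++"):
    """Return True if `compiler` compiles and links a trivial C++ program."""
    if shutil.which(compiler) is None:
        # Compiler not found on PATH: report failure rather than raising.
        return False
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "hello.cpp")
        binary = os.path.join(workdir, "hello")
        with open(source, "w") as handle:
            handle.write(HELLO_CPP)
        result = subprocess.run(
            [compiler, source, "-o", binary],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0 and os.path.isfile(binary)
```

Calling `can_build_binary()` once per toolkit version would give a quick yes/no on whether the misconfiguration is still present in a newer release.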