Open zimb3l-priv opened 10 months ago
Hello,
For the compilation error you mention above, you may try unsetting the NVHPC_CUDA_HOME variable and letting the compiler use its default value. Even if the standalone CUDA 12.2 and the version bundled with NVHPC match, we cannot be sure the file structure is identical.
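A minimal sketch of that first suggestion, assuming the variable was exported in the current shell (if it is set in a sourced environment file or a module, it would need to be removed there as well). The helper names `previous_cuda_home` and `cuda_home_state` are only illustrative:

```shell
# Record the current value for reference before dropping it.
previous_cuda_home="${NVHPC_CUDA_HOME:-<unset>}"
echo "NVHPC_CUDA_HOME was: ${previous_cuda_home}"

# Remove the override so the NVHPC compilers fall back to their default CUDA location.
unset NVHPC_CUDA_HOME

# Confirm the variable is really gone from this shell.
if [ -z "${NVHPC_CUDA_HOME+x}" ]; then
    cuda_home_state="unset"
else
    cuda_home_state="set"
fi
echo "NVHPC_CUDA_HOME is now: ${cuda_home_state}"
```

The install script would then need to be started from this same shell so that it inherits the cleaned environment.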
As for the global problem, I suggest you use an NVHPC package at version 22.7 or earlier to compile and run. The working configurations are listed in the Prerequisites section of readme.md. A compiler bug that shows up at run time prevents us from running with any later version. Luckily, the CUDA driver is backward compatible. You can even switch the CUDA package to version 11.0 before compiling. This will not impact performance.
Ok, I switched to HPC SDK 22.7 and set the CUDA version in the install script to 11.7, as that's the one in my HPC SDK folder as well.
I also installed GCC 11.2 and set GNUROOT=/opt/rh/gcc-toolset-11/root/usr/bin/ before executing the install script, so my prerequisites are now HPC-SDK 22.7 + cuda11.7 + GNU-11.2.1, which is compliant with what's written in the recommended section.
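To double-check that the shell really picks up the toolchain described above before launching the install script, something like the following could be run first. This is only a sanity-check sketch: `report_tool` is a hypothetical helper, and it assumes the compilers are expected to be on PATH.

```shell
report_tool() {
    # Print the first non-empty line of "<tool> --version",
    # or say that the tool cannot be found on PATH.
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | grep -m1 .)"
    else
        printf '%s: not found on PATH\n' "$tool"
    fi
}

# Compilers involved in the build shown in this thread.
for tool in nvfortran mpif90 gcc; do
    report_tool "$tool"
done
```

If the reported versions differ from the intended HPC-SDK and GNU versions, the environment is likely still pointing at another installation.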
Unfortunately the compilation still crashes with the following output:
.
.
.
mpif90 -cpp -traceback -g -fast -Mdalign -Minline=maxsize:340 -r8 -cuda -gpu=cc60,cc70,cc80,cc86,cuda11.7,unroll -c nblistcu.f
mpif90 -cpp -traceback -g -fast -Mdalign -Minline=maxsize:340 -r8 -cuda -gpu=cc60,cc70,cc80,cc86,cuda11.7,unroll -c pmestuffcu.f
mpif90 -cpp -traceback -g -fast -Mdalign -Minline=maxsize:340 -r8 -cuda -gpu=cc60,cc70,cc80,cc86,cuda11.7,unroll -c tmatxb_pmecu.f
mpif90 -cpp -traceback -g -fast -Mdalign -Minline=maxsize:340 -r8 -cuda -gpu=cc60,cc70,cc80,cc86,cuda11.7,unroll -c tmatxb_pme_cpen.cu.f
NVFORTRAN-F-0000-Internal compiler error. readin_func: too many ilms 2413 (echgtrncu.f: 87)
NVFORTRAN/x86-64 Linux 22.7-0: compilation aborted
make[1]: *** [Makefile:570: echgtrncu.o] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/software/Tinker-HP/tinker-hp/GPU/build0'
make: *** [Makefile:480: libtinker] Error 2
------ WARNING ------
Something went wrong during compilation procedure "
Please Fix the issue and run ci/install.sh again"
---------------------
The same happened when I previously tried GCC 9.2.1 (so HPC-SDK 22.7 + cuda11.7 + GNU-9.2.1), which is why I installed 11.2.1 in the first place.
Not sure what else I could try here... Maybe getting HPC SDK 22.2?
Since the only CUDA versions available on my system are 11.2, 12.1 and 12.2, I tried to install the GPU version with the newest of them. If this is already a mistake, or if this is a problem with the NVIDIA HPC SDK rather than Tinker-HP, ignore the rest and let me know. Otherwise:
Description: When attempting to compile Tinker-HP with CUDA version 12.2 and NVIDIA HPC SDK version 23.7, compilation errors are encountered related to standard C++ math functions not being recognized within the std namespace.
Environment:
Steps to Reproduce:
Expected Behavior: The compilation should recognize standard math functions from the C++ standard library and compile without errors.
Actual Behavior: The following errors are displayed during the compilation process:
(Additional similar errors for other math functions like cosh, atan, atan2, tan, tanh, etc.)
Additional Information:
I also created a file to source with a couple of paths to ensure Tinker-HP uses the right ones during compilations. Maybe someone sees an error here:
Attempted Fixes:
Request: Assistance is requested to resolve the compilation issues related to the standard C++ library functions in CUDA 12.2 headers when using the NVIDIA HPC SDK. Any known fixes, patches, or suggestions to bypass these errors would be greatly appreciated.
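One way to narrow down the std namespace errors described above is to compile a tiny standalone file that uses the same math functions, outside of the Tinker-HP build. The sketch below is an assumption-laden probe, not part of the project: the file name `std_math_check.cpp` and the scratch directory are made up, and the compiler defaults to `g++` unless `CXX` is set (for example `CXX=nvc++`, if that is on PATH).

```shell
# Hypothetical scratch location and file name; nothing here comes from the Tinker-HP build.
workdir="$(mktemp -d)"
src="${workdir}/std_math_check.cpp"

cat > "$src" <<'EOF'
#include <cmath>
#include <cstdio>

int main() {
    // A few of the functions named in the report.
    double x = 0.5;
    std::printf("%f %f %f %f %f\n",
                std::cosh(x), std::atan(x), std::atan2(x, 1.0),
                std::tan(x), std::tanh(x));
    return 0;
}
EOF

# Compiler to probe; override with CXX=... to test a different one.
cxx="${CXX:-g++}"
if command -v "$cxx" >/dev/null 2>&1; then
    if "$cxx" "$src" -o "${workdir}/std_math_check"; then
        echo "${cxx}: std:: math functions resolved"
    else
        echo "${cxx}: compilation failed, see errors above"
    fi
else
    echo "${cxx}: not found on PATH, source left at ${src}"
fi
```

If this file compiles with the host compiler but fails with the NVHPC or CUDA compiler, that would point at the header setup of the toolchain rather than at the Tinker-HP sources.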