UCL-RITS / rcps-buildscripts

Scripts to automate package builds on RC Platforms
MIT License

Install Request: Intel OneAPI compiler 2024.2 and MPI #581

Open heatherkellyucl opened 1 month ago

heatherkellyucl commented 1 month ago

IN:06711429

Requested to be able to compile new NWChem.

Was asked about known good compilers:

I have used Intel Compilers 2022.2.1 and Intel MPI 2021.7.1 successfully before, although this is still a little on the older side now.

I'd rather go right for the ones in OneAPI 2024.2 which is the current release, if possible.

There is the question of whether FC, F77 and F90 should point to ifort or ifx. Spack has now gone for ifx as the default Fortran compiler in its wrappers. It sounds like NWChem is fine with ifx, but not everything works with it yet.

I am wondering if we should have two modulefiles for this install, with one in beta-modules (for genuine beta reasons) that sets the Fortran compiler environment variables to ifx. (They can always be set after loading the module in either case, as sketched below, and build systems may choose differently anyway.)
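A minimal sketch of that per-session override (the module name here is an assumption; use whichever of the two proposed modules is loaded):

module load compilers/intel/2024.2
# point the usual Fortran variables at ifx instead of ifort
export FC=ifx F77=ifx F90=ifx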

(Note: NWChem itself will need https://nwchemgit.github.io/Compiling-NWChem.html#how-to-commodity-clusters-with-intel-omni-path for OmniPath with new GlobalArrays).
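For later reference, NWChem selects the GlobalArrays/ARMCI transport at build time via the ARMCI_NETWORK environment variable; the exact value for Omni-Path should be taken from the linked page (the value below is an unverified assumption, not a quote from it):

# unverified assumption: check the linked NWChem page for the actual Omni-Path setting
export ARMCI_NETWORK=MPI-PR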

heatherkellyucl commented 1 month ago

Confirmed to go with 2024.2.

heatherkellyucl commented 1 month ago

Base installer:

This machine uses operating system "Red Hat Enterprise Linux version 7". Compatibility issues may occur.
Installation can continue; however, product functionality may not meet expectations because this product is untested on this operating system. Suggestion: Check the Release Notes for a list of supported operating systems and install this product on a compliant system.
Libnotify package is not installed
Intel® VTune(TM) Profiler requires Libnotify library for graphical user interface, it can be installed with
sudo apt-get install libnotify4 on Ubuntu / Debian
sudo zypper install libnotify4 on SUSE
sudo dnf install libnotify on CentOS / RHEL / Fedora
Linux* OS kernel headers not found.
The installer cannot detect the kernel source directory for OS kernel version 3.10.0-1160.53.1.el7.x86_64. It’s required for VTune Profiler Sampling drivers to be built and loaded automatically.
To install kernel headers, execute one of the following commands specific to your operating system:
Fedora / Red Hat Enterprise Linux:
On a system with the default kernel, install the kernel-devel package: yum install kernel-devel
On a system with the PAE kernel, install the kernel-PAE package: yum install kernel-PAE-devel
OpenSUSE / SUSE Linux Enterprise: zypper install kernel-source
Ubuntu / Debian: apt-get install linux-headers-3.10.0-1160.53.1.el7.x86_64

If kernel sources are already installed to custom directory, please follow the guide to build and load the Sampling drivers manually: https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/set-up-analysis-target/linux-targets/build-install-sampling-drivers-for-linux-targets.html
Start installation flow...

...
Installation has successfully completed

Also the HPC Toolkit test install is done, so I can find out which compiler and MPI versions are inside.
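For the record, the usual way to read those versions back out once the environment is set up (standard version flags):

source /shared/ucl/apps/intel/2024.2/setvars.sh
ifort --version     # compiler version
mpirun --version    # Intel MPI version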

Buildscript updated. The modulefile creator probably needs updating based on what and where all the subcomponents are.

heatherkellyucl commented 1 month ago

Have updated the module file creator for the mpi module (the easy part!).

heatherkellyucl commented 1 month ago

Doing the compiler part of the compiler module; a note from /shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/README.txt:

This directory contains binaries which are not typically directly invoked.
These binaries are not placed in PATH during environment setup in order to
reduce collisions with other toolchains.

This is a departure from OneAPI Compiler Releases before 2022.0 (December
2021), where these tools were in PATH. There may be cases where the tools which
are no longer in PATH were being invoked directly in some application Makefile
(or CMake configuration) and this may require adjustment:

1. If invoking "clang" or "clang++" directly, please try to use icx/icpx/dpcpp
   instead. Direct use of clang as a driver is not supported. The "clang"
   drivers do not have the same behavior as either icx or upstream LLVM, and
   are not widely tested.

2. If for some reason you must use the OneAPI clang/clang++, please report the
   issue. Until the issue is resolved, you can find the clang/clang++ driver's
   location by running "dpcpp --print-prog-name=clang".

3. If you use some other LLVM tool which is no longer in PATH, it can be found
   the same way. E.g.,: "dpcpp --print-prog-name=llvm-ar".

4. If all of the above fails, you can add the output of
   "dpcpp --print-file-name=bin-llvm" to PATH. This should be considered a last
   resort.

When switching from clang to icx/icpx/dpcpp, please be aware that the default
optimizations used by icx are much more aggressive. If this may be of concern,
you may specify the following flags to icx to very roughly approximate clang's
default optimizations:

- On Linux: -O0 -fno-fast-math
- On Windows: /O0 /fp:precise

Will not add /shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/ to PATH.
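For anyone hitting cases 3 or 4 from the README above, the lookup it describes works like this (a sketch using the README's own dpcpp driver; in releases where dpcpp has been removed, substituting icx is an assumption):

# find a hidden LLVM tool (case 3)
dpcpp --print-prog-name=llvm-ar
# last resort (case 4): put the hidden bin directory on PATH
export PATH="$(dpcpp --print-file-name=bin-llvm):$PATH"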

heatherkellyucl commented 1 month ago

Done compiler out of:

heatherkellyucl commented 1 month ago

There's a -e PYTHONPATH:"$cprefix/advisor/$numeric_version/pythonapi" \ line that could be added, but I don't think we want that always present in the compiler module, so I'm going to leave it commented out for now.
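Users who do need the Advisor Python API can add the same path per-session instead (a sketch: the module name is an assumption, and <numeric_version> is a placeholder for the Advisor version directory):

module load compilers/intel/2024.2
export PYTHONPATH="/shared/ucl/apps/intel/2024.2/advisor/<numeric_version>/pythonapi:$PYTHONPATH"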

heatherkellyucl commented 1 month ago

CCL: we might want to set export CCL_CONFIGURATION="cpu_gpu_dpcpp"?

heatherkellyucl commented 1 month ago

As a note, 2024 puts a unified view of most things into /shared/ucl/apps/intel/2024.2/2024.2/lib/, for example, but that includes MPI, which we don't want present in the compiler module, so we are using the /shared/ucl/apps/intel/2024.2/$componentname/$version locations instead.
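In shell terms the difference looks like this (a sketch; the lib subdirectory layout is an assumption):

# per-component location: compiler libraries only, as the compiler module uses
export LD_LIBRARY_PATH="/shared/ucl/apps/intel/2024.2/compiler/2024.2/lib:$LD_LIBRARY_PATH"
# unified view: would pull in Intel MPI's libraries too, so not used here
# export LD_LIBRARY_PATH="/shared/ucl/apps/intel/2024.2/2024.2/lib:$LD_LIBRARY_PATH"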

heatherkellyucl commented 1 month ago

I think I have got everything in the buildscript now.

heatherkellyucl commented 1 month ago

Install on:

heatherkellyucl commented 1 month ago

Uh oh: 2024.2 does still have ifort, but no longer has icc or icpc, and the clang-based icx and icpx require a newer GLIBC, so they aren't going to work.

[cceahke@login02 cpp_pi_dir]$ icx --help
/lustre/shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/clang: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /lustre/shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/clang)
/lustre/shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/clang: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by /lustre/shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/clang)

So the only working compiler in there is ifort.
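A quick way to confirm the mismatch (RHEL 7 ships glibc 2.17, older than the 2.18/2.28 the binary asks for):

# list the GLIBC versions the clang binary requires
strings /lustre/shared/ucl/apps/intel/2024.2/compiler/2024.2/bin/compiler/clang | grep GLIBC_
# show the system's glibc version
ldd --version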

heatherkellyucl commented 1 month ago

The modulefiles I did make are now in the https://github.com/UCL-RITS/rcps-modulefiles/tree/rhel-upgrade branch of the modulefile repo, as compilers/compilers/intel/2024.2 and libraries/mpi/intel/2021.13/intel.

heatherkellyucl commented 1 month ago

Faster way to check the environment:

module purge
module load gcc-libs/10.2.0

# write out current env into file
env > origenv

# source the intel mpi setup script only
source /shared/ucl/apps/intel/2024.2/mpi/2021.13/env/vars.sh
env > intelmpi2021.13_env

That gives you everything except CCL, which is also included in our MPI module. (I haven't checked what CCL brings in if you were to source its vars.sh too at that step; that may be sufficient.)
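If you wanted to check that too (an untested sketch; <version> is a placeholder for whatever directory exists under ccl/, following the per-component layout noted earlier):

source /shared/ucl/apps/intel/2024.2/ccl/<version>/env/vars.sh
env > intelccl_env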

Then do the same in a new terminal (so you don't have the environment changes), but source the whole oneAPI setup script.

# source everything
source /shared/ucl/apps/intel/2024.2/setvars.sh
env > inteloneapi2024.2_env

You can then see only the things that were added by using a script like this (from https://www.unix.com/302281269-post7.html?s=6637ae638fba973573e4463fe2340d6c):

envdeps:

#!/bin/bash

# Use: envdeps newenv origenv

# Script to compare two files created by "env" and return the lines that are
# not in origenv. Particularly for getting the Intel env vars.

# NR==FNR is true only while awk reads its first input file (origenv here):
# store each line as an index of array a, then skip to the next line.
# For the second file, !($0 in a) prints a line only if it never appeared
# in the first file.

awk 'NR==FNR{a[$0];next}!($0 in a)' "$2" "$1"

Then run it against the captured files:

./envdeps intelmpi2021.13_env origenv > diffmpi
./envdeps inteloneapi2024.2_env origenv > diffoneapi

diffoneapi will contain everything, including MPI; diffmpi will contain only MPI (and no CCL).

That lets you check more quickly that the modulefile builders in the buildscript are setting the correct environment variables for each module. You could also sort the diffs to make them easier to compare. For variables like LIBRARY_PATH you will get the whole combined path, which is set up as multiple prepend-path lines in the modulefile.

This can be streamlined further. (Left as an exercise for the reader!)
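One possible streamlining (a sketch using the same paths as above, swapping the awk helper for sorted files and comm): source the vars.sh inside a subshell so the changes never leak into your own session.

# the sourced environment vanishes when the subshell's parentheses close
(
  env | sort > origenv
  source /shared/ucl/apps/intel/2024.2/mpi/2021.13/env/vars.sh
  env | sort > intelmpi2021.13_env
)
# comm -13 prints lines unique to the second sorted file,
# i.e. everything vars.sh added or changed
comm -13 origenv intelmpi2021.13_env > diffmpi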

heatherkellyucl commented 1 month ago

The 2023 toolkits are no longer available.

balston commented 1 week ago

We've installed Intel oneAPI 2024.0.1 with Intel MPI 2021.11 on Young using the base and HPC kit installers supplied by the user. After a correction to the module files this is now working for the user.

Will also install on the other clusters with the installers in /shared/ucl/apps/pkg-store on each cluster:

balston commented 2 days ago

Intel oneAPI 2024.0.1 is the latest version we can run on the clusters until we update the underlying OS to Red Hat 9 or equivalent. All done until then.