EESSI / software-layer

Software layer of the EESSI project
https://eessi.github.io/docs/software_layer
GNU General Public License v2.0

{2023.06}[2023a] Extrae 4.0.6 (WIP) #554

Open boegel opened 2 months ago

eessi-bot-aws[bot] commented 2 months ago

Instance eessi-bot-mc-azure is configured to build:

boegel commented 2 months ago

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1

eessi-bot-aws[bot] commented 2 months ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1`
- handling command `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1` resulted in:
  - submitted job `9894`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2079576167
eessi-bot-aws[bot] commented 2 months ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1`
- handling command `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1` resulted in:
  - no jobs were submitted
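
The bot updates above show a short command (`repo:`, `arch:`) being rewritten into an "expanded format" (`repository:`, `architecture:`). A minimal sketch of that kind of expansion follows. It is not the bot's actual code: the `ABBREVIATIONS` table and the `expand_command` helper are hypothetical names, and the table only contains the two abbreviations visible in this thread.

```python
# Hypothetical abbreviation table, inferred only from the expansions shown in this thread.
ABBREVIATIONS = {"repo": "repository", "arch": "architecture"}


def expand_command(command):
    """Expand abbreviated filter keys in a command such as 'build repo:X arch:Y'."""
    # The first word is the action (e.g. 'build'); the rest are key:value filters.
    action, *filters = command.split()
    expanded = []
    for item in filters:
        # Split on the first ':' only, so values may themselves contain ':' or '/'.
        key, sep, value = item.partition(":")
        expanded.append(ABBREVIATIONS.get(key, key) + sep + value)
    return " ".join([action] + expanded)


if __name__ == "__main__":
    print(expand_command("build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1"))
```

Keys that are not in the table (including already expanded ones) pass through unchanged, so applying the function twice gives the same result as applying it once.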
eessi-bot-aws[bot] commented 2 months ago
New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_v1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_554/9894

| date | job status | comment |
| --- | --- | --- |
| Apr 26 15:06:28 UTC 2024 | submitted | job id 9894 awaits release by job manager |
| Apr 26 15:07:10 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 26 15:12:12 UTC 2024 | running | job 9894 is running |
| Apr 26 15:37:37 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
:white_check_mark: job output file slurm-9894.out
:x: found message matching ERROR:
:x: found message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1714144920.tar.gz
size: 11 MiB (12416831 bytes)
entries: 421
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
elfutils/0.189-GCCcore-12.3.0.lua
libdwarf/0.7.0-GCCcore-12.3.0.lua
PAPI/7.0.1-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/neoverse_v1/software
elfutils/0.189-GCCcore-12.3.0
libdwarf/0.7.0-GCCcore-12.3.0
PAPI/7.0.1-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/neoverse_v1
no other files in tarball
Apr 26 15:37:37 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-9894.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case
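
The build report above lists pattern checks on the job output: some messages must be absent (`ERROR:`, `FAILED:`, `required modules missing:`) and some must be present (`No missing installations`, `.tar.gz created!`). The sketch below shows how such a verdict could be derived from a log. It is an illustration under assumptions, not the bot's implementation: the function name `check_build_output`, the two pattern lists, and the rule that every check must pass are all hypothetical.

```python
import re

# Assumption: patterns taken from the report above; the real checks may differ.
MUST_BE_ABSENT = [r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_BE_PRESENT = [r"No missing installations", r"\.tar\.gz created!"]


def check_build_output(text):
    """Return (overall_ok, details), where details maps each pattern to True if its check passed."""
    details = {}
    for pattern in MUST_BE_ABSENT:
        # Check passes when the message does NOT occur in the output.
        details[pattern] = re.search(pattern, text) is None
    for pattern in MUST_BE_PRESENT:
        # Check passes when the message DOES occur in the output.
        details[pattern] = re.search(pattern, text) is not None
    return all(details.values()), details


if __name__ == "__main__":
    log = "ERROR: something went wrong\n/tmp/example.tar.gz created!\n"
    ok, details = check_build_output(log)
    print("SUCCESS" if ok else "FAILURE")
    for pattern, passed in details.items():
        print(("ok   " if passed else "fail ") + pattern)
```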
boegel commented 2 months ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

eessi-bot-aws[bot] commented 2 months ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/generic` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/generic`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/generic` resulted in:
  - submitted job `9969`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2082209745
eessi-bot-aws[bot] commented 2 months ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/generic` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/generic`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/generic` resulted in:
  - no jobs were submitted
eessi-bot-aws[bot] commented 2 months ago
New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_554/9969

| date | job status | comment |
| --- | --- | --- |
| Apr 29 09:02:20 UTC 2024 | submitted | job id 9969 awaits release by job manager |
| Apr 29 09:02:46 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 29 09:07:48 UTC 2024 | running | job 9969 is running |
| Apr 29 09:31:10 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
:white_check_mark: job output file slurm-9969.out
:x: found message matching ERROR:
:x: found message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1714382099.tar.gz
size: 22 MiB (23105365 bytes)
entries: 493
modules under 2023.06/software/linux/x86_64/generic/modules/all
elfutils/0.189-GCCcore-12.3.0.lua
libdwarf/0.7.0-GCCcore-12.3.0.lua
PAPI/7.0.1-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/generic/software
elfutils/0.189-GCCcore-12.3.0
libdwarf/0.7.0-GCCcore-12.3.0
PAPI/7.0.1-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/generic
no other files in tarball
Apr 29 09:31:10 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 9/9 test case(s) from 9 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-9969.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case
boegel commented 1 month ago

./configure ... is failing with:

checking for binutils... notfound
configure: libbfd library directory: /usr/lib/x86_64-linux-gnu
configure: Warning! Cannot find the libiberty library in the given binutils home. Please, make sure that the binutils packages is correctly installed. If you have installed the binutils package by hand from their source code, make sure that libiberty is installed. Some releases of the binutils package do not install the libibery even invoking make install. The library should be within the libiberty directory within the binutils source tree.
checking for bfd.h... no
configure: error: You have asked to gather call-site information through --with-unwind which must be translated using binutils, but either libbfd or libiberty are not found. Please make sure that the binutils-dev package is installed and specify where to find these libraries through --with-binutils. The latest source can be downloaded from http://www.gnu.org/software/binutils

Despite the patch added in https://github.com/easybuilders/easybuild-easyconfigs/pull/20153, the configure script is not yet able to find binutils in the EESSI compat layer. I had this working somehow during the EuroHPC Summit week, but perhaps I had an extra change/fix somewhere back then which I didn't keep track of...

boegel commented 1 month ago

Including this in the Extrae easyconfig (which should be done either in the custom easyblock for Extrae, or in the hooks we use in EESSI) helps:

configopts = "--with-binutils=%(sysroot)s/usr/lib*/binutils/x86_64-pc-linux-gnu/2.*/"

but then there's a problem in the build step, I don't recall hitting this one before...

configure: error: Cannot find given ${MPIF77}. Please give the full path for the MPI F77 compiler
```
checking for mpicc compiler default binary type... 64-bit
configure: cannot locate multiarch triplet
checking for mpicxx compiler default binary type... 64-bit
configure: cannot locate multiarch triplet
checking for MPI installation... /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/OpenMPI/4.1.5-GCC-12.3.0
checking for MPI binaries directory... /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/OpenMPI/4.1.5-GCC-12.3.0/bin
checking for MPI includes directory... /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/OpenMPI/4.1.5-GCC-12.3.0/include
checking for MPI libraries directory... /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/OpenMPI/4.1.5-GCC-12.3.0/lib64
checking for MPI shared library folder... no
checking for MPI multiarch library folder... no
checking for MPI valid installation... yes
checking for mpi.h... yes
checking for MPICH2 defined... no
checking for MPI library... /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/OpenMPI/4.1.5-GCC-12.3.0/lib64, -lmpi
checking for shared MPI library... yes
checking for fortran MPI library... not found,
checking for MPI C compiler... mpicc
checking for MPI F77 compiler... Illegal option --
./configure: line 11789: test: too many arguments
configure: error: Cannot find given ${MPIF77}. Please give the full path for the MPI F77 compiler
make: *** [Makefile:1240: config.status] Error 2
```
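
For the `--with-binutils` option mentioned above, one way to add it from an EasyBuild hook could look like the sketch below. This is only an illustration under assumptions: `binutils_configopt` is a hypothetical helper, the glob pattern is the x86_64 one from the `configopts` line above (other architectures would need a different pattern), and taking the sysroot from `$EESSI_EPREFIX` is an assumption made to keep the example self-contained. The hook relies on `self.name` and `self.cfg.update()`, which EasyBuild provides on the easyblock instance passed to `pre_configure_hook`.

```python
import glob
import os


def binutils_configopt(sysroot, pattern="usr/lib*/binutils/x86_64-pc-linux-gnu/2.*"):
    """Return '--with-binutils=<dir>' for a directory matching pattern under sysroot, or None."""
    matches = [path for path in glob.glob(os.path.join(sysroot, pattern)) if os.path.isdir(path)]
    if not matches:
        return None
    # If several versions match, pick the last one in lexicographic order (a simplification).
    return "--with-binutils=" + sorted(matches)[-1]


def pre_configure_hook(self, *args, **kwargs):
    """Sketch of a hook that only adjusts configure options for Extrae."""
    if self.name != "Extrae":
        return
    # Assumption: the compat layer prefix is available via $EESSI_EPREFIX;
    # a real hook would more likely read the sysroot from EasyBuild's configuration.
    sysroot = os.environ.get("EESSI_EPREFIX", "/")
    option = binutils_configopt(sysroot)
    if option:
        self.cfg.update("configopts", option)
```

Resolving the wildcard in the hook, instead of passing it to `./configure` as a literal string, avoids depending on how the configure script treats glob characters in `--with-binutils`.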
boegel commented 1 month ago

The issue with `$MPIF77` is caused by the use of `which` in Extrae's configure script:

         if test -x `which ${MPIF77}` ; then
                { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: ${MPIF77}" >&5
printf "%s\n" "${MPIF77}" >&6; }
         else
                as_fn_error $? "Cannot find given \${MPIF77}. Please give the full path for the MPI F77 compiler" "$LINENO" 5
         fi

That fails in the EESSI build container, because `which` doesn't work there:

{EESSI 2023.06} Apptainer> which ls
Illegal option --
Usage: /usr/bin/which [-a] args

`command -v` can be used instead:

{EESSI 2023.06} Apptainer> command -v ls
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/bin/ls
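
One possible workaround is to rewrite the backtick `which` calls in the generated configure script to `command -v` before running it. The Python sketch below illustrates the substitution on the script text. It is not necessarily how the Extrae easyblock or the EESSI hooks handle this, and `replace_which_calls` is a hypothetical helper name.

```python
import re

# Match a backtick immediately followed by 'which' and whitespace, i.e. the start of
# a command substitution such as `which ${MPIF77}`; plain prose uses of "which" are untouched.
WHICH_CALL = re.compile(r"`which\s+")


def replace_which_calls(script_text):
    """Replace backtick `which ...` command substitutions with `command -v ...`."""
    return WHICH_CALL.sub("`command -v ", script_text)


if __name__ == "__main__":
    line = "if test -x `which ${MPIF77}` ; then"
    print(replace_which_calls(line))
```

Within EasyBuild, a substitution like this would more typically be applied to the file on disk with `apply_regex_substitutions` from `easybuild.tools.filetools` rather than with a standalone helper.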
boegel commented 1 month ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - submitted job `10463`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2106245323
eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - no jobs were submitted
eessi-bot-aws[bot] commented 1 month ago
New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_554/10463

| date | job status | comment |
| --- | --- | --- |
| May 12 13:22:41 UTC 2024 | submitted | job id 10463 awaits release by job manager |
| May 12 13:22:56 UTC 2024 | released | job awaits launch by Slurm scheduler |
| May 12 13:27:58 UTC 2024 | running | job 10463 is running |
| May 12 13:49:33 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
:white_check_mark: job output file slurm-10463.out
:x: found message matching ERROR:
:x: found message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1715520796.tar.gz
size: 22 MiB (23270589 bytes)
entries: 494
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
elfutils/0.189-GCCcore-12.3.0.lua
libdwarf/0.7.0-GCCcore-12.3.0.lua
PAPI/7.0.1-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
elfutils/0.189-GCCcore-12.3.0
libdwarf/0.7.0-GCCcore-12.3.0
PAPI/7.0.1-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
2023.06/init/easybuild/eb_hooks.py
May 12 13:49:33 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-10463.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

edit: failed because the test suite was incorrectly disabled by setting `runtest` to `False`

boegel commented 1 month ago

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1

eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1`
- handling command `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1` resulted in:
  - submitted job `10465`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2106252599
eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1`
- handling command `build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1` resulted in:
  - no jobs were submitted
eessi-bot-aws[bot] commented 1 month ago
New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_v1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_554/10465

| date | job status | comment |
| --- | --- | --- |
| May 12 13:46:18 UTC 2024 | submitted | job id 10465 awaits release by job manager |
| May 12 13:46:26 UTC 2024 | released | job awaits launch by Slurm scheduler |
| May 12 13:51:38 UTC 2024 | running | job 10465 is running |
| May 12 14:18:41 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
:white_check_mark: job output file slurm-10465.out
:x: found message matching ERROR:
:x: found message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1715522416.tar.gz
size: 11 MiB (12423670 bytes)
entries: 422
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
elfutils/0.189-GCCcore-12.3.0.lua
libdwarf/0.7.0-GCCcore-12.3.0.lua
PAPI/7.0.1-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/neoverse_v1/software
elfutils/0.189-GCCcore-12.3.0
libdwarf/0.7.0-GCCcore-12.3.0
PAPI/7.0.1-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/neoverse_v1
2023.06/init/easybuild/eb_hooks.py
May 12 14:18:41 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-10465.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

edit: failed because the test suite was incorrectly disabled by setting `runtest` to `False`

boegel commented 1 month ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - submitted job `10467`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2106256487
eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - no jobs were submitted
eessi-bot-aws[bot] commented 1 month ago
New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_554/10467

| date | job status | comment |
| --- | --- | --- |
| May 12 14:00:38 UTC 2024 | submitted | job id 10467 awaits release by job manager |
| May 12 14:01:03 UTC 2024 | released | job awaits launch by Slurm scheduler |
| May 12 14:02:08 UTC 2024 | running | job 10467 is running |
| May 12 14:20:44 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Details
:white_check_mark: job output file slurm-10467.out
:x: found message matching ERROR:
:white_check_mark: no message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1715522658.tar.gz
size: 0 MiB (9681 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
2023.06/init/easybuild/eb_hooks.py
May 12 14:20:44 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-10467.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

edit: failed with "Failed to get data for PR #20050 from easybuilders/easybuild-easyconfigs (HTTP Error 403: rate limit exceeded)"

boegel commented 1 month ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-aws (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - submitted job `10468`, for details & status see https://github.com/EESSI/software-layer/pull/554#issuecomment-2106382131
eessi-bot-aws[bot] commented 1 month ago
Updates by the bot instance eessi-bot-mc-azure (click for details)
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512` resulted in:
  - no jobs were submitted
eessi-bot-aws[bot] commented 1 month ago
New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_554/10468

| date | job status | comment |
| --- | --- | --- |
| May 12 21:38:29 UTC 2024 | submitted | job id 10468 awaits release by job manager |
| May 12 21:39:20 UTC 2024 | released | job awaits launch by Slurm scheduler |
| May 12 21:43:22 UTC 2024 | running | job 10468 is running |
| May 12 22:09:48 UTC 2024 | finished | :cry: FAILURE (click triangle for details) |

Reason
EasyConfig not found during missing installation check. Are you sure all PRs referenced have been merged in EasyBuild?
Details
:white_check_mark: job output file slurm-10468.out
:x: found message matching ERROR:
:x: found message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1715550818.tar.gz
size: 22 MiB (23269634 bytes)
entries: 494
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
elfutils/0.189-GCCcore-12.3.0.lua
libdwarf/0.7.0-GCCcore-12.3.0.lua
PAPI/7.0.1-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
elfutils/0.189-GCCcore-12.3.0
libdwarf/0.7.0-GCCcore-12.3.0
PAPI/7.0.1-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
2023.06/init/easybuild/eb_hooks.py
May 12 22:09:48 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-10468.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case
ocaisa commented 1 day ago

We should be able to update this now for https://github.com/easybuilders/easybuild-easyconfigs/pull/20690 and easyblock PR 3339