EESSI / software-layer

Software layer of the EESSI project
https://eessi.github.io/docs/software_layer
GNU General Public License v2.0

{2023.06,zen4}[system] EasyBuild v4.9.1 #547

Closed trz42 closed 1 month ago

trz42 commented 2 months ago

Add EasyBuild v4.9.1 to zen4 software subdir.

Only build for zen4 on cluster ...

SPDX license identifier: GPL-2.0-only

Missing packages:

* EasyBuild/4.9.1 (EasyBuild-4.9.1.eb)

eessi-bot-aws[bot] commented 2 months ago

Instance eessi-bot-mc-aws is configured to build:

trz42 commented 2 months ago

No response from bot on Azure yet :cry: ... checking if it receives an event.

trz42 commented 2 months ago

There seems to be an issue with the bot config file:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 70 (char 133)
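This is the message Python's `json` module reports when an object key is not a double-quoted string, for example when single quotes are used. A minimal sketch of checking such a value before it is loaded could look like the following. It assumes the affected settings hold JSON-encoded text (as the traceback suggests); the function name `check_json_value` and both sample values are hypothetical and are not taken from the real bot configuration.

```python
import json


def check_json_value(name, raw):
    """Return None if `raw` parses as JSON, otherwise a short error description."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        # err.lineno and err.colno locate the offending character,
        # like "line 3 column 70" in the traceback above.
        return f"{name}: {err.msg} (line {err.lineno}, column {err.colno})"
    return None


# Hypothetical sample values, for illustration only.
valid = '{"linux/x86_64/amd/zen4": "--partition x86-64-amd-zen4-node"}'
single_quoted = "{'linux/x86_64/amd/zen4': '--partition x86-64-amd-zen4-node'}"

print(check_json_value("arch_target_map", valid))
print(check_json_value("arch_target_map", single_quoted))
```

Running a check like this on each JSON-valued setting before restarting a bot instance would point at the line and column to fix, rather than leaving the instance silently unresponsive.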
trz42 commented 2 months ago

Fixed issue for both arch_target_map and repo_target_map

bot: show_config

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `show_config` from `trz42`
  - expanded format: `show_config`
- handling command `show_config` resulted in:
  - added comment https://github.com/EESSI/software-layer/pull/547#issuecomment-2067558220 to show configuration

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `show_config` from `trz42`
  - expanded format: `show_config`
- handling command `show_config` resulted in:
  - added comment https://github.com/EESSI/software-layer/pull/547#issuecomment-2067558226 to show configuration
eessi-bot-aws[bot] commented 2 months ago

Instance eessi-bot-mc-aws is configured to build:

eessi-bot-aws[bot] commented 2 months ago

Instance eessi-bot-mc-azure is configured to build:

trz42 commented 2 months ago

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted
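The bot replies above show each abbreviated filter (`inst:`, `repo:`, `arch:`) next to its expanded form. A minimal sketch of that expansion is given below. The mapping is inferred only from the replies in this thread, the bot may accept other abbreviations, and `expand_command` is a hypothetical helper rather than the bot's own code.

```python
# Short-to-long key mapping inferred from the bot replies in this thread.
SHORT_TO_LONG = {"inst": "instance", "repo": "repository", "arch": "architecture"}


def expand_command(command):
    """Expand abbreviated `key:value` filters in a bot command string."""
    expanded = []
    for word in command.split():
        key, sep, value = word.partition(":")
        # Only rewrite words of the form key:value whose key is a known abbreviation.
        if sep and key in SHORT_TO_LONG:
            word = f"{SHORT_TO_LONG[key]}:{value}"
        expanded.append(word)
    return " ".join(expanded)


print(expand_command(
    "build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4"
))
```

Words without a colon, such as `build` or `show_config`, pass through unchanged, which matches the `show_config` reply where the expanded format equals the original command.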
trz42 commented 2 months ago

No jobs were submitted ... hmm ... found another glitch in the config. Should be fixed.

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
trz42 commented 2 months ago

Getting closer. However, I hit another issue:

RuntimeError: run_cmd(): Error running '/opt/software/slurm/bin/sbatch --hold --time=24:0:0 --nodes=1 --ntasks-per-node=16 --hold  --partition x86-64-amd-zen4-node /home/bot/eessi-bot-software-layer/scripts/bot-build.slurm' in '/project/def-users/SHARED/jobs/2024.04/pr_547/event_f4520850-fed5-11ee-9e5c-133a4f004b3e/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software
           stdout ''
           stderr 'sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified
'
           exit code 1

A simple srun ... fails too

srun --partition x86-64-amd-zen4-node --pty bash
srun: error: Unable to allocate resources: Invalid account or account/partition combination specified
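The `RuntimeError` above bundles the failing command, its working directory, stdout, stderr and exit code into one message. A minimal sketch of a wrapper that produces a report in that shape is shown below. It is an illustration built on Python's standard `subprocess` module and is not the bot's actual `run_cmd()`; the failing command is a stand-in so the example runs without Slurm.

```python
import subprocess
import sys


def run_cmd(cmd, cwd=None):
    """Run `cmd` (a list of arguments); raise RuntimeError with its output on failure."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"run_cmd(): Error running {' '.join(cmd)!r} in {cwd!r}\n"
            f"  stdout {result.stdout!r}\n"
            f"  stderr {result.stderr!r}\n"
            f"  exit code {result.returncode}"
        )
    return result.stdout


# Stand-in for a failing `sbatch` call, so the sketch runs without Slurm.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('submission failed'); sys.exit(1)"]
try:
    run_cmd(failing)
except RuntimeError as err:
    print(err)
```

Keeping stderr in the exception is what made the root cause visible here: the message from `sbatch` matches the one from the plain `srun` test, which points at the Slurm setup rather than at the bot.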
boegel commented 2 months ago

@trz42 Seems fixed after restarting the Slurm service on the mgmt node, please try again

trz42 commented 2 months ago

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `52`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2067595367
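In the replies above, only eessi-bot-mc-azure submits a job, while eessi-bot-mc-aws reports that no jobs were submitted. One plausible reading is that every instance receives the command and then decides for itself whether the filters apply to it. The sketch below illustrates that idea only: the decision rule, the function `should_submit` and the per-instance architecture sets are assumptions made for this example and are not taken from the bot's code or configuration.

```python
def parse_filters(expanded_command):
    """Split an expanded bot command into its action and a dict of filters."""
    action, *words = expanded_command.split()
    filters = dict(word.split(":", 1) for word in words)
    return action, filters


def should_submit(instance_name, configured_archs, filters):
    """Assumed rule: the instance filter (if any) must name this instance,
    and the architecture filter (if any) must be one this instance builds for."""
    wanted_instance = filters.get("instance")
    if wanted_instance is not None and wanted_instance != instance_name:
        return False
    wanted_arch = filters.get("architecture")
    return wanted_arch is None or wanted_arch in configured_archs


command = ("build instance:eessi-bot-mc-azure "
           "repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4")
action, filters = parse_filters(command)

# Placeholder architecture sets; the real per-instance configuration is not shown here.
print(should_submit("eessi-bot-mc-aws", {"x86_64/generic"}, filters))
print(should_submit("eessi-bot-mc-azure", {"x86_64/amd/zen4"}, filters))
```

Under this assumed rule, a command without an instance filter would still be declined by an instance that is not set up for the requested architecture.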
eessi-bot-aws[bot] commented 2 months ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_547/52

| date | job status | comment |
| --- | --- | --- |
| Apr 20 07:55:42 UTC 2024 | submitted | job id 52 awaits release by job manager |
| Apr 20 07:56:19 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 20 07:57:22 UTC 2024 | running | job 52 is running |
| Apr 20 08:23:54 UTC 2024 | finished | :cry: FAILURE |
| Apr 20 08:23:54 UTC 2024 | test result | :cry: FAILURE |

finished (:cry: FAILURE), details:
- :white_check_mark: job output file `slurm-52.out`
- :x: found message matching `ERROR:`
- :white_check_mark: no message matching `FAILED:`
- :white_check_mark: no message matching `required modules missing:`
- :white_check_mark: found message(s) matching `No missing installations`
- :white_check_mark: found message matching `.tar.gz created!`

finished, artefacts:
- `eessi-2023.06-software-linux-x86_64-amd-zen4-1713601135.tar.gz`, size: 19 MiB (20841441 bytes), entries: 32613
  - modules under `2023.06/software/linux/x86_64/amd/zen4/modules/all`
    - `EasyBuild/4.9.1.lua`
  - software under `2023.06/software/linux/x86_64/amd/zen4/software`
    - `EasyBuild/4.9.1`
  - other under `2023.06/software/linux/x86_64/amd/zen4`
    - `.lmod/SitePackage.lua`
    - `.lmod/lmodrc.lua`

test result (:cry: FAILURE):
- Reason: EESSI test suite produced failures.
- ReFrame Summary: `[ FAILED ] Ran 9/9 test case(s) from 9 check(s) (6 failure(s), 0 skipped, 0 aborted)`
- Details:
  - :white_check_mark: job output file `slurm-52.out`
  - :x: found message matching `ERROR:`
  - :x: found message matching `[\s*FAILED\s*].*Ran .* test case`
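The build result above is derived from a handful of message checks against the job output: some messages must be absent (`ERROR:`, `FAILED:`, `required modules missing:`) and some must be present (`No missing installations`, `.tar.gz created!`). A minimal sketch of that classification follows. The rule that every check must pass for SUCCESS is inferred from the job reports in this thread, and `classify` is a hypothetical helper that uses plain substring tests instead of the bot's real patterns.

```python
# (message, must_be_present) pairs taken from the check list in the job report.
CHECKS = [
    ("ERROR:", False),
    ("FAILED:", False),
    ("required modules missing:", False),
    ("No missing installations", True),
    (".tar.gz created!", True),
]


def classify(job_output):
    """Return ("SUCCESS" or "FAILURE", list of checks that did not pass)."""
    failed = [
        text for text, must_be_present in CHECKS
        if (text in job_output) != must_be_present
    ]
    return ("SUCCESS" if not failed else "FAILURE"), failed


sample = "ERROR: something went wrong\nNo missing installations\nfoo.tar.gz created!\n"
print(classify(sample))
```

With this rule, a single `ERROR:` line is enough to mark the build as failed even when a tarball was created, which is the situation reported for job 52.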
trz42 commented 2 months ago

Rebuilding after a change that tries to ensure the Lmod config files are created early if they don't exist.

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `55`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2067655477
eessi-bot-aws[bot] commented 2 months ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_547/55

| date | job status | comment |
| --- | --- | --- |
| Apr 20 12:14:32 UTC 2024 | submitted | job id 55 awaits release by job manager |
| Apr 20 12:15:19 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 20 12:16:22 UTC 2024 | running | job 55 is running |
| Apr 20 12:18:25 UTC 2024 | finished | :shrug: UNKNOWN |
| Apr 20 12:18:25 UTC 2024 | test result | :shrug: UNKNOWN |

finished (:shrug: UNKNOWN):
- Job results file `_bot_job55.result` does not exist in job directory or reading it failed.
- No artefacts were found/reported.

test result (:shrug: UNKNOWN):
- Job test file `_bot_job55.test` does not exist in job directory or reading it failed.
trz42 commented 2 months ago

Small mistake (wrong script used) fixed

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `56`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2067656744
eessi-bot-aws[bot] commented 2 months ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_547/56

| date | job status | comment |
| --- | --- | --- |
| Apr 20 12:19:42 UTC 2024 | submitted | job id 56 awaits release by job manager |
| Apr 20 12:20:28 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 20 12:21:31 UTC 2024 | running | job 56 is running |
| Apr 20 12:27:38 UTC 2024 | finished | :cry: FAILURE |
| Apr 20 12:27:38 UTC 2024 | test result | :cry: FAILURE |

finished (:cry: FAILURE), details:
- :white_check_mark: job output file `slurm-56.out`
- :x: found message matching `ERROR:`
- :white_check_mark: no message matching `FAILED:`
- :white_check_mark: no message matching `required modules missing:`
- :white_check_mark: found message(s) matching `No missing installations`
- :white_check_mark: found message matching `.tar.gz created!`

finished, artefacts:
- `eessi-2023.06-software-linux-x86_64-amd-zen4-1713615752.tar.gz`, size: 19 MiB (20841262 bytes), entries: 32613
  - modules under `2023.06/software/linux/x86_64/amd/zen4/modules/all`
    - `EasyBuild/4.9.1.lua`
  - software under `2023.06/software/linux/x86_64/amd/zen4/software`
    - `EasyBuild/4.9.1`
  - other under `2023.06/software/linux/x86_64/amd/zen4`
    - `.lmod/SitePackage.lua`
    - `.lmod/lmodrc.lua`

test result (:cry: FAILURE):
- Reason: EESSI test suite produced failures.
- ReFrame Summary: `[ FAILED ] Ran 9/9 test case(s) from 9 check(s) (6 failure(s), 0 skipped, 0 aborted)`
- Details:
  - :white_check_mark: job output file `slurm-56.out`
  - :x: found message matching `ERROR:`
  - :x: found message matching `[\s*FAILED\s*].*Ran .* test case`
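The status rows above carry enough information to see where a job spends its time (waiting for release, waiting in the queue, running). A small sketch that parses the timestamps of job 56 and computes the gaps between consecutive statuses is shown below. The helper names are hypothetical, and the format string assumes English month abbreviations as printed in this thread.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%b %d %H:%M:%S UTC %Y"


def parse_stamp(text):
    """Parse a timestamp such as 'Apr 20 12:19:42 UTC 2024' as an aware UTC datetime."""
    return datetime.strptime(text, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def phase_durations(events):
    """Return the seconds elapsed between each pair of consecutive statuses."""
    stamps = [(status, parse_stamp(text)) for status, text in events]
    return {
        f"{a} -> {b}": int((tb - ta).total_seconds())
        for (a, ta), (b, tb) in zip(stamps, stamps[1:])
    }


# Status rows of job 56, copied from the report.
events = [
    ("submitted", "Apr 20 12:19:42 UTC 2024"),
    ("released", "Apr 20 12:20:28 UTC 2024"),
    ("running", "Apr 20 12:21:31 UTC 2024"),
    ("finished", "Apr 20 12:27:38 UTC 2024"),
]
print(phase_durations(events))
```

For job 56 this gives 46 seconds until release, 63 seconds in the queue and 367 seconds of run time.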
trz42 commented 2 months ago

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `57`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2067661893
eessi-bot-aws[bot] commented 2 months ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_547/57

| date | job status | comment |
| --- | --- | --- |
| Apr 20 12:41:35 UTC 2024 | submitted | job id 57 awaits release by job manager |
| Apr 20 12:41:43 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 20 12:42:46 UTC 2024 | running | job 57 is running |
| Apr 20 12:45:51 UTC 2024 | finished | :cry: FAILURE |
| Apr 20 12:45:51 UTC 2024 | test result | :shrug: UNKNOWN |

finished (:cry: FAILURE), details:
- :white_check_mark: job output file `slurm-57.out`
- :white_check_mark: no message matching `ERROR:`
- :white_check_mark: no message matching `FAILED:`
- :white_check_mark: no message matching `required modules missing:`
- :x: no message matching `No missing installations`
- :white_check_mark: found message matching `.tar.gz created!`

finished, artefacts:
- `eessi-2023.06-software-linux-x86_64-amd-zen4-1713616911.tar.gz`, size: 0 MiB (2500 bytes), entries: 2
  - modules under `2023.06/software/linux/x86_64/amd/zen4/modules/all`
    - no module files in tarball
  - software under `2023.06/software/linux/x86_64/amd/zen4/software`
    - no software packages in tarball
  - other under `2023.06/software/linux/x86_64/amd/zen4`
    - `.lmod/SitePackage.lua`
    - `.lmod/lmodrc.lua`

test result (:shrug: UNKNOWN):
- Job test file `_bot_job57.test` does not exist in job directory or reading it failed.
trz42 commented 2 months ago

bot: build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build inst:eessi-bot-mc-azure repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `trz42`
  - expanded format: `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `58`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2067662951
eessi-bot-aws[bot] commented 2 months ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.04/pr_547/58

| date | job status | comment |
| --- | --- | --- |
| Apr 20 12:46:02 UTC 2024 | submitted | job id 58 awaits release by job manager |
| Apr 20 12:46:54 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Apr 20 12:47:57 UTC 2024 | running | job 58 is running |
| Apr 20 12:54:10 UTC 2024 | finished | :grin: SUCCESS |
| Apr 20 12:54:10 UTC 2024 | test result | :cry: FAILURE |

finished (:grin: SUCCESS), details:
- :white_check_mark: job output file `slurm-58.out`
- :white_check_mark: no message matching `ERROR:`
- :white_check_mark: no message matching `FAILED:`
- :white_check_mark: no message matching `required modules missing:`
- :white_check_mark: found message(s) matching `No missing installations`
- :white_check_mark: found message matching `.tar.gz created!`

finished, artefacts:
- `eessi-2023.06-software-linux-x86_64-amd-zen4-1713617341.tar.gz`, size: 19 MiB (20848896 bytes), entries: 32613
  - modules under `2023.06/software/linux/x86_64/amd/zen4/modules/all`
    - `EasyBuild/4.9.1.lua`
  - software under `2023.06/software/linux/x86_64/amd/zen4/software`
    - `EasyBuild/4.9.1`
  - other under `2023.06/software/linux/x86_64/amd/zen4`
    - `.lmod/SitePackage.lua`
    - `.lmod/lmodrc.lua`

test result (:cry: FAILURE):
- Reason: EESSI test suite produced failures.
- ReFrame Summary: `[ FAILED ] Ran 9/9 test case(s) from 9 check(s) (6 failure(s), 0 skipped, 0 aborted)`
- Details:
  - :white_check_mark: job output file `slurm-58.out`
  - :x: found message matching `ERROR:`
  - :x: found message matching `[\s*FAILED\s*].*Ran .* test case`
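The test result above hinges on the ReFrame summary line, so it can help to pull the counts out of it. A minimal sketch of such a parser follows. The regular expression is written for the summary line as printed in this thread, the `PASSED` alternative is an assumption about what a passing run prints, and `parse_summary` is a hypothetical helper. Note that the pattern shown in the report, read literally, starts with a character class; the sketch assumes the square brackets are meant to match literally and escapes them.

```python
import re

# Matches lines such as:
# [ FAILED ] Ran 9/9 test case(s) from 9 check(s) (6 failure(s), 0 skipped, 0 aborted)
SUMMARY = re.compile(
    r"\[\s*(?P<verdict>FAILED|PASSED)\s*\]\s*"
    r"Ran (?P<ran>\d+)/(?P<total>\d+) test case\(s\) "
    r"from (?P<checks>\d+) check\(s\) "
    r"\((?P<failures>\d+) failure\(s\), (?P<skipped>\d+) skipped, (?P<aborted>\d+) aborted\)"
)


def parse_summary(line):
    """Return the verdict and counts from a ReFrame summary line, or None if absent."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    verdict = fields.pop("verdict")
    return {"verdict": verdict, **{key: int(value) for key, value in fields.items()}}


line = "[ FAILED ] Ran 9/9 test case(s) from 9 check(s) (6 failure(s), 0 skipped, 0 aborted)"
print(parse_summary(line))
```

Extracting the counts makes it easy to compare runs, for example 6 failures out of 9 test cases for the April builds in this thread.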
boegel commented 1 month ago

@trz42 Seems ready to deploy?

boegel commented 1 month ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot-aws[bot] commented 1 month ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - no jobs were submitted

eessi-bot-aws[bot] commented 1 month ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4` from `boegel`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4` resulted in:
  - submitted job `68`, for details & status see https://github.com/EESSI/software-layer/pull/547#issuecomment-2098040799
eessi-bot-aws[bot] commented 1 month ago

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_547/68

| date | job status | comment |
| --- | --- | --- |
| May 07 10:39:32 UTC 2024 | submitted | job id 68 awaits release by job manager |
| May 07 10:40:14 UTC 2024 | released | job awaits launch by Slurm scheduler |
| May 07 10:45:17 UTC 2024 | running | job 68 is running |
| May 07 10:53:29 UTC 2024 | finished | :grin: SUCCESS |
| May 07 10:53:29 UTC 2024 | test result | :cry: FAILURE |
| May 07 10:55:20 UTC 2024 | uploaded | transfer of `eessi-2023.06-software-linux-x86_64-amd-zen4-1715078839.tar.gz` to S3 bucket succeeded |

finished (:grin: SUCCESS), details:
- :white_check_mark: job output file `slurm-68.out`
- :white_check_mark: no message matching `ERROR:`
- :white_check_mark: no message matching `FAILED:`
- :white_check_mark: no message matching `required modules missing:`
- :white_check_mark: found message(s) matching `No missing installations`
- :white_check_mark: found message matching `.tar.gz created!`

finished, artefacts:
- `eessi-2023.06-software-linux-x86_64-amd-zen4-1715078839.tar.gz`, size: 19 MiB (20845408 bytes), entries: 32614
  - modules under `2023.06/software/linux/x86_64/amd/zen4/modules/all`
    - `EasyBuild/4.9.1.lua`
  - software under `2023.06/software/linux/x86_64/amd/zen4/software`
    - `EasyBuild/4.9.1`
  - other under `2023.06/software/linux/x86_64/amd/zen4`
    - `.lmod/SitePackage.lua`
    - `.lmod/lmodrc.lua`
    - `2023.06/init/arch_specs/eessi_arch_x86.spec`

test result (:cry: FAILURE):
- Reason: EESSI test suite produced failures.
- ReFrame Summary: `[ FAILED ] Ran 10/10 test case(s) from 10 check(s) (7 failure(s), 0 skipped, 0 aborted)`
- Details:
  - :white_check_mark: job output file `slurm-68.out`
  - :x: found message matching `ERROR:`
  - :x: found message matching `[\s*FAILED\s*].*Ran .* test case`
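The name of the uploaded tarball encodes the EESSI version, the operating system, the CPU target and a trailing number. A short sketch that splits the name and converts the size is shown below. It assumes the trailing number is a Unix timestamp, which is an inference: interpreted that way it falls inside the period in which job 68 was running. The regular expression and `describe_tarball` are hypothetical and only fitted to the names seen in this thread.

```python
import re
from datetime import datetime, timezone

NAME = re.compile(
    r"eessi-(?P<version>[\d.]+)-software-(?P<os>[a-z]+)-(?P<arch>.+)-(?P<timestamp>\d+)\.tar\.gz"
)


def describe_tarball(name, size_bytes):
    """Split a tarball name into its parts; the timestamp reading is an assumption."""
    match = NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected tarball name: {name}")
    created = datetime.fromtimestamp(int(match["timestamp"]), tz=timezone.utc)
    return {
        "version": match["version"],
        "os": match["os"],
        "arch": match["arch"],
        "created_utc": created.isoformat(),
        "size_mib": size_bytes / 2**20,
    }


info = describe_tarball(
    "eessi-2023.06-software-linux-x86_64-amd-zen4-1715078839.tar.gz", 20845408
)
print(info)
```

The computed size is a little under 20 MiB, so the "19 MiB" in the report appears to be rounded down.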
boegel commented 1 month ago

Rebuilding to pick up the fix for archspec that was merged in https://github.com/EESSI/software-layer/pull/451 but not deployed yet...