EESSI / software-layer

Software layer of the EESSI project
https://eessi.github.io/docs/software_layer
GNU General Public License v2.0

Workaround to fix broken curl installation #623

Closed hvelab closed 2 months ago

hvelab commented 3 months ago

As reported in the EESSI support portal issue #25, curl fails on RHEL 8 and later because the location of the CA files differs. This workaround applies to all RHEL-based systems, as I could reproduce the issue on them too.

$ cat /etc/os-release 
NAME="Rocky Linux"
VERSION="9.3 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"

$ source software-layer/init/bash 
Found EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!
archdetect says x86_64/intel/skylake_avx512
Using x86_64/intel/skylake_avx512 as software subdirectory.
[...]
Environment set up to use EESSI (2023.06), have fun!

{EESSI 2023.06} $ ml CMake
{EESSI 2023.06} $ curl https://www.example.com
<!doctype html>
<html>
<head>
    <title>Example Domain</title>
[...]

I am not a fan of hardcoding this in the init script, but while investigating the issue I haven't found a more suitable way to solve it for our use case. Maybe another solution would be to add a modluafooter that exports this variable in the affected modules?
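
For illustration, a workaround of this kind could look roughly like the sketch below. It is not the code from this PR: the thread does not name the variable, so the sketch assumes curl's `CURL_CA_BUNDLE` environment variable, assumes the usual RHEL-family bundle location `/etc/pki/tls/certs/ca-bundle.crt`, and uses a hypothetical helper name `set_curl_ca_bundle`.

```shell
# Sketch only: point curl at the host's CA bundle when it lives at the
# RHEL-family location. Function name, default path and the use of
# CURL_CA_BUNDLE are assumptions, not taken from the PR.
set_curl_ca_bundle() {
    _bundle="${1:-/etc/pki/tls/certs/ca-bundle.crt}"
    # Only export the variable if the file really exists, so that
    # systems with a different layout are left untouched.
    if [ -f "$_bundle" ]; then
        export CURL_CA_BUNDLE="$_bundle"
    fi
}

# In an init script this would be called once, with the default path.
set_curl_ca_bundle
```

Guarding on the file's existence keeps the hardcoded path harmless on systems where it does not exist.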

eessi-bot[bot] commented 3 months ago

Instance eessi-bot-mc-aws is configured to build for:

eessi-build-deploy-bot-deucalion[bot] commented 3 months ago

Instance boegel-bot-deucalion is configured to build for:

eessi-bot[bot] commented 3 months ago

Instance eessi-bot-mc-azure is configured to build for:

ocaisa commented 3 months ago

Currently we ship curl in the compat layer, so there's no way to avoid setting this in the initialisation script. If we ever stopped doing that, it would be enough to set this via an Lmod hook (this may be worth a comment in the init script where this is done).

xinan1911 commented 2 months ago

@ocaisa Could you do a final check before merging this PR?

ocaisa commented 2 months ago

I can't merge this because I don't know how to deploy it.

bedroge commented 2 months ago

I'm not completely sure anymore if this can now be done with the bot (we did make some changes related to including the init scripts). Let's try it, and otherwise I can do it manually.

bedroge commented 2 months ago

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

eessi-bot[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-aws
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/generic` from `bedroge`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/generic`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/generic` resulted in:
  - submitted job `14519`, for details & status see https://github.com/EESSI/software-layer/pull/623#issuecomment-2230248346

eessi-bot[bot] commented 2 months ago

Updates by the bot instance eessi-bot-mc-azure
- received bot command `build repo:eessi.io-2023.06-software arch:x86_64/generic` from `bedroge`
  - expanded format: `build repository:eessi.io-2023.06-software architecture:x86_64/generic`
- handling command `build repository:eessi.io-2023.06-software architecture:x86_64/generic` resulted in:
  - no jobs were submitted

eessi-bot[bot] commented 2 months ago

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.07/pr_623/14519

| date | job status | comment |
| --- | --- | --- |
| Jul 16 07:50:03 UTC 2024 | submitted | job id 14519 awaits release by job manager |
| Jul 16 07:50:26 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Jul 16 11:44:04 UTC 2024 | running | job 14519 is running |
| Jul 16 11:56:25 UTC 2024 | finished | :grin: SUCCESS |
| Jul 16 11:56:25 UTC 2024 | test result | :grin: SUCCESS |
| Jul 16 12:01:55 UTC 2024 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-generic-1721130246.tar.gz to S3 bucket succeeded |

Details (finished)
- :white_check_mark: job output file slurm-14519.out
- :white_check_mark: no message matching ERROR:
- :white_check_mark: no message matching FAILED:
- :white_check_mark: no message matching required modules missing:
- :white_check_mark: found message(s) matching No missing installations
- :white_check_mark: found message matching .tar.gz created!

Artefacts
- eessi-2023.06-software-linux-x86_64-generic-1721130246.tar.gz
  - size: 0 MiB (2121 bytes)
  - entries: 1
  - modules under 2023.06/software/linux/x86_64/generic/modules/all
    - no module files in tarball
  - software under 2023.06/software/linux/x86_64/generic/software
    - no software packages in tarball
  - other under 2023.06/software/linux/x86_64/generic
    - 2023.06/init/eessi_environment_variables

Details (test result)
- ReFrame Summary: [ PASSED ] Ran 14/14 test case(s) from 14 check(s) (0 failure(s), 0 skipped, 0 aborted)
- :white_check_mark: job output file slurm-14519.out
- :white_check_mark: no message matching ERROR:
- :white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

bedroge commented 2 months ago

Looks like it actually worked :tada:.

I also tested the fix on our Rocky 8 cluster: I got an error without the fix, but after applying the fix the curl command worked fine.
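
A before/after comparison like this could be scripted roughly as in the sketch below. It is not the procedure used in this thread: `check_curl_ca` and `CURL_BIN` are hypothetical names, and the sketch assumes the fix works through curl's `CURL_CA_BUNDLE` environment variable. Exit codes 60 and 77 are the ones curl documents for certificate verification and CA file problems.

```shell
# check_curl_ca URL [BUNDLE]
# Hypothetical helper: fetch URL with CURL_CA_BUNDLE either removed from
# the environment (no BUNDLE given) or set to BUNDLE, then summarise
# curl's exit code. CURL_BIN lets a different curl binary be selected.
check_curl_ca() {
    _url="$1"
    _bundle="${2:-}"
    _rc=0
    if [ -n "$_bundle" ]; then
        CURL_CA_BUNDLE="$_bundle" "${CURL_BIN:-curl}" \
            --silent --show-error --output /dev/null "$_url" || _rc=$?
    else
        env -u CURL_CA_BUNDLE "${CURL_BIN:-curl}" \
            --silent --show-error --output /dev/null "$_url" || _rc=$?
    fi
    case "$_rc" in
        0)     echo "ok" ;;
        60|77) echo "CA certificate problem (curl exit code $_rc)" ;;
        *)     echo "other failure (curl exit code $_rc)" ;;
    esac
    return "$_rc"
}

# Example (needs network access, so it is left commented out):
#   check_curl_ca https://www.example.com
#   check_curl_ca https://www.example.com /etc/pki/tls/certs/ca-bundle.crt
```

Running the helper once without and once with a bundle path, inside an initialised EESSI environment, would show whether the variable is what makes the difference on a given host.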