sjpb closed 3 months ago
FAILED Fat image build: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/7643335283
Edit: had wrong image type in glance
Tests at 0982f41 on a "local" cluster:
ondemand exporter:
[rocky@rl9-login-0 ~]$ systemctl status ondemand_exporter.service
Jan 24 16:26:24 rl9-login-0.rl9.invalid ondemand_exporter[36589]: ts=2024-01-24T16:26:24.868Z caller=collector.go:171 level=error msg="Error collecting apache information" err="Get \"http://localhost:81/server-status\":>
[rocky@rl9-login-0 ~]$ cat /usr/lib/systemd/system/ondemand_exporter.service
Environment="APACHE_STATUS_URL=http://localhost:81/server-status"
[rocky@rl9-login-0 ~]$ curl localhost:9301/metrics
# shows this is working at least
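To cross-check which URL the exporter is scraping against the one in the error message, the Environment= line of the unit file can be parsed. This is only a rough sketch: the helper name unit_environment is hypothetical, and it handles just the quoted "KEY=VALUE" and bare KEY=VALUE forms, not the full systemd quoting rules.

```python
import re
from urllib.parse import urlsplit


def unit_environment(unit_text: str) -> dict:
    """Collect KEY=VALUE pairs from Environment= lines of a systemd unit file.

    Hypothetical helper: covers only quoted "KEY=VALUE" and bare KEY=VALUE
    assignments, which is enough for the line shown in this thread.
    """
    env = {}
    prefix = "Environment="
    for line in unit_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # Each match is either a double-quoted assignment or a bare token.
        for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', line[len(prefix):]):
            key, sep, value = (quoted or bare).partition("=")
            if sep:
                env[key] = value
    return env


# The Environment= line quoted from ondemand_exporter.service above.
unit_line = 'Environment="APACHE_STATUS_URL=http://localhost:81/server-status"'
status_url = unit_environment(unit_line)["APACHE_STATUS_URL"]
print(status_url, urlsplit(status_url).port)
```

The port and path extracted here are the ones to compare with whatever Apache is actually serving server-status on.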
Fat image build: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/7653662893
Edit: currently failing due to CVMFS repo 503-ing
Edit: repo appears up, retrying
Checked locally that e5608d9 works on both a) a cluster with existing non-system users b) fresh image
NB: Currently CI doesn't get past the os-manila-mount install task b/c the rpm-reef URL at https://download.ceph.com/ has been broken/renamed.
Edit: see https://tracker.ceph.com/issues/64718
Repos fixed, let's try again
Rebuilding fat image: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/8263123087
Built
Tests at 43d43f2 on "local" cluster:
Make the appliance compatible with RockyLinux 9-based images.
Note that the CI and CaaS environments will continue to use RL8 at present. CI is only carried out using RL9 if a PR branch name starts with "rl9", or RL9 is selected when running CI workflows manually.
Additional notes:
--cgroup-manager=cgroupfs. This was demonstrated to be the default in RL8; for RL9 the default is systemd, which leads to log warnings like:
However, enabling user-lingering leads to mysql failing to start at all, with the relevant error probably being:
Replaces #323
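The rule in the description for when CI runs against RL9 can be written as a small predicate. This is a sketch of the rule as stated, not the actual workflow logic: the function name uses_rl9 and the manual_selection parameter are hypothetical, and treating the branch prefix as case-sensitive is an assumption.

```python
from typing import Optional


def uses_rl9(branch: str, manual_selection: Optional[str] = None) -> bool:
    """Return True if CI should run on RL9 under the rule in the PR description.

    Assumptions (not confirmed by the source): the "rl9" branch prefix is
    matched case-sensitively, and a manual run passes the chosen release
    as the string "RL9".
    """
    return branch.startswith("rl9") or manual_selection == "RL9"


print(uses_rl9("rl9/some-feature"))                      # branch prefix triggers RL9
print(uses_rl9("some-feature"))                          # default stays on RL8
print(uses_rl9("some-feature", manual_selection="RL9"))  # manual selection triggers RL9
```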