clusterinthecloud / support

If you need help with Cluster in the Cloud, this is the right place

Slurm doesn't install any more when building images on Oracle Linux #44

Closed chryswoods closed 2 years ago

chryswoods commented 2 years ago

Running sudo /usr/local/bin/run_packer fails while installing Slurm, with the error:

oracle-oci.oracle: failed: [default] (item=slurmd) => {"ansible_loop_var": "item", "changed": false, "item": "slurmd", "msg": "Failed to download metadata for repo 'slurm': Yum repo downloading error: Downloading error(s): repodata/a8cffd16ca77c1c82a096e274a8733a4defb0fa64a3baeb523ea8004f6eeb40c-primary.xml.gz - Cannot download, all mirrors were already tried without success; repodata/139786b1725c3cc419c37f6851e3921a44f53ef9e1e7cacd23f864a751788aa0-filelists.xml.gz - Cannot download, all mirrors were already tried without success", "rc": 1, "results": []}

Looking into this, I think that the slurmd package is now called slurm-slurmd on Oracle Linux. Updating slurm_packages in /root/citc-ansible/roles/slurm/molecule/compute/molecule.yml to read

        slurm_packages:
          - slurm-slurmd

allows slurm to be installed and the image to build. I couldn't find an equivalent for libpmi, but a slurm-freeipmi package is installed as a dependency of slurm-slurmd.

Is this the right fix?

milliams commented 2 years ago

This seemed to be an ephemeral issue with the download mirror for the package, rather than anything wrong with the package itself.
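
If the failure really is a transient mirror problem, simply re-running the build later may be enough. As a rough sketch only, and not part of Cluster in the Cloud, a small wrapper could retry the command a few times before giving up. The function name `run_with_retries`, the attempt count, and the delay are all assumptions chosen for illustration:

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay=30.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying when it exits with a non-zero status.

    Hypothetical helper: returns the number of the attempt that
    succeeded, or raises RuntimeError once every attempt has failed.
    The runner and sleep arguments exist only to make this testable.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")


# Example call (assumes the run_packer path quoted in this issue):
# run_with_retries(["sudo", "/usr/local/bin/run_packer"], attempts=3, delay=60)
```

A retry like this only helps with temporary download errors. If the package had actually been renamed, every attempt would fail in the same way.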