Closed jamesbeedy closed 1 month ago
I'm slightly worried about deployment times if we start installing all common packages, even if the user only wants a minimal Slurm deployment. Some brainstorming:
Hey @jedel1043,
On a local VM, pulling the packages over the WAN, it took 15.06 seconds to run `sudo apt install openmpi-bin libpmix-dev nfs-common -y`.
Installed sizes in KiB (from the Installed-Size field of apt-cache show): openmpi-bin 554, nfs-common 885, libpmix-dev 4640.
$ sudo apt-cache show openmpi-bin
Package: openmpi-bin
Architecture: amd64
Version: 4.1.2-2ubuntu1
Multi-Arch: foreign
Priority: optional
Section: universe/net
Source: openmpi
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Alastair McKinstry <mckinstry@debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 554
Depends: libc6 (>= 2.34), libevent-core-2.1-7 (>= 2.1.8-stable), libopenmpi3 (>= 4.1.2), openmpi-common (>= 4.1.2-2ubuntu1), openssh-client | ssh-client
Suggests: gfortran | fortran-compiler
Conflicts: openmpi-bin
Breaks: lam4-dev (<< 7.1.4-4), libmpich-dev (<< 3.3~b1-5), libopenmpi-dev (<< 4.0.5-3), mpich (<< 3.3~b1-5)
Replaces: libopenmpi-dev (<< 4.0.5-3)
Filename: pool/universe/o/openmpi/openmpi-bin_4.1.2-2ubuntu1_amd64.deb
Size: 115648
MD5sum: 439e62bab3f4823c374122f3cff0b5da
SHA1: a183e473ea7cb9d2b6d7a8834d4c52fef16b9111
SHA256: a6fa0b564def24fa9ecbfb7f2419e929d74847fc3c2146ea4c38e7be6369ab79
SHA512: 4f4c89fe71fc218bf90e93c32a7f865074daada9479af4c5edd1f305d547b0431e6a527707075160c5f5192d6b9667f82fcbae3fcd1b82a5631e9e8abb386445
Homepage: https://www.open-mpi.org/
Description-en: high performance message passing library -- binaries
Open MPI is a project combining technologies and resources from several other
projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build the best
MPI library available. A completely new MPI-3.1 compliant implementation, Open
MPI offers advantages for system and software vendors, application developers
and computer science researchers.
.
Features:
* Full MPI-3.1 standards conformance
* Thread safety and concurrency
* Dynamic process spawning
* High performance on all platforms
* Reliable and fast job management
* Network and process fault tolerance
* Support network heterogeneity
* Single library supports all networks
* Run-time instrumentation
* Many job schedulers supported
* Internationalized error messages
* Component-based design, documented APIs
.
This package contains the Open MPI utility programs.
Description-md5: 1a00d4dd7be41a0a9fd2a922b4135736
ubuntu@juju-e60ad9-0:~$ sudo apt-cache show libpmix-dev
Package: libpmix-dev
Architecture: amd64
Version: 4.1.2-2ubuntu1
Multi-Arch: same
Priority: optional
Section: universe/libdevel
Source: pmix
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Alastair McKinstry <mckinstry@debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 4640
Depends: libpmix2 (= 4.1.2-2ubuntu1), libevent-dev, libhwloc-dev, zlib1g-dev
Breaks: libpmix2 (<< 4.1.0~rc1-1)
Replaces: libpmix2 (<< 4.1.0~rc1-1)
Filename: pool/universe/p/pmix/libpmix-dev_4.1.2-2ubuntu1_amd64.deb
Size: 804768
MD5sum: 130e1af7d5265a49257425863306411e
SHA1: 9c01f290e22020e33e98e6f508513748f353a047
SHA256: 10d08bd2061da4ce86c8d4e4bf8b1b2c3355fed038c4e07cb827c7ccf37ca978
SHA512: c50bcd236adec425cfce2bbed896db72d8607e9d3e5e8fdcf45ab3d5807895472dba5c991bf6b291137b977711101bbb8be36af32e9fe99d64cf259e665edcd6
Homepage: https://github.com/pmix/pmix
Description-en: Development files for the PMI Exascale library
This is the OpenMPI implementation of the Process Management Interface (PMI)
Exascale API. PMIx aims to retain transparent compatibility with the existing
PMI-1 and PMI-2 definitions, and any future PMI releases; Support
the Instant On initiative for rapid startup of applications at exascale
and beyond.
Description-md5: 40649f4e98770885669b0326273d7233
ubuntu@juju-e60ad9-0:~$ sudo apt-cache show nfs-common
Package: nfs-common
Architecture: amd64
Version: 1:2.6.1-1ubuntu1.2
Priority: optional
Section: net
Source: nfs-utils
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Debian kernel team <debian-kernel@lists.debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 885
Provides: nfs-client
Pre-Depends: init-system-helpers (>= 1.54~)
Depends: libc6 (>= 2.34), libcap2 (>= 1:2.10), libcom-err2 (>= 1.43.9), libdevmapper1.02.1 (>= 2:1.02.97), libevent-core-2.1-7 (>= 2.1.8-stable), libgssapi-krb5-2 (>= 1.17), libkeyutils1 (>= 1.5.9), libkrb5-3 (>= 1.10+dfsg~alpha1), libmount1 (>= 2.19.1), libnfsidmap1 (= 1:2.6.1-1ubuntu1.2), libtirpc3 (>= 1.0.2), libwrap0 (>= 7.6-4~), rpcbind, adduser, ucf, lsb-base, keyutils, python3
Suggests: open-iscsi, watchdog
Conflicts: nfs-client
Replaces: nfs-client
Filename: pool/main/n/nfs-utils/nfs-common_2.6.1-1ubuntu1.2_amd64.deb
Size: 240800
MD5sum: b4360c7a36420c29746fb79e11a744de
SHA1: 0c693685905abbfc56fc496c6f72342c8cb849cd
SHA256: 100a6adfb980ffbfacf2318ecb56421b6299e3da64eb9cbd4c2f5e73aa9588c3
SHA512: 08e810cfb37a56c8b2d9606698f34b39429e3ce107f3df998eb52b7ac8c1af722f752b2428972d7cca53c2bdfe1ec9e0b0fb38a54605d42d88522bcddabf4284
Homepage: https://linux-nfs.org/
Description-en: NFS support files common to client and server
Use this package on any machine that uses NFS, either as client or
server. Programs included: lockd, statd, showmount, nfsstat, gssd,
idmapd and mount.nfs.
Description-md5: c2f5fd5a7d525f1cc35fbb49cc8628fd
Package: nfs-common
Architecture: amd64
Version: 1:2.6.1-1ubuntu1
Priority: optional
Section: net
Source: nfs-utils
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Debian kernel team <debian-kernel@lists.debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 885
Provides: nfs-client
Pre-Depends: init-system-helpers (>= 1.54~)
Depends: libc6 (>= 2.34), libcap2 (>= 1:2.10), libcom-err2 (>= 1.43.9), libdevmapper1.02.1 (>= 2:1.02.97), libevent-core-2.1-7 (>= 2.1.8-stable), libgssapi-krb5-2 (>= 1.17), libkeyutils1 (>= 1.5.9), libkrb5-3 (>= 1.10+dfsg~alpha1), libmount1 (>= 2.19.1), libnfsidmap1 (= 1:2.6.1-1ubuntu1), libtirpc3 (>= 1.0.2), libwrap0 (>= 7.6-4~), rpcbind, adduser, ucf, lsb-base, keyutils, python3
Suggests: open-iscsi, watchdog
Conflicts: nfs-client
Replaces: nfs-client
Filename: pool/main/n/nfs-utils/nfs-common_2.6.1-1ubuntu1_amd64.deb
Size: 240754
MD5sum: 0fa3f6e487197025dcebd3bee075e1a6
SHA1: 462c5868493a963d4c38ffd8cff75fbcb7fc0b30
SHA256: cb77f5befccc10d7dbd72cc609df378eacae2c75fd89ff4a1efaa8e5b8ca4a7e
SHA512: 2f90b09d36bcc47c44e39d3b5ed41f58830099e4e8e4d8768b562e813ac380b9411bb997d2e945eafd1c3b6aef9d071e829910c3648e64b49d210a19a397b9e6
Homepage: https://linux-nfs.org/
Description-en: NFS support files common to client and server
Use this package on any machine that uses NFS, either as client or
server. Programs included: lockd, statd, showmount, nfsstat, gssd,
idmapd and mount.nfs.
Description-md5: c2f5fd5a7d525f1cc35fbb49cc8628fd
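For a quick sanity check on the numbers above, here is a minimal sketch that totals the `Installed-Size` (KiB) and `Size` (download bytes) fields from `apt-cache show` style output. The function names and the trimmed `SAMPLE` text are illustrative, not part of any charm. It assumes each stanza starts with a `Package:` line and keeps only the first stanza listed per package. It counts only the three named packages, not the dependencies apt would pull in alongside them.

```python
import re

# Trimmed copy of the fields pasted above; every other field is omitted.
SAMPLE = """\
Package: openmpi-bin
Version: 4.1.2-2ubuntu1
Installed-Size: 554
Size: 115648
Package: libpmix-dev
Version: 4.1.2-2ubuntu1
Installed-Size: 4640
Size: 804768
Package: nfs-common
Version: 1:2.6.1-1ubuntu1.2
Installed-Size: 885
Size: 240800
Package: nfs-common
Version: 1:2.6.1-1ubuntu1
Installed-Size: 885
Size: 240754
"""

FIELD = re.compile(r"^(Package|Version|Installed-Size|Size): (.+)$")


def parse_records(text):
    """Return one dict per stanza, treating each 'Package:' line as a new stanza."""
    records, current = [], {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # skip prompts, descriptions, hashes, and so on
        key, value = match.groups()
        if key == "Package" and current:
            records.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        records.append(current)
    return records


def total_sizes(text):
    """Sum Installed-Size (KiB) and Size (bytes), first stanza per package only."""
    seen, installed_kib, download_bytes = set(), 0, 0
    for record in parse_records(text):
        name = record["Package"]
        if name in seen:
            continue  # apt-cache show may list several versions of one package
        seen.add(name)
        installed_kib += int(record["Installed-Size"])
        download_bytes += int(record["Size"])
    return installed_kib, download_bytes


if __name__ == "__main__":
    kib, size = total_sizes(SAMPLE)
    print(f"installed: {kib} KiB, download: {size} bytes")
```

With the values pasted above, this gives 6079 KiB installed and 1161216 bytes downloaded for the three packages themselves.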
Yeah, barely 1MB sounds like something we shouldn't need to worry about :)
`nfs-client-operator`, and people will most likely use Lustre instead.
We also have CephFS there, but it probably won't be used as a filesystem.
`nfs-common` by default, but `libpmix-dev` and `openmpi-bin` should be there.
@jedel1043 @wolsen @arif-ali thanks for sharing the community meeting notes here :smile:
Bug Description
Without `nfs-common`, `libpmix-dev`, and `openmpi-bin`, users will not be able to connect NFS storage or launch MPI jobs. Since we know this is something everyone will want to do right away, I suggest automatically adding `nfs-common`, `libpmix-dev`, and `openmpi-bin` to the `slurmd` charm, and `nfs-common` to `slurmctld`, to get the user one step closer to having a working cluster. Additionally, without `libpmix-dev`, slurmd will spew errors to the log saying that it can't load the pmix library, because the pmix libs need to be installed on the system.
To Reproduce
Environment
any
Relevant log output
Additional context
na
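The suggestion in the bug description could be sketched roughly as follows. This is not the charms' actual install code: `PACKAGES_BY_CHARM`, `build_install_command`, and `install_common_packages` are hypothetical names, and the mapping simply mirrors the proposal (reading the package intended for `slurmctld` as `nfs-common`). The `run` parameter exists only so the command can be inspected without touching the system.

```python
import subprocess

# Hypothetical mapping that mirrors the proposal in the bug description.
PACKAGES_BY_CHARM = {
    "slurmd": ["nfs-common", "libpmix-dev", "openmpi-bin"],
    "slurmctld": ["nfs-common"],
}


def build_install_command(charm):
    """Return the apt-get argv for a charm, or an empty list if nothing is needed."""
    packages = PACKAGES_BY_CHARM.get(charm, [])
    if not packages:
        return []
    return ["apt-get", "install", "--yes", *packages]


def install_common_packages(charm, run=subprocess.run):
    """Install the extra packages for a charm and return the command that was used."""
    command = build_install_command(charm)
    if command:
        # check=True raises CalledProcessError if apt-get exits non-zero.
        run(command, check=True)
    return command


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for name in PACKAGES_BY_CHARM:
        print(" ".join(build_install_command(name)))
```

Keeping the package list in one mapping would also make it easy to drop `nfs-common` from the defaults later, if a minimal deployment is preferred.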