Open reidpr opened 5 years ago
Opinions on the enumerated options.
Option 0: This can all be accomplished as a normal user but seems cumbersome. It also might not work well with any system where step 4 is a privileged mount handled by some automatic process on the cluster (e.g., the user left the squashball in some place the automatic process couldn't find it). This is essentially what the Shifter image gateway does, however, so it is workable in principle.
Option 1: This is the approach I would instinctively (and perhaps naively) pursue. Can you elaborate more on why it doesn't address the Cray MPICH-style use cases that ch-fromhost already supports? From my limited understanding of how ch-fromhost handles the CrayPICH case it's not obvious to me why it's unsupported.
Option 2: This might be a reasonable approach for some large fraction of applications. In my experience, Reid's /usr/local/lib is abnormally large. On my development box /usr/local/lib is 33 files.
Option 3: Yuck? This smells bad. EDIT: Sorry, I re-read the OP again and this doesn't smell as bad. I'd be interested in seeing this demonstrated by hand to have a really good grasp of how things end up looking at ch-run time.
Option 1: [...] Can you elaborate more on why it doesn't address the Cray MPICH-style use cases that ch-fromhost already supports?
The trick is that we need to put arbitrary files of arbitrary type in arbitrary directories. In the case of Cray MPICH, there are some random files and directories scattered here and there; in the case of OpenMPI we need shared libraries in OpenMPI's directories (e.g., /usr/local/lib/openmpi).
Option 2: This might be a reasonable approach for some large fraction of applications. In my experience, Reid's /usr/local/lib is abnormally large. On my development box /usr/local/lib is 33 files.
The bulk of it is Python modules for 3 different versions of libraries.
I believe this also gets caught up in the "arbitrary locations" problem. E.g., what if we need to inject into /opt or /usr/local?
Option 3: [...] I'd be interested in seeing this demonstrated by hand to have a really good grasp of how things end up looking at ch-run time.
Likewise.
Background
For various reasons, SquashFS is a viable alternative to the current recommendation of unpacking a tarball (with ch-tar2dir) into a tmpfs. SquashFS mounts are read-only. Issue #96 relates a different image mount approach (CVMFS) that is read-only.
While ch-run is happy with read-only image directories (and in fact re-mounts them read-only), ch-fromhost currently depends on modifying an unpacked image directory, which doesn't work with read-only image directories.
This seems sub-optimal. Below are some options to get ch-fromhost-like features to inject shared libraries and other files on read-only images. We have not articulated the pros and cons at this point, merely enumerated some options.
Option 0: Status quo
The current method to deal with this is, assuming SquashFS:
ch-tar2dir
ch-fromhost
mksquashfs(1) the modified directory, creating a non-portable image archive
ch-run
Option 1: Add another directory to ld.so.conf
Edit ld.so.conf to add a new directory, which we prepare in temporary space and bind-mount into the container at run time.
This lets us add additional shared libraries to the standard paths (i.e., to be found by ld.so, the dynamic linker), but doesn't address use cases like MCA modules for OpenMPI (which are also .so files) or miscellaneous files for Cray MPICH.
Copy /etc/ld.so.conf from the image directory to /tmp/ld.so.conf.
/mnt/chlib.
/tmp/ld.so.cache.
/tmp/chlib.
/tmp/chlib.
/tmp/ld.so.conf → /etc/ld.so.conf (overmount)
/tmp/ld.so.cache → /etc/ld.so.cache (overmount)
/tmp/chlib → /mnt/chlib
Run ldconfig within the container.
Option 2: Overmount directories with recursive copies
Make a recursive copy of any directories we want to inject into (e.g., in the case of OpenMPI, the first directory in ld.so.conf and /usr/local/lib/openmpi) into host /tmp. Add our files to those directories and bind-mount them in. Bind-mount in an empty, writeable ld.so.cache. Run ldconfig.
Note that we do need the first directory in ld.so.conf because we need to be able to override shared libraries installed anywhere. The first .so found wins.
The recursive copies can be substantial. E.g., /usr/local/lib on Reid's development box is 5,500 files.
Option 3: Overmount directories with symlink farms
Like Option 2, except instead of copying the overmounted directories, we bind-mount them to a second location within the image. In /tmp on the host, we create a new directory containing symlinks to the second location for all existing items, then add our new items.
This is not a recursive process because we need only address the first level in the overmounted directory.
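To make the symlink-farm idea a bit more concrete, here is a rough Python sketch of how the farm directory for Option 3 might be prepared on the host. It is only an illustration under assumptions, not how ch-fromhost works or would work: the function name build_symlink_farm, its parameters, and the example second location /mnt/orig are all hypothetical. It also assumes that an injected item with the same name as an existing one should replace it, and that new items are copied (not linked) because host paths may not exist inside the container.

```python
import os
import shutil


def build_symlink_farm(original_dir, farm_dir, second_location, new_items):
    """Prepare a symlink farm for one directory to be overmounted.

    original_dir:    the image directory as visible from the host
    farm_dir:        new directory on the host (e.g. somewhere under /tmp)
    second_location: path *inside the container* where original_dir will be
                     bind-mounted (hypothetical, e.g. "/mnt/orig")
    new_items:       dict mapping file name -> host path of a file to inject
    """
    os.makedirs(farm_dir, exist_ok=True)

    # One symlink per first-level entry; no recursion is needed because a
    # symlink to a subdirectory exposes everything beneath it.
    for name in os.listdir(original_dir):
        if name in new_items:
            continue  # assumption: an injected item overrides the original
        # The target only resolves inside the container, so the link is
        # dangling on the host. That is expected.
        os.symlink(os.path.join(second_location, name),
                   os.path.join(farm_dir, name))

    # Copy the injected files into the farm.
    for name, host_path in new_items.items():
        shutil.copy2(host_path, os.path.join(farm_dir, name))

    return sorted(os.listdir(farm_dir))
```

At run time, original_dir would be bind-mounted at second_location and farm_dir would be bind-mounted over the original path, followed by ldconfig if shared libraries were added. The number of symlinks equals the number of first-level entries, regardless of how many files live below them.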