cms-sw / cmssw-osenv

Provides wrapper scripts to provide CMSSW env using singularity

Mount additional filesystems #18

Closed lmoureaux closed 1 week ago

lmoureaux commented 3 weeks ago

cmssw-env hard-codes a list of filesystems to be mounted in the container: https://github.com/cms-sw/cmssw-osenv/blob/35500d47783eb0aa2b9ed7bd4a71e4ee0fe90612/cmssw-env#L72

(The intent seems to be to mount them only if they exist, but they are always mounted.)

This list is complete at CERN, but other sites have more. For instance at DESY, one would want to test for:

/asap3
/beegfs
/gpfs
/nfs
/pnfs

These are all mass storage filesystems that the user may want to access. While users could modify cmssw-env for their site or set MOUNT_POINTS, this requires extra work. To my knowledge, this is also undocumented.

I propose to invert the logic: instead of whitelisting a set of known mount points, mount all folders in / that are not in the images:

# Mount all filesystems in / except for things already in the image.
blacklist="bin dev etc lib lib64 mnt opt proc root run sbin singularity sys usr var"
mounts=$(comm -23 <(ls /) <(echo $blacklist | tr " " "\n" | sort))  # https://stackoverflow.com/a/11964477
for dir in $mounts; do
    # ls / prints bare names, so restore the leading slash to get an absolute path.
    MOUNT_POINTS="${MOUNT_POINTS},/${dir}"
done

This would be a more general solution than maintaining a hard-coded list of mount points and would work out of the box in many cases.

cmsbuild commented 3 weeks ago

cms-bot internal usage

cmsbuild commented 3 weeks ago

A new Issue was created by @lmoureaux.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

smuzaffar commented 2 weeks ago

"The intent seems to be to mount them only if they exist, but they are always mounted."

This is not true: cmssw-env only mounts a path if it exists both on the host and in the container. I think that in old versions of singularity one could only mount a path if it already existed in the container. That is why we created a selected set of paths in the container (e.g. /hdfs, /hadoop, etc.), and that is why you always see these in the container. I see that newer versions can mount any path, even if it is not present in the container.

@lmoureaux, note that cmssw-env passes all unknown command-line arguments to apptainer, so why not use cmssw-elX -B /dir1 -B /dir2? It should mount your selected set of paths. You can also set the APPTAINER_BINDPATH environment variable to list the extra paths to mount. Or you can just create an alias cmssw-el7='cmssw-el7 -B /dir1 -B /dir2'.

I prefer that users set either APPTAINER_BINDPATH or an alias in their login script instead of blindly mounting everything.
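As a rough illustration of the two login-script options mentioned above, the following sketch sets APPTAINER_BINDPATH and defines an alias. The helper name join_bind_paths is made up for this example, and /beegfs and /pnfs are only sample paths taken from the DESY list; substitute the filesystems of your own site.

```shell
# Hypothetical helper (not part of cmssw-env): join the given paths into
# the comma-separated form that APPTAINER_BINDPATH expects.
join_bind_paths() {
    local IFS=,
    echo "$*"
}

# Option 1: export the bind list from a login script.
# Note that this overwrites any value that was already set.
export APPTAINER_BINDPATH="$(join_bind_paths /beegfs /pnfs)"

# Option 2: wrap the command in an alias that passes -B explicitly.
alias cmssw-el7='cmssw-el7 -B /beegfs -B /pnfs'
```

Either option is probably enough on its own; using both would list the same paths twice.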

lmoureaux commented 2 weeks ago

The current scripts work great at CERN thanks to the bundled mount points, but setting them up elsewhere requires thinking, which creates unnecessary friction. It also means the tools have first-class support for lxplus and makes other sites feel "second class". Figuring out the mount points and command line arguments for their local cluster is currently a task that every analyzer needs to repeat on their own (usually by word-of-mouth knowledge), which is not great.

I hope you agree that CMS should as much as possible provide equal support for all sites.

Now for the technical solution, you could keep whitelisting mount points, but just for DESY this adds 5 folders and maintaining the list will be painful. The solution I proposed is an alternative to this that I expect to "just work" in most cases.

Apart from a personal preference, is there a reason for not mounting everything? It would solve a problem many people are currently facing.

smuzaffar commented 2 weeks ago

@lmoureaux, there is a reason why we had to add the --ignore-mount option to cmssw-env. On some user nodes (or different OSes), even mounting the existing set of mount points was breaking. That is why I hesitate to mount everything blindly.

/asap3
/beegfs
/gpfs
/nfs
/pnfs

seem safe to add, but the way cmssw-env works, we would have to rebuild all containers to create these paths inside the container; otherwise the logic at https://github.com/cms-sw/cmssw-osenv/blob/master/cmssw-env#L134-L138 will not mount these paths.

lmoureaux commented 2 weeks ago

Thanks @smuzaffar for the feedback!

I didn't know that these scripts were meant to be used with vastly different images. It seems that the original issue was #2 and especially https://github.com/cms-sw/cmssw/pull/32576. In this case the image runs "find /", which is a plain DoS attack against any large filesystem. It won't affect just EOS but also any other network filesystem, including home folders. Full backward compatibility for people using --ignore-mount a,b,c can be preserved by disabling automatic discovery when the option is present.
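A minimal sketch of that idea, assuming --ignore-mount is passed as a separate command-line token. The function names has_ignore_mount and discover_mounts are hypothetical and do not exist in cmssw-env.

```shell
# Hypothetical helper: succeed if --ignore-mount appears among the arguments.
has_ignore_mount() {
    local arg
    for arg in "$@"; do
        [ "$arg" = "--ignore-mount" ] && return 0
    done
    return 1
}

# Hypothetical helper: print a comma-prefixed list of the directories directly
# under $1 whose names are not in the space-separated exclusion list $2.
discover_mounts() {
    local root="${1%/}" path name
    for path in "$root"/*; do
        [ -d "$path" ] || continue
        name="${path##*/}"
        case " $2 " in
            *" $name "*) continue ;;
        esac
        printf ',%s' "$path"
    done
}

excluded="bin dev etc lib lib64 mnt opt proc root run sbin singularity sys usr var"

# Only discover mount points automatically when the user did not ask
# to ignore specific mounts.
if ! has_ignore_mount "$@"; then
    MOUNT_POINTS="${MOUNT_POINTS:-}$(discover_mounts / "$excluded")"
fi
```

With this shape, anyone already relying on --ignore-mount would keep the current behaviour, while everyone else would get the automatic list.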

Regarding the issues that could make -B fail if the destination didn't exist: the code for this was added back in 2019, when sl6 (kernel 2.6) was still around as a host environment. Today the most outdated host we'll reasonably see is cc7, and bind mounts work fine there. Also, Singularity since 3.0 (and its successor Apptainer) falls back to a different binding strategy (underlay) that doesn't require overlayfs. So I'm guessing that the workaround of creating all possible mount points in the image is no longer needed, but I cannot test this without more information on failing configurations.

smuzaffar commented 2 weeks ago

Thanks @lmoureaux for the details. Right, since there are no slc6 hosts any more, we can remove the check for directory existence within the container. I will update the script to add the 5 extra mounts (if available on the host).
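For illustration only, such an update could look roughly like the sketch below. It is not the actual change to cmssw-env; the function name add_existing_mounts is invented here, and the comma-prefixed MOUNT_POINTS format is copied from the snippet earlier in this thread.

```shell
# Hypothetical helper: append each argument that exists on the host to
# MOUNT_POINTS, without checking whether it exists in the container.
add_existing_mounts() {
    local dir
    for dir in "$@"; do
        [ -e "$dir" ] || continue
        MOUNT_POINTS="${MOUNT_POINTS:-},${dir}"
    done
}

# The five DESY filesystems from the original report.
add_existing_mounts /asap3 /beegfs /gpfs /nfs /pnfs
```

On a host without any of these paths, MOUNT_POINTS is left unchanged.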

Yes, these scripts are used for running Fedora, CentOS Stream, CentOS, RHEL UBI, Alma and Rocky Linux containers on the x86_64, aarch64 and riscv64 architectures. In the past we also noticed that mounting /cvmfs/grid.cern.ch/etc/grid-security to /etc/grid-security broke for certain images.