Closed lmoureaux closed 2 months ago
A new Issue was created by @lmoureaux.
@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.
The intent seems to be to mount them only if they exist, but they are always mounted.
This is not true: cmssw-env only mounts a path if it exists both on the host and in the container. I think that in old versions of Singularity one could only mount a path if it already existed in the container. That is why we created a selected set of paths in the container (e.g. /hdfs, /hadoop, etc.), and that is why you always see these in the container. I see that newer versions can mount any path, even if it is not present in the container.
@lmoureaux, note that cmssw-env passes all unknown command-line arguments to apptainer, so why not use cmssw-elX -B /dir1 -B /dir2? It should mount your selected set of paths. You can also set the APPTAINER_BINDPATH environment variable to list extra paths to mount. Or you can just create an alias: alias cmssw-el7='cmssw-el7 -B /dir1 -B /dir2'.
I prefer that users set either APPTAINER_BINDPATH or an alias via their login script instead of blindly mounting every
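As a minimal sketch of the login-script approach suggested above: the helper name build_bindpath and the variable site_binds are hypothetical, and the directory list is only an example that would need adjusting per site. It relies on APPTAINER_BINDPATH accepting a comma-separated list of paths.

```shell
# Hypothetical helper: print a comma-separated list of the given
# directories that exist on this host (the format APPTAINER_BINDPATH expects).
# The body runs in a subshell, so its variables do not leak into the caller.
build_bindpath() (
  out=""
  for d in "$@"; do
    [ -d "$d" ] || continue
    out="${out:+$out,}$d"
  done
  printf '%s\n' "$out"
)

# Example site list (the DESY paths mentioned in this thread); adjust per site.
site_binds="$(build_bindpath /asap3 /beegfs /gpfs /nfs /pnfs)"
if [ -n "$site_binds" ]; then
  export APPTAINER_BINDPATH="$site_binds"
fi
```

Placed in a login script, this leaves the variable unset on hosts where none of the listed directories exist.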
The current scripts work great at CERN thanks to the bundled mount points, but setting them up elsewhere requires thinking, which creates unnecessary friction. It also means the tools have first-class support for lxplus and makes other sites feel "second class". Figuring out the mount points and command-line arguments for their local cluster is currently a task that every analyzer needs to repeat on their own (usually by word-of-mouth knowledge), which is not great.
I hope you agree that CMS should as much as possible provide equal support for all sites.
Now for the technical solution, you could keep whitelisting mount points, but just for DESY this adds 5 folders and maintaining the list will be painful. The solution I proposed is an alternative to this that I expect to "just work" in most cases.
Apart from a personal preference, is there a reason for not mounting everything? It would solve a problem many people are currently facing.
@lmoureaux, there is a reason why we had to add the --ignore-mount option to cmssw-env. On some user nodes (or different OSes), even mounting the existing set of mount points was breaking. That is why I hesitate to mount everything blindly.
/asap3
/beegfs
/gpfs
/nfs
/pnfs
seem safe to add, but the way cmssw-env works, we would have to rebuild all containers to create these paths inside the container; otherwise the logic at https://github.com/cms-sw/cmssw-osenv/blob/master/cmssw-env#L134-L138 will not mount these paths.
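For illustration, the "exists on both host and container" rule described in this comment could look roughly like the sketch below. This is not the actual cmssw-env code: the function name bind_args is hypothetical, and it assumes the image is available as an unpacked directory tree so that its contents can be inspected from the host.

```shell
# Hypothetical sketch, not the actual cmssw-env code.
# Usage: bind_args IMAGE_ROOT DIR...
# Prints one "-B DIR" line for every DIR that is a directory on the host
# and also exists as a directory inside the unpacked image at IMAGE_ROOT.
bind_args() (
  image_root="$1"
  shift
  for d in "$@"; do
    if [ -d "$d" ] && [ -d "$image_root$d" ]; then
      printf '%s %s\n' -B "$d"
    fi
  done
)
```

Under this rule, a host directory such as /pnfs is skipped unless the image already contains an empty /pnfs, which is why adding paths means rebuilding the containers.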
Thanks @smuzaffar for the feedback!
I didn't know that these scripts were meant to be used with vastly different images. It seems that the original issue was #2 and especially https://github.com/cms-sw/cmssw/pull/32576. In this case the image runs find /, which is a plain DoS attack against any large filesystem. It won't affect just EOS but also any other network filesystem, including home folders. Preserving full backward compatibility for people using --ignore-mount a,b,c can be achieved by disabling automatic discovery when the option is present.
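A rough sketch of that backward-compatibility check, assuming the option is passed as a separate --ignore-mount argument as written above. The function name auto_discovery_enabled is hypothetical and does not exist in cmssw-env.

```shell
# Hypothetical sketch: succeed (return 0) only when --ignore-mount is absent
# from the arguments, so automatic discovery stays off for existing users of
# that option. Note that the loop variable "arg" is not local to the function.
auto_discovery_enabled() {
  for arg in "$@"; do
    case "$arg" in
      --ignore-mount) return 1 ;;
    esac
  done
  return 0
}
```

A wrapper could then call auto_discovery_enabled "$@" before adding any discovered bind mounts.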
Regarding the issues that could make -B fail if the destination didn't exist: the code for this was added back in 2019, when sl6 (kernel 2.6) was still around as a host environment. Today the most outdated host we'll reasonably see is cc7, and bind mounts work fine there. Also, apptainer (since Singularity 3.0) falls back to a different binding strategy (underlay) that doesn't require overlayfs. So I'm guessing that the workaround of creating all possible mount points in the image is actually no longer needed, but I cannot test this without more information on failing configurations.
Thanks @lmoureaux for the details. Right, as there are no slc6 hosts now, we can remove the check for directory existence within the container. I will update the script to add the extra 5 mount points (if available on the host).
Yes, these scripts are used for running Fedora, CentOS Stream, CentOS, RHEL UBI, Alma and Rocky Linux containers for the x86_64, aarch64 and riscv64 architectures. In the past we also noticed that mounting /cvmfs/grid.cern.ch/etc/grid-security to /etc/grid-security broke for certain images.
cmssw-env hard-codes a list of filesystems to be mounted in the container: https://github.com/cms-sw/cmssw-osenv/blob/35500d47783eb0aa2b9ed7bd4a71e4ee0fe90612/cmssw-env#L72 (The intent seems to be to mount them only if they exist, but they are always mounted.)
This list is complete at CERN, but other sites have more. For instance at DESY, one would also want to test for /asap3, /beegfs, /gpfs, /nfs and /pnfs.
These are all mass storage filesystems that the user may want to access. While users could modify cmssw-env for their site or set MOUNT_POINTS, this requires extra work. To my knowledge, this is also undocumented.
I propose to invert the logic: instead of whitelisting a set of known mount points, mount all folders in / that are not in the image.
This would be a more general solution than maintaining a hard-coded list of mount points and would work out of the box in many cases.
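The inverted logic proposed here could be sketched as follows. This is an illustration, not the code from the original proposal: the function name discover_binds and the optional HOST_ROOT argument (added only so the function can be tried against a scratch directory) are hypothetical, and the image is assumed to be an unpacked directory tree.

```shell
# Hypothetical sketch of the proposed discovery.
# Usage: discover_binds IMAGE_ROOT [HOST_ROOT]
# Prints one "-B /NAME" line for every top-level directory of the host
# (HOST_ROOT defaults to "/") that has no counterpart in the image.
discover_binds() (
  image_root="$1"
  host_root="${2:-}"
  for path in "$host_root"/*/; do
    [ -d "$path" ] || continue   # also skips the unexpanded pattern
    name="${path%/}"
    name="${name##*/}"
    # Anything the image already provides (/usr, /etc, ...) is left alone.
    [ -e "$image_root/$name" ] && continue
    printf '%s %s\n' -B "/$name"
  done
)
```

Only the top level of / is listed, so no recursive scan of network filesystems is involved.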