vatlab / sos

SoS workflow system for daily data analysis
http://vatlab.github.io/sos-docs
BSD 3-Clause "New" or "Revised" License
274 stars 45 forks source link

Singularity clear environment when execute containers #1521

Closed gaow closed 1 year ago

gaow commented 1 year ago

We have an issue with singularity in SoS that boils down to this:

$ singularity exec ./TensorQTL.sif micromamba run -n TensorQTL python
ERROR: ld.so: object '/share/pkg.7/gcc/7.4.0/install/lib64/libgfortran.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
critical libmamba filesystem error: temp_directory_path: No such file or directory

It happens because singularity picks some user defined environment variable specific to host but not valid in the container. To fix it, simply add -e switch:

singularity exec -e ./TensorQTL.sif micromamba run -n TensorQTL python

then it works.

@BoPeng does it make sense that we add -e to the singularity call by default?

bioworkflows commented 1 year ago

Is this something new?

✗ singularity exec -h
Run a command within a container

Usage:
  singularity exec [exec options...] <container> <command>

Description:
  singularity exec supports the following formats:

  *.sif               Singularity Image Format (SIF). Native to Singularity 3.0+

  *.sqsh              SquashFS format.  Native to Singularity 2.4+

  *.img               ext3 format. Native to Singularity versions < 2.4.

  directory/          sandbox format. Directory containing a valid root file
                      system and optionally Singularity meta-data.

  instance://*        A local running instance of a container. (See the instance
                      command group.)

  library://*         A container hosted on a Library (default
                      https://cloud.sylabs.io/library)

  docker://*          A container hosted on Docker Hub

  shub://*            A container hosted on Singularity Hub

  oras://*            A container hosted on a supporting OCI registry

Options:
  -B, --bind strings    a user-bind path specification.  spec has the
                        format src[:dest[:opts]], where src and dest are
                        outside and inside paths.  If dest is not given,
                        it is set equal to src.  Mount options ('opts')
                        may be specified as 'ro' (read-only) or 'rw'
                        (read/write, which is the default). Multiple bind
                        paths can be given by a comma separated list.
      --disable-cache   dont use cache, and dont create cache
      --docker-login    login to a Docker Repository interactively
  -h, --help            help for exec
  -H, --home string     a home directory specification.  spec can either
                        be a src path or src:dest pair.  src is the source
                        path of the home directory outside the container
                        and dest overrides the home directory within the
                        container. (default "/Users/bpeng")
      --nohttps         do NOT use HTTPS, for communicating with local
                        docker registry
      --nonet           Disable VM network handling
      --vm-cpu string   Number of CPU cores to allocate to Virtual Machine
                        (implies --vm) (default "1")
      --vm-err          enable attaching stderr from VM
      --vm-ip string    IP Address to assign for container usage. Defaults
                        to DHCP within bridge network. (default "dhcp")
      --vm-ram string   Amount of RAM in MiB to allocate to Virtual
                        Machine (implies --vm) (default "1024")

SoS supports all these options, but not the -e, which I suspect to be --environ?

gaow commented 1 year ago

Hmm interesting. We are on version 3.9.4. Here is what we have:

$ singularity exec -h
Run a command within a container

Usage:
  singularity exec [exec options...] <container> <command>

Description:
  singularity exec supports the following formats:

  *.sif               Singularity Image Format (SIF). Native to Singularity 3.0+

  *.sqsh              SquashFS format.  Native to Singularity 2.4+

  *.img               ext3 format. Native to Singularity versions < 2.4.

  directory/          sandbox format. Directory containing a valid root file 
                      system and optionally Singularity meta-data.

  instance://*        A local running instance of a container. (See the instance
                      command group.)

  library://*         A SIF container hosted on a Library
                      (default https://cloud.sylabs.io/library)

  docker://*          A Docker/OCI container hosted on Docker Hub or another
                      OCI registry.

  shub://*            A container hosted on Singularity Hub.

  oras://*            A SIF container hosted on an OCI registry that supports
                      the OCI Registry As Storage (ORAS) specification.

Options:
      --add-caps string        a comma separated capability list to add
      --allow-setuid           allow setuid binaries in container (root only)
      --app string             set an application to run inside a container
      --apply-cgroups string   apply cgroups from file for container
                               processes (root only)
  -B, --bind strings           a user-bind path specification.  spec has
                               the format src[:dest[:opts]], where src and
                               dest are outside and inside paths.  If dest
                               is not given, it is set equal to src. 
                               Mount options ('opts') may be specified as
                               'ro' (read-only) or 'rw' (read/write, which
                               is the default). Multiple bind paths can be
                               given by a comma separated list.
  -e, --cleanenv               clean environment before running container
      --compat                 apply settings for increased OCI/Docker
                               compatibility. Infers --containall,
                               --no-init, --no-umask, --writable-tmpfs.
  -c, --contain                use minimal /dev and empty other
                               directories (e.g. /tmp and $HOME) instead
                               of sharing filesystems from your host
  -C, --containall             contain not only file systems, but also
                               PID, IPC, and environment
      --disable-cache          dont use cache, and dont create cache
      --dns string             list of DNS server separated by commas to
                               add in resolv.conf
      --docker-login           login to a Docker Repository interactively
      --drop-caps string       a comma separated capability list to drop
      --env strings            pass environment variable to contained process
      --env-file string        pass environment variables from file to
                               contained process
  -f, --fakeroot               run container in new user namespace as uid 0
      --fusemount strings      A FUSE filesystem mount specification of
                               the form '<type>:<fuse command>
                               <mountpoint>' - where <type> is 'container'
                               or 'host', specifying where the mount will
                               be performed ('container-daemon' or
                               'host-daemon' will run the FUSE process
                               detached). <fuse command> is the path to
                               the FUSE executable, plus options for the
                               mount. <mountpoint> is the location in the
                               container to which the FUSE mount will be
                               attached. E.g. 'container:sshfs 10.0.0.1:/
                               /sshfs'. Implies --pid.
  -h, --help                   help for exec
  -H, --home string            a home directory specification.  spec can
                               either be a src path or src:dest pair.  src
                               is the source path of the home directory
                               outside the container and dest overrides
                               the home directory within the container.
                               (default "/home/zq2209")
      --hostname string        set container hostname
  -i, --ipc                    run container in a new IPC namespace
      --keep-privs             let root user keep privileges in container
                               (root only)
      --mount stringArray      a mount specification e.g.
                               'type=bind,source=/opt,destination=/hostopt'.
  -n, --net                    run container in a new network namespace
                               (sets up a bridge network interface by default)
      --network string         specify desired network type separated by
                               commas, each network will bring up a
                               dedicated interface inside container
                               (default "bridge")
      --network-args strings   specify network arguments to pass to CNI plugins
      --no-home                do NOT mount users home directory if /home
                               is not the current working directory
      --no-https               use http instead of https for docker://
                               oras:// and library://<hostname>/... URIs
      --no-init                do NOT start shim process with --pid
      --no-mount strings       disable one or more mount xxx options set
                               in singularity.conf
      --no-privs               drop all privileges from root user in container)
      --no-umask               do not propagate umask to the container,
                               set default 0022 umask
      --nv                     enable Nvidia support
      --nvccli                 use nvidia-container-cli for GPU setup
                               (experimental)
  -o, --overlay strings        use an overlayFS image for persistent data
                               storage or as read-only layer of container
      --passphrase             prompt for an encryption passphrase
      --pem-path string        enter an path to a PEM formatted RSA key
                               for an encrypted container
  -p, --pid                    run container in a new PID namespace
      --pwd string             initial working directory for payload
                               process inside the container
      --rocm                   enable experimental Rocm support
  -S, --scratch strings        include a scratch directory within the
                               container that is linked to a temporary dir
                               (use -W to force location)
      --security strings       enable security features (SELinux,
                               Apparmor, Seccomp)
  -u, --userns                 run container in a new user namespace,
                               allowing Singularity to run completely
                               unprivileged on recent kernels. This
                               disables some features of Singularity, for
                               example it only works with sandbox images.
      --uts                    run container in a new UTS namespace
      --vm                     enable VM support
      --vm-cpu string          number of CPU cores to allocate to Virtual
                               Machine (implies --vm) (default "1")
      --vm-err                 enable attaching stderr from VM
      --vm-ip string           IP Address to assign for container usage.
                               Defaults to DHCP within bridge network.
                               (default "dhcp")
      --vm-ram string          amount of RAM in MiB to allocate to Virtual
                               Machine (implies --vm) (default "1024")
  -W, --workdir string         working directory to be used for /tmp,
                               /var/tmp and $HOME (if -c/--contain was
                               also used)
  -w, --writable               by default all Singularity containers are
                               available as read only. This option makes
                               the file system accessible as read/write.
      --writable-tmpfs         makes the file system accessible as
                               read-write with non persistent data (with
                               overlay support only)

Examples:
  $ singularity exec /tmp/debian.sif cat /etc/debian_version
  $ singularity exec /tmp/debian.sif python ./hello_world.py
  $ cat hello_world.py | singularity exec /tmp/debian.sif python
  $ sudo singularity exec --writable /tmp/debian.sif apt-get update
  $ singularity exec instance://my_instance ps -ef
  $ singularity exec library://centos cat /etc/os-release

For additional help or support, please visit https://www.sylabs.io/docs/

Looks like things have changed quite a bit?

BoPeng commented 1 year ago

A PR should be pretty easy to create by adding the extra options to

https://github.com/vatlab/sos/blob/58d36de1c94ed40bbc6e0d6f5b19699f274a2cb4/src/sos/singularity/client.py#L332-L340

but are you sure you want to have cleanenv set to True by default? There is no back-compatibility issue but I suspect that we should follow singularity's own behavior here, and allow users to set cleanenv=True explicitly.

BoPeng commented 1 year ago

Note that an earlier version of singularity might freak out if we pass --cleanenv to it. The problem will be worse if we pass --cleanenv by default.

gaow commented 1 year ago

I suspect that we should follow singularity's own behavior here, and allow users to set cleanenv=True explicitly.

I agree. I think there might be a good reason to not clean up the environment by default. I can think of some reasons now. So, perhaps we should add the option and set it to false by default?

BoPeng commented 1 year ago

Can anyone in your group submit a PR?

BoPeng commented 1 year ago

A PR is added.