usegalaxy-eu / infrastructure-playbook

Ansible playbook for managing UseGalaxy.eu infrastructure.
MIT License

squidpy spatial tool: mount host tmp dir as containers tmp dir #1262

Closed sanjaysrikakulam closed 1 month ago

sanjaysrikakulam commented 1 month ago

I tried a few different ways of overriding the tmp dir inside the Docker container for the squidpy spatial tool. This PR is one of them.

The previous PR failed because Galaxy already mounts the job's tmp dir at /tmp (-v "$_GALAXY_JOB_TMP_DIR:/tmp:rw"), while the mount argument added in that PR mounts a tmpfs at the same path. Docker therefore complains about a duplicate mount point, which happens when two different sources compete for the same target inside the container.
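The conflict described above can be checked for before a job is submitted. The following is a minimal sketch, not part of Galaxy or Docker: `find_duplicate_mount_targets` is a hypothetical helper that scans a list of `docker run` arguments and prints every container path that is targeted by more than one `-v`/`--volume` or `--mount` argument. It assumes the simple `source:target[:options]` form for volumes and the `destination=`, `dst=`, or `target=` keys for `--mount`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration only): print container paths that
# appear as the target of more than one -v/--volume or --mount argument.
find_duplicate_mount_targets() {
    local mode="" arg field target targets="" fields
    for arg in "$@"; do
        case "$mode" in
            volume)
                # -v source:target[:options] -> second colon-separated field
                target=$(printf '%s\n' "$arg" | cut -d: -f2)
                targets+="$target"$'\n'
                mode=""
                continue
                ;;
            mount)
                # --mount key=value,... -> value of destination/dst/target
                IFS=, read -r -a fields <<< "$arg"
                for field in "${fields[@]}"; do
                    case "$field" in
                        destination=*|dst=*|target=*)
                            targets+="${field#*=}"$'\n'
                            ;;
                    esac
                done
                mode=""
                continue
                ;;
        esac
        # Remember which kind of value the next argument carries
        case "$arg" in
            -v|--volume) mode=volume ;;
            --mount)     mode=mount ;;
        esac
    done
    # uniq -d keeps only the targets that occur more than once
    printf '%s' "$targets" | sort | uniq -d
}

# The two competing mounts from the previous PR, reduced to the relevant flags
find_duplicate_mount_targets \
    -v "/data/jwd02f/main/071/801/71801564/tmp:/tmp:rw" \
    -v "/opt/galaxy/server:/opt/galaxy/server:ro" \
    --mount type=tmpfs,tmpfs-size=2147483648,destination=/tmp
```

For the reduced argument list above, the helper prints `/tmp`, the path that both the bind mount and the tmpfs mount try to claim.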

To override the tmp dir for a job/tool, we need to look at Galaxy's job script:

_galaxy_setup_environment() {
    local _use_framework_galaxy="$1"

    if [ -z "$_GALAXY_JOB_TMP_DIR" ]; then
        _GALAXY_JOB_DIR="/data/jwd02f/main/071/801/71801564"
        _GALAXY_JOB_HOME_DIR="/data/jwd02f/main/071/801/71801564/home"
        _GALAXY_JOB_TMP_DIR=$([ ! -e '/data/jwd02f/main/071/801/71801564/tmp' ] || mv '/data/jwd02f/main/071/801/71801564/tmp' '/data/jwd02f/main/071/801/71801564'/tmp.$(date +%Y%m%d-%H%M%S) ; mkdir '/data/jwd02f/main/071/801/71801564/tmp'; echo '/data/jwd02f/main/071/801/71801564/tmp')
    fi

GALAXY_MEMORY_MB="6144"; export GALAXY_MEMORY_MB
HOME="$_GALAXY_JOB_HOME_DIR"; export HOME
TMPDIR="$_GALAXY_JOB_TMP_DIR"; export TMPDIR
TMP="$_GALAXY_JOB_TMP_DIR"; export TMP
TEMP="$_GALAXY_JOB_TMP_DIR"; export TEMP

When _GALAXY_JOB_TMP_DIR is unset or empty for a job, Galaxy falls back to the tmp dir inside the job working directory (JWD). The env vars TMPDIR, TMP, and TEMP are then set to the value of _GALAXY_JOB_TMP_DIR.
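The effect of that guard can be reproduced outside Galaxy. The sketch below is a simplified stand-in, not the real job script: `setup_tmp_env` is a hypothetical function that keeps only the `-z` check and the three exports, and the job directory is a throwaway directory created with `mktemp`. It shows that a preset _GALAXY_JOB_TMP_DIR skips the fallback and is propagated to TMPDIR, TMP, and TEMP.

```shell
#!/usr/bin/env bash
# Simplified stand-in for the guard in Galaxy's job script (assumption: the
# real script also rotates an existing tmp dir and sets further variables).
setup_tmp_env() {
    local job_dir="$1"
    if [ -z "$_GALAXY_JOB_TMP_DIR" ]; then
        # Fallback: use the tmp dir inside the job working directory
        mkdir -p "$job_dir/tmp"
        _GALAXY_JOB_TMP_DIR="$job_dir/tmp"
    fi
    TMPDIR="$_GALAXY_JOB_TMP_DIR"; export TMPDIR
    TMP="$_GALAXY_JOB_TMP_DIR"; export TMP
    TEMP="$_GALAXY_JOB_TMP_DIR"; export TEMP
}

# Hypothetical job working directory, used only for this demonstration
job_dir=$(mktemp -d /tmp/jwd-demo.XXXXXX)

# Case 1: variable unset, so the guard falls back to "$job_dir/tmp"
unset _GALAXY_JOB_TMP_DIR
setup_tmp_env "$job_dir"
echo "default:  TMPDIR=$TMPDIR"

# Case 2: variable preset, so the guard is skipped and the value is kept
_GALAXY_JOB_TMP_DIR=/tmp
setup_tmp_env "$job_dir"
echo "override: TMPDIR=$TMPDIR TMP=$TMP TEMP=$TEMP"

rm -rf "$job_dir"
```

In the second case all three variables end up as /tmp, which is the behaviour this PR relies on.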

So, in this PR, we set _GALAXY_JOB_TMP_DIR to the host's /tmp dir, which in turn sets TMP, TEMP, and TMPDIR to /tmp. The host's /tmp dir will then be mounted inside the container as /tmp (see the docker run command created by the job script below).

docker run -e "GALAXY_SLOTS=$GALAXY_SLOTS" -e "GALAXY_MEMORY_MB=$GALAXY_MEMORY_MB" -e "GALAXY_MEMORY_MB_PER_SLOT=$GALAXY_MEMORY_MB_PER_SLOT" -e "HOME=$HOME" -e "_GALAXY_JOB_HOME_DIR=$_GALAXY_JOB_HOME_DIR" -e "_GALAXY_JOB_TMP_DIR=$_GALAXY_JOB_TMP_DIR" -e "TMPDIR=$TMPDIR" -e "TMP=$TMP" -e "TEMP=$TEMP" --name 2773f915e7154d9393c173049545ab4b -v "$_CONDOR_SCRATCH_DIR:$_CONDOR_SCRATCH_DIR:rw" -v "$_GALAXY_JOB_TMP_DIR:$_GALAXY_JOB_TMP_DIR:rw" -v "$_GALAXY_JOB_TMP_DIR:/tmp:rw" -v /opt/galaxy/server:/opt/galaxy/server:ro -v /opt/galaxy/shed_tools/toolshed.g2.bx.psu.edu/repos/goeckslab/squidpy/11ea000ad53f/squidpy:/opt/galaxy/shed_tools/toolshed.g2.bx.psu.edu/repos/goeckslab/squidpy/11ea000ad53f/squidpy:ro -v /data/jwd02f/main/071/801/71801564:/data/jwd02f/main/071/801/71801564:ro -v /data/jwd02f/main/071/801/71801564/outputs:/data/jwd02f/main/071/801/71801564/outputs:rw -v "$_GALAXY_JOB_HOME_DIR:$_GALAXY_JOB_HOME_DIR:rw" -v /data/jwd02f/main/071/801/71801564/working:/data/jwd02f/main/071/801/71801564/working:rw -v /opt/galaxy/datasets:/opt/galaxy/datasets:ro -v /opt/galaxy/tool-data:/opt/galaxy/tool-data:ro -v /data/db/data_managers:/data/db/data_managers:ro -v /data/dp01/galaxy_db/:/data/dp01/galaxy_db/:rw -v /data/0/galaxy_db/:/data/0/galaxy_db/:ro -v /data/1/galaxy_db/:/data/1/galaxy_db/:ro -v /data/2/galaxy_db/:/data/2/galaxy_db/:ro -v /data/3/galaxy_db/:/data/3/galaxy_db/:ro -v /data/4/galaxy_db/:/data/4/galaxy_db/:ro -v /data/6/galaxy_db/:/data/6/galaxy_db/:ro -v /data/7/galaxy_db/:/data/7/galaxy_db/:ro -v /data/dnb-ds03/galaxy_db/:/data/dnb-ds03/galaxy_db/:ro -v /data/dnb01/galaxy_db/:/data/dnb01/galaxy_db/:ro -v /data/dnb02/galaxy_db/:/data/dnb02/galaxy_db/:ro -v /data/dnb05/galaxy_db/:/data/dnb05/galaxy_db/:ro -v /data/dnb06/galaxy_db/:/data/dnb06/galaxy_db/:rw -v /data/dnb07/galaxy_db/:/data/dnb07/galaxy_db/:rw -v /data/dnb08/galaxy_db/:/data/dnb08/galaxy_db/:rw -v /data/dnb09/galaxy_db/:/data/dnb09/galaxy_db/:rw -v /data/dnb10/galaxy_db/:/data/dnb10/galaxy_db/:rw -v /data/5/galaxy_import/galaxy_user_data/:/data/5/galaxy_import/galaxy_user_data/:ro -v /data/db/:/data/db/:ro -v /cvmfs/data.galaxyproject.org:/cvmfs/data.galaxyproject.org:ro --cpus ${GALAXY_SLOTS:-1} -w /data/jwd02f/main/071/801/71801564/working --net bridge --rm --mount type=tmpfs,tmpfs-size=2147483648,destination=/tmp quay.io/biocontainers/squidpy:1.5.0 /bin/bash /data/jwd02f/main/071/801/71801564/tool_script.sh > '../outputs/tool_stdout' 2> '../outputs/tool_stderr'; return_code=$?; echo $return_code > /data/jwd02f/main/071/801/71801564/galaxy_71801564.ec;

From the docker run command (-v "$_GALAXY_JOB_TMP_DIR:$_GALAXY_JOB_TMP_DIR:rw" -v "$_GALAXY_JOB_TMP_DIR:/tmp:rw"), we can see that setting the ENV _GALAXY_JOB_TMP_DIR: '/tmp' (as in this PR) mounts the host's /tmp dir inside the container as /tmp.

Let's give this a try; it might solve the issue discussed here. Once this is tested on EU, we can apply the same to the funannotate tool(s).