PGScatalog / pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation
https://pgsc-calc.readthedocs.io/en/latest/
Apache License 2.0

Docker inside of Docker: /bin/bash: .command.run: No such file or directory #242

Closed: Fiwx closed this issue 7 months ago

Fiwx commented 8 months ago

Description of the bug

I am going to use pgsc_calc in a study that will use HPC resources, which requires Dockerizing the pipeline I'm using for the study. The pipeline includes a bunch of pre- and post-processing and imputation, and of course pgsc_calc for calculating the polygenic scores.

I am getting a "/bin/bash: .command.run: No such file or directory" error, even though .command.run exists.

Extra information:

First, I SSH into my Ubuntu EC2 instance:

ssh -i "docker2.pem" ubuntu@ec2.compute.amazonaws.com

Next, I start a container from the image that I previously built:

sudo docker run -it mydocker:latest /bin/bash

Inside the Docker container (which uses an Ubuntu base):

nextflow -v         → nextflow version 23.10.1.5891
docker --version    → bash: docker: command not found
cat /etc/os-release → PRETTY_NAME="Ubuntu 22.04.3 LTS"

Installing Docker within my Docker:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null

apt-get update 

apt-get install -y docker-ce

Docker is installed: docker --version → Docker version 25.0.3, build 4debf41
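
If this client needs to be there every time, the same steps could be baked into the image instead of being repeated per session. A rough sketch of the corresponding Dockerfile fragment (assuming an Ubuntu 22.04 base like mydocker:latest; the docker-ce-cli package alone is enough when the daemon comes from a mounted host socket, as done further down):

# Hypothetical Dockerfile fragment mirroring the interactive steps above
RUN apt-get update && \
    apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release && \
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg && \
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list && \
    apt-get update && \
    apt-get install -y docker-ce-cli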

However, running nextflow run pgscatalog/pgsc_calc -profile test,docker at this stage results in:

Command error:
  docker: "specify container image platform" requires API version 1.41, but the Docker daemon API version is 1.24.
  See 'docker run --help'.

We can fix this by mounting the host's Docker socket when starting my (outer) Docker container:

sudo docker run -it -v /var/run/docker.sock:/var/run/docker.sock mydocker:latest /bin/bash

Within outer Docker:

apt-get update
apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce

Now: docker version --format '{{.Server.APIVersion}}' → 1.44
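
As a sanity check, the client and server API versions can be compared in one command, which also confirms which daemon the inner client is talking to:

docker version --format 'client={{.Client.APIVersion}} server={{.Server.APIVersion}}'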

Attempting to run nextflow run pgscatalog/pgsc_calc -profile test,docker now results in:

Command error:
  Unable to find image 'ghcr.io/pgscatalog/pgscatalog_utils:v0.4.3' locally
  v0.4.3: Pulling from pgscatalog/pgscatalog_utils
  01b5b2efb836: Pulling fs layer
[more of these…]
  03ac3c5db7fe: Pulling fs layer
  d1465593fd9b: Waiting
[more of these…]
  03ac3c5db7fe: Waiting
  5dda314a937a: Download complete
  01b5b2efb836: Download complete
[more of these…]
  23a02a884d53: Verifying Checksum
[more of these…]
  03ac3c5db7fe: Download complete
  5dda314a937a: Pull complete
[more of these…]
  03ac3c5db7fe: Pull complete
  Digest: sha256:a86d0f66f474a25278c68ba09a6d30c5cef8fbae498893bc1e59b19c613a8799
  Status: Downloaded newer image for ghcr.io/pgscatalog/pgscatalog_utils:v0.4.3
  /bin/bash: .command.run: No such file or directory

The second time it runs, we simply get:

Command error:
  /bin/bash: .command.run: No such file or directory

But there is a .command.run in the work directory. The file /exact/specific/path/to/work/.command.run looks normal and has the script within it.

.command.log says:

/bin/bash: /exact/specific/path/to/work/.command.run: No such file or directory

Let's start over, open the outer container again, and do all of the setup before running anything.

sudo docker run -it -v /var/run/docker.sock:/var/run/docker.sock mydocker:latest /bin/bash

Now, inside the Docker:

apt-get update
apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce
nextflow run pgscatalog/pgsc_calc -profile test,docker

We still get this error:

Command error:
  /bin/bash: .command.run: No such file or directory

Work dir:
  /home/ubuntu/workdir/fc

Inside /home/ubuntu/workdir/fc, ls -a shows the following:

. .. .command.begin .command.err .command.log .command.out .command.run .command.sh .exitcode NO_FILE PGS001229_22.txt

There are two .command.run files: /home/ubuntu/workdir/fc/.command.run and /home/ubuntu/workdir/23/.command.run.

The contents of /home/ubuntu/workdir/fc/.command.run are included below under "Relevant files".

To get more information, I ran NXF_DEBUG=2 nextflow run pgscatalog/pgsc_calc -profile test,docker, which gave this output:

Command error:
  NXF_ORG=nextflow-io
  NXF_PACK=one
  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  PWD=/home/ubuntu/workdir/work/work/37/de
  PYTHONPATH=:/usr/local/lib/python3.10/site-packages
  PYTHON_VERSION=3.10.12
  SHLVL=1
  TERM=xterm
  _=/usr/bin/env
  + echo '============= task output =================='
  ============= task output ==================
  + touch .command.begin
  + set +u
  + set -u
  + [[ -n '' ]]
  + export NXF_TASK_WORKDIR=/home/ubuntu/workdir/work/work/37/de
  + NXF_TASK_WORKDIR=/home/ubuntu/workdir/work/work/37/de
  + nxf_stage
  + true
  + rm -f samplesheet.csv
  + ln -s /root/.nextflow/assets/pgscatalog/pgsc_calc/assets/examples/samplesheet.csv samplesheet.csv
  + set +e
  + pid=1361
  + wait 1361
  + set -o pipefail
  + tee .command.err
  + tee .command.out
  + nxf_launch
  ++ id -u
  ++ id -u
  ++ id -g
  ++ nxf_container_env
  ++ cat
  + docker run -i --cpu-shares 2048 --memory 6144m -e NXF_TASK_WORKDIR -e NXF_DEBUG=2 -u 0 -e HOME=/root -v /etc/passwd:/etc/passwd:ro -v /etc/shadow:/etc/shadow:ro -v /etc/group:/etc/group:ro -v /root:/root -v /root/.nextflow/assets/pgscatalog/pgsc_calc:/root/.nextflow/assets/pgscatalog/pgsc_calc -v /home/ubuntu/workdir/work/work/37/de:/home/ubuntu/workdir/work/work/37/de -w /home/ubuntu/workdir/work/work/37/de -u 0:0 --platform linux/amd64 --name nxf-Il730CqxL0M9xindBpMmk7ER ghcr.io/pgscatalog/pgscatalog_utils:v0.4.3 /bin/bash -c 'eval export PYTHONNOUSERSITE="1"
  export R_PROFILE_USER="/.Rprofile"
  export R_ENVIRON_USER="/.Renviron"
  export JULIA_DEPOT_PATH="/usr/local/share/julia"
  export PATH="$PATH:/root/.nextflow/assets/pgscatalog/pgsc_calc/bin"; /bin/bash .command.run nxf_trace'
  /bin/bash: .command.run: No such file or directory
  + nxf_main_ret=127
  + nxf_unstage
  + true
  + [[ 127 != 0 ]]
  + return
  + on_exit
  + exit_status=127
  + printf -- 127
  + set +u
  + docker rm nxf-Il730CqxL0M9xindBpMmk7ER
  + exit 127

Work dir:
  /home/ubuntu/workdir/work/work/37/de

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
ERROR ~ ERROR: Matching subworkflow failed

 -- Check '.nextflow.log' file for details

As root, in /home/ubuntu/workdir/work/work/37/de, I ran ./.command.run, resulting in:

bash: ./.command.run: Permission denied

ls -l .command.run shows:

-rw-r--r-- 1 root root 10654 Feb 12 17:49 .command.run

I ran chmod u+x .command.run and ./.command.run again, in the /home/ubuntu/workdir/work/work/37/de directory, resulting in the error: /bin/bash: /home/ubuntu/workdir/work/work/37/de/.command.run: No such file or directory

/home/ubuntu/workdir/work/work/37/de/.command.run certainly 100% exists in this exact location. (Also, if I try to run something on my system that doesn’t exist, it just says “bash” not “/bin/bash”).

We've now confirmed that .command.sh and .command.run both seem to be in their original, correct directories.
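
One check that might be worth adding here (my assumption, not something confirmed above): because the inner container reuses the host's /var/run/docker.sock, the -v source paths in nxf_launch are resolved by the host daemon against the host filesystem, not against the filesystem of the container where Nextflow is running. A quick way to see what the daemon actually mounts for that path:

# hypothetical check: mount the work dir the same way nxf_launch does and list it;
# an empty listing would mean the path only exists inside the Nextflow container
docker run --rm -v /home/ubuntu/workdir/work/work/37/de:/home/ubuntu/workdir/work/work/37/de \
    -w /home/ubuntu/workdir/work/work/37/de ubuntu:22.04 ls -a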

Command used and terminal output

No response

Relevant files

#!/bin/bash
# NEXTFLOW TASK: PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)
set -e
set -u
NXF_DEBUG=${NXF_DEBUG:=0}; [[ $NXF_DEBUG > 1 ]] && set -x
NXF_ENTRY=${1:-nxf_main}

nxf_tree() {
    local pid=$1

declare -a ALL_CHILDREN
while read P PP;do
    ALL_CHILDREN[$PP]+=" $P"
done < <(ps -e -o pid= -o ppid=)

pstat() {
    local x_pid=$1
    local STATUS=$(2> /dev/null < /proc/$1/status grep -E 'Vm|ctxt')

    if [ $? = 0 ]; then
    local  x_vsz=$(echo "$STATUS" | grep VmSize | awk '{print $2}' || echo -n '0')
    local  x_rss=$(echo "$STATUS" | grep VmRSS | awk '{print $2}' || echo -n '0')
    local x_peak=$(echo "$STATUS" | grep -E 'VmPeak|VmHWM' | sed 's/^.*:\s*//' | sed 's/[\sa-zA-Z]*$//' | tr '\n' ' ' || echo -n '0 0')
    local x_pmem=$(awk -v rss=$x_rss -v mem_tot=$mem_tot 'BEGIN {printf "%.0f", rss/mem_tot*100*10}' || echo -n '0')
    local vol_ctxt=$(echo "$STATUS" | grep '\bvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
    local inv_ctxt=$(echo "$STATUS" | grep '\bnonvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
    cpu_stat[x_pid]="$x_pid $x_pmem $x_vsz $x_rss $x_peak $vol_ctxt $inv_ctxt"
    fi
}

pwalk() {
    pstat $1
    for i in ${ALL_CHILDREN[$1]:=}; do pwalk $i; done
}

pwalk $1

}

nxf_stat() {
    cpu_stat=()
    nxf_tree $1

declare -a sum=(0 0 0 0 0 0 0 0)
local pid
local i
for pid in "${!cpu_stat[@]}"; do
    local row=(${cpu_stat[pid]})
    [ $NXF_DEBUG = 1 ] && echo "++ stat mem=${row[*]}"
    for i in "${!row[@]}"; do
    if [ $i != 0 ]; then
        sum[i]=$((sum[i]+row[i]))
    fi
    done
done

[ $NXF_DEBUG = 1 ] && echo -e "++ stat SUM=${sum[*]}"

for i in {1..7}; do
    if [ ${sum[i]} -lt ${cpu_peak[i]} ]; then
        sum[i]=${cpu_peak[i]}
    else
        cpu_peak[i]=${sum[i]}
    fi
done

[ $NXF_DEBUG = 1 ] && echo -e "++ stat PEAK=${sum[*]}\n"
nxf_stat_ret=(${sum[*]})

}

nxf_mem_watch() {
    set -o pipefail
    local pid=$1
    local trace_file=.command.trace
    local count=0
    declare -a cpu_stat=(0 0 0 0 0 0 0 0)
    declare -a cpu_peak=(0 0 0 0 0 0 0 0)
    local mem_tot=$(< /proc/meminfo grep MemTotal | awk '{print $2}')
    local timeout
    local DONE
    local STOP=''

[ $NXF_DEBUG = 1 ] && nxf_sleep 0.2 && ps fx

while true; do
    nxf_stat $pid
    if [ $count -lt 10 ]; then timeout=1;
    elif [ $count -lt 120 ]; then timeout=5;
    else timeout=30;
    fi
    read -t $timeout -r DONE || true
    [[ $DONE ]] && break
    if [ ! -e /proc/$pid ]; then
        [ ! $STOP ] && STOP=$(nxf_date)
        [ $(($(nxf_date)-STOP)) -gt 10000 ] && break
    fi
    count=$((count+1))
done

echo "%mem=${nxf_stat_ret[1]}"      >> $trace_file
echo "vmem=${nxf_stat_ret[2]}"      >> $trace_file
echo "rss=${nxf_stat_ret[3]}"       >> $trace_file
echo "peak_vmem=${nxf_stat_ret[4]}" >> $trace_file
echo "peak_rss=${nxf_stat_ret[5]}"  >> $trace_file
echo "vol_ctxt=${nxf_stat_ret[6]}"  >> $trace_file
echo "inv_ctxt=${nxf_stat_ret[7]}"  >> $trace_file

}

nxf_write_trace() {
    echo "nextflow.trace/v2"           > $trace_file
    echo "realtime=$wall_time"         >> $trace_file
    echo "%cpu=$ucpu"                  >> $trace_file
    echo "cpu_model=$cpu_model"        >> $trace_file
    echo "rchar=${io_stat1[0]}"        >> $trace_file
    echo "wchar=${io_stat1[1]}"        >> $trace_file
    echo "syscr=${io_stat1[2]}"        >> $trace_file
    echo "syscw=${io_stat1[3]}"        >> $trace_file
    echo "read_bytes=${io_stat1[4]}"   >> $trace_file
    echo "write_bytes=${io_stat1[5]}"  >> $trace_file
}

nxf_trace_mac() {
    local start_millis=$(nxf_date)

    /bin/bash -euo pipefail /home/ubuntu/workdir/fc.command.sh

local end_millis=$(nxf_date)
local wall_time=$((end_millis-start_millis))
local ucpu=''
local cpu_model=''
local io_stat1=('' '' '' '' '' '')
nxf_write_trace

}

nxf_fd() {
    local FD=11
    while [ -e /proc/$$/fd/$FD ]; do FD=$((FD+1)); done
    echo $FD
}

nxf_trace_linux() {
    local pid=$$
    command -v ps &>/dev/null || { >&2 echo "Command 'ps' required by nextflow to collect task metrics cannot be found"; exit 1; }
    local num_cpus=$(< /proc/cpuinfo grep '^processor' -c)
    local cpu_model=$(< /proc/cpuinfo grep '^model name' | head -n 1 | awk 'BEGIN{FS="\t: "} { print $2 }')
    local tot_time0=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
    local cpu_time0=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
    local io_stat0=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
    local start_millis=$(nxf_date)
    trap 'kill $mem_proc' ERR

    /bin/bash -euo pipefail /home/ubuntu/workdir/fc.command.sh &
local task=$!

mem_fd=$(nxf_fd)
eval "exec $mem_fd> >(nxf_mem_watch $task)"
local mem_proc=$!

wait $task

local end_millis=$(nxf_date)
local tot_time1=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
local cpu_time1=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
local ucpu=$(awk -v p1=$cpu_time1 -v p0=$cpu_time0 -v t1=$tot_time1 -v t0=$tot_time0 -v n=$num_cpus 'BEGIN { pct=(p1-p0)/(t1-t0)*100*n; printf("%.0f", pct>0 ? pct : 0) }' )

local io_stat1=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
local i
for i in {0..5}; do
    io_stat1[i]=$((io_stat1[i]-io_stat0[i]))
done

local wall_time=$((end_millis-start_millis))
[ $NXF_DEBUG = 1 ] && echo "+++ STATS %CPU=$ucpu TIME=$wall_time I/O=${io_stat1[*]}"

echo "nextflow.trace/v2"           > $trace_file
echo "realtime=$wall_time"         >> $trace_file
echo "%cpu=$ucpu"                  >> $trace_file
echo "cpu_model=$cpu_model"        >> $trace_file
echo "rchar=${io_stat1[0]}"        >> $trace_file
echo "wchar=${io_stat1[1]}"        >> $trace_file
echo "syscr=${io_stat1[2]}"        >> $trace_file
echo "syscw=${io_stat1[3]}"        >> $trace_file
echo "read_bytes=${io_stat1[4]}"   >> $trace_file
echo "write_bytes=${io_stat1[5]}"  >> $trace_file

[ -e /proc/$mem_proc ] && eval "echo 'DONE' >&$mem_fd" || true
wait $mem_proc 2>/dev/null || true
while [ -e /proc/$mem_proc ]; do nxf_sleep 0.1; done

}

nxf_trace() {
    local trace_file=.command.trace
    touch $trace_file
    if [[ $(uname) = Darwin ]]; then
        nxf_trace_mac
    else
        nxf_trace_linux
    fi
}

nxf_container_env() {
cat << EOF
export PYTHONNOUSERSITE="1"
export R_PROFILE_USER="/.Rprofile"
export R_ENVIRON_USER="/.Renviron"
export JULIA_DEPOT_PATH="/usr/local/share/julia"
export PATH="\$PATH:/root/.nextflow/assets/pgscatalog/pgsc_calc/bin"
EOF
}

nxf_sleep() { sleep $1 2>/dev/null || sleep 1; }

nxf_date() {
    local ts=$(date +%s%3N);
    if [[ ${#ts} == 10 ]]; then echo ${ts}000
    elif [[ $ts == *%3N ]]; then echo ${ts/\%3N/000}
    elif [[ $ts == *3N ]]; then echo ${ts/3N/000}
    elif [[ ${#ts} == 13 ]]; then echo $ts
    else echo "Unexpected timestamp value: $ts"; exit 1
    fi
}

nxf_env() {
    echo '============= task environment ============='
    env | sort | sed "s/\(.*\)AWS\(.*\)=\(.\{6\}\).*/\1AWS\2=\3xxxxxxxxxxxxx/"
    echo '============= task output =================='
}

nxf_kill() {
    declare -a children
    while read P PP;do
        children[$PP]+=" $P"
    done < <(ps -e -o pid= -o ppid=)

kill_all() {
    [[ $1 != $$ ]] && kill $1 2>/dev/null || true
    for i in ${children[$1]:=}; do kill_all $i; done
}

kill_all $1

}

nxf_mktemp() {
    local base=${1:-/tmp}
    mkdir -p "$base"
    if [[ $(uname) = Darwin ]]; then mktemp -d $base/nxf.XXXXXXXXXX
    else TMPDIR="$base" mktemp -d -t nxf.XXXXXXXXXX
    fi
}

nxf_fs_copy() {
    local source=$1
    local target=$2
    local basedir=$(dirname $1)
    mkdir -p $target/$basedir
    cp -fRL $source $target/$basedir
}

nxf_fs_move() {
    local source=$1
    local target=$2
    local basedir=$(dirname $1)
    mkdir -p $target/$basedir
    mv -f $source $target/$basedir
}

nxf_fs_rsync() {
    rsync -rRl $1 $2
}

nxf_fs_rclone() {
    rclone copyto $1 $2/$1
}

nxf_fs_fcp() {
    fcp $1 $2/$1
}

on_exit() {
    exit_status=${nxf_main_ret:=$?}
    printf -- $exit_status > /home/ubuntu/workdir/fc.exitcode
    set +u
    docker rm $NXF_BOXID &>/dev/null || true
    exit $exit_status
}

on_term() {
    set +e
    docker stop $NXF_BOXID
}

nxf_launch() {
    docker run -i --cpu-shares 2048 --memory 6144m -e "NXF_TASK_WORKDIR" -e "NXF_DEBUG=${NXF_DEBUG:=0}" -u $(id -u) -e "HOME=${HOME}" -v /etc/passwd:/etc/passwd:ro -v /etc/shadow:/etc/shadow:ro -v /etc/group:/etc/group:ro -v $HOME:$HOME -v /root/.nextflow/assets/pgscatalog/pgsc_calc:/root/.nextflow/assets/pgscatalog/pgsc_calc -v /home/ubuntu/workdir:/home/ubuntu/workdir -w "$PWD" -u $(id -u):$(id -g) --platform linux/amd64 --name $NXF_BOXID ghcr.io/pgscatalog/pgscatalog_utils:v0.4.3 /bin/bash -c "eval $(nxf_container_env); /bin/bash /home/ubuntu/workdir/fc.command.run nxf_trace"
}

nxf_stage() {
    true
    # stage input files

rm -f PGS001229_22.txt
rm -f NO_FILE
ln -s /root/.nextflow/assets/pgscatalog/pgsc_calc/assets/examples/scorefiles/PGS001229_22.txt PGS001229_22.txt
ln -s /home/ubuntu/workdir/NO_FILE NO_FILE

}

nxf_unstage() {
    true
    [[ ${nxf_main_ret:=0} != 0 ]] && return
}

nxf_main() {
    trap on_exit EXIT
    trap on_term TERM INT USR2
    trap '' USR1

[[ "${NXF_CHDIR:-}" ]] && cd "$NXF_CHDIR"
export NXF_BOXID="nxf-$(dd bs=18 count=1 if=/dev/urandom 2>/dev/null | base64 | tr +/ 0A | tr -d '\r\n')"
NXF_SCRATCH=''
[[ $NXF_DEBUG > 0 ]] && nxf_env
touch /home/ubuntu/workdir/fc.command.begin
set +u
set -u
[[ $NXF_SCRATCH ]] && cd $NXF_SCRATCH
export NXF_TASK_WORKDIR="$PWD"
nxf_stage

set +e
(set -o pipefail; (nxf_launch | tee .command.out) 3>&1 1>&2 2>&3 | tee .command.err) &
pid=$!
wait $pid || nxf_main_ret=$?
nxf_unstage

}

$NXF_ENTRY

System information

Container engine: /usr/bin/docker
OS: Ubuntu 22.04.3 LTS

nebfield commented 8 months ago

Running docker in docker is experimental in nextflow. Getting docker in docker working in general can be quite tricky. I've never tried to get it working myself, so I can't offer much advice, sorry. Can you run singularity on the HPC? That normally solves most of these types of problems.
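
For reference, the Singularity equivalent of the test command used above would be something like:

nextflow run pgscatalog/pgsc_calc -profile test,singularity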

If you have to use Docker, I'd recommend installing all of the dependencies inside one big container (version information for each dependency is available here), and launching the workflow without using an execution profile (i.e. omit -profile).
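
Concretely, that might look something like this inside the all-in-one image (a sketch: the test profile only supplies the test dataset, and without a docker/singularity/conda profile the tasks use whatever tools are on PATH):

# assumes plink2, the pgscatalog utilities, R/quarto etc. are installed
# system-wide in the image and available on PATH
nextflow run pgscatalog/pgsc_calc -profile test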

Fiwx commented 8 months ago

Thanks for the advice! I'll try running without a profile, and try using Singularity inside Docker instead of Docker inside Docker. Interestingly, though, the error doesn't seem to be related to Docker in an obvious way.

Trying it without a profile yielded this error, with no failure to find .command.run. Perhaps the error involves Python? I got this error before, but didn't include it.

Command executed:

  samplesheet_to_json samplesheeet.csv out.json

  cat <<-END_VERSIONS > versions.yml
  SAMPLESHEET_JSON:
      python: $(echo $(python --version 2>&1) | cut -f 2 -d ' ')
  END_VERSIONS

Command exit status:
  127

Command output:
  (empty)

Here is the command I ran:

nextflow run /home/ubuntu/dir/tools/pgsc_calc/main.nf -profile docker --input /home/ubuntu/dir/run/test/samplesheeet.csv --pgs_id PGS003724 --target_build GRCh37 --min_overlap 0.0 --run_ancestry /home/ubuntu/dir/data/pgsc_1000G_v1.tar.zst -c /home/ubuntu/dir/references/custom.config -with-docker docker://PGScatalog/pgsc_calc

Running this without -with-docker docker://PGScatalog/pgsc_calc yields the same error.
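
Exit status 127 usually means the shell could not find the command, so a simple check (my suggestion, not something from the log) is whether the pipeline's helper script is actually on PATH in the environment the task runs in:

# hypothetical check inside the container / task environment
command -v samplesheet_to_json || echo "samplesheet_to_json not on PATH"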

Fiwx commented 8 months ago

I tried Singularity. It got a lot further: before, none of the processes were completing (no check marks).

Here is the result of running the test profile with Singularity instead of Docker:

------------------------------------------------------
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:SAMPLESHEET_JSON       -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:COMBINE_SCOREFILES     -
executor >  local (9)
[ee/47fb5e] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:SAMPLESHEET_JSON (samplesheet.csv)                         [100%] 1 of 1 ✔
[b5/b4c2fd] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)                                     [100%] 1 of 1 ✔
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_RELABELBIM                                      -
[22/e7b529] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_RELABELPVAR (cineca chromosome 22)              [100%] 1 of 1 ✔
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_VCF                                             -
[e7/88f1aa] process > PGSCATALOG_PGSCALC:PGSCALC:MATCH:MATCH_VARIANTS (cineca chromosome 22)                            [100%] 1 of 1 ✔
[05/77f2fa] process > PGSCATALOG_PGSCALC:PGSCALC:MATCH:MATCH_COMBINE (cineca)                                           [100%] 1 of 1 ✔
[53/a5de11] process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:PLINK2_SCORE (cineca chromosome 22 effect type additive 0) [100%] 1 of 1 ✔
[4a/ffa4e6] process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:SCORE_AGGREGATE (cineca)                                   [100%] 1 of 1 ✔
[9b/a2dcf4] process > PGSCATALOG_PGSCALC:PGSCALC:REPORT:SCORE_REPORT (cineca)                                           [  0%] 0 of 1 ✔
[1e/26167a] process > PGSCATALOG_PGSCALC:PGSCALC:DUMPSOFTWAREVERSIONS (1)                                               [100%] 1 of 1 ✔
ERROR ~ Error executing process > 'PGSCATALOG_PGSCALC:PGSCALC:REPORT:SCORE_REPORT (cineca)'

Caused by:
  Process `PGSCATALOG_PGSCALC:PGSCALC:REPORT:SCORE_REPORT (cineca)` terminated with an error exit status (1)

Command executed:

  echo nextflow run pgscatalog/pgsc_calc -profile test,singularity > command.txt
  echo "keep_multiallelic: false" > params.txt
  echo "keep_ambiguous   : false"    >> params.txt
  echo "min_overlap      : 0.75"       >> params.txt

  cp -r /root/.nextflow/assets/pgscatalog/pgsc_calc/assets/report/* .
  # workaround for unhelpful filenotfound quarto errors in some HPCs
  mkdir temp && TMPDIR=temp

  quarto render report.qmd -M "self-contained:true"         -P score_path:aggregated_scores.txt.gz         -P sampleset:cineca         -P run_ancestry:false         -P reference_panel_name:NO_PANEL

  cat <<-END_VERSIONS > versions.yml
  SCORE_REPORT:
      R: $(echo $(R --version 2>&1) | head -n 1 | cut -f 3 -d ' ')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  cp: cannot stat '/root/.nextflow/assets/pgscatalog/pgsc_calc/assets/report/*': No such file or directory

Work dir:
  /home/ubuntu/dir/run/test/singularity-3.8.7/work/9b/a2dcf45fcda299a7da326fb4ba0793

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
ERROR ~ ERROR: No results report written!

 -- Check '.nextflow.log' file for details
Fiwx commented 8 months ago

Running without a profile yields this error:

WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /home/ubuntu/dir/run/test/work/work/singularity -- Use the environment variable NXF_SINGULARITY_CACHEDIR to specify a different location
ERROR ~ Error executing process > 'PGSCATALOG_PGSCALC:PGSCALC:REPORT:SCORE_REPORT (test)'

Caused by:
  Process `PGSCATALOG_PGSCALC:PGSCALC:REPORT:SCORE_REPORT (test)` terminated with an error exit status (1)

Command executed:

  echo nextflow run /home/ubuntu/dir/tools/pgsc_calc/main.nf -profile singularity --input /home/ubuntu/dir/run/test/samplesheet.csv --pgs_id PGS003724 --target_build GRCh37 --min_overlap 0.0 -c /home/ubuntu/dir/references/custom.config > command.txt
  echo "keep_multiallelic: false" > params.txt
  echo "keep_ambiguous   : false"    >> params.txt
  echo "min_overlap      : 0.0"       >> params.txt

  cp -r /home/ubuntu/dir/tools/pgsc_calc/assets/report/* .
  # workaround for unhelpful filenotfound quarto errors in some HPCs
  mkdir temp && TMPDIR=temp

  quarto render report.qmd -M "self-contained:true"         -P score_path:aggregated_scores.txt.gz         -P sampleset:test         -P run_ancestry:false         -P reference_panel_name:NO_PANEL

  cat <<-END_VERSIONS > versions.yml
  SCORE_REPORT:
      R: $(echo $(R --version 2>&1) | head -n 1 | cut -f 3 -d ' ')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  error: Could not create TypeScript compiler cache location: "/root/.cache/deno/gen"
  Check the permission of the directory.

Work dir:
  /home/ubuntu/dir/run/test/work/work/65/dee61a7e38166ca5e2b5d84ba7a833

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details
ERROR ~ ERROR: No results report written!

 -- Check '.nextflow.log' file for details

Here is the samplesheet.csv:

sampleset,path_prefix,chrom,format
test,/home/ubuntu/dir/run/test/input_files/K_test,,vcf

I tried running as ubuntu instead of root, but got this:

  error: Could not create TypeScript compiler cache location: "/root/.cache/deno/gen"
nebfield commented 8 months ago

I suggest changing your Dockerfile to run as a normal user (not root).

From the error message it seems you might also need to add an explicit step in your dockerfile: mkdir -p $HOME/.cache/deno/gen. This issue might fix itself if you run as a normal user.
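
A minimal sketch of what that could look like in the Dockerfile (the user name and UID here are arbitrary examples, not something pgsc_calc requires):

# hypothetical Dockerfile fragment: create and switch to a non-root user,
# and pre-create the cache directory quarto/deno wants to write to
RUN useradd --create-home --uid 1000 pgsuser && \
    mkdir -p /home/pgsuser/.cache/deno/gen && \
    chown -R pgsuser:pgsuser /home/pgsuser/.cache
USER pgsuser
ENV HOME=/home/pgsuser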

An alternative might be to run the test,conda profile in your Dockerfile. This way the image will contain pre-built conda environments, and will be ready to use. I'm pleasantly surprised singularity works inside docker so well.
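
A rough sketch of that approach (the --outdir value is just a placeholder; the point is to trigger conda environment creation at image build time so it is cached in the image):

# hypothetical Dockerfile fragment: pre-build the conda environments during the image build
RUN nextflow run pgscatalog/pgsc_calc -profile test,conda --outdir /tmp/pgsc_test && \
    rm -rf /tmp/pgsc_test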