4ctrl-alt-del closed this issue 4 years ago
The output references a report file that I think is saved somewhere on NRP, but I have no idea how to look for it, let alone retrieve it.
Attempted to run it again last night on NRP. It failed with a completely different error:
Error executing process > 'nr_index'
Caused by:
Host is unreachable (Host unreachable)
WARN: Killing pending tasks (2)
@4ctrl-alt-del The first error looks like Nextflow itself segfaulted. This has happened to me before, and I think it's usually something you can ignore and just run again. As for the second error, you can zoom in on the nr_index process to see why it is happening.
How do I "zoom in"?
@4ctrl-alt-del We can chat about zooming in, since that's just debugging with Nextflow.
After several updates, AnnoTater still crashes on NRP, but with different symptoms than when we met last week, @spficklin.
Nextflow Output:
$ nextflow -C nextflow.config kuberun systemsgenetics/AnnoTater -v deepgtex-prp -config k8s
Pod started: sleepy-stonebraker
N E X T F L O W ~ version 19.07.0
Launching `systemsgenetics/AnnoTater` [sleepy-stonebraker] - revision: 3a6106836d [master]
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [3fc8a01d8b]
General Information:
--------------------
Profile(s): standard
Container Engine: null
Input Files:
-----------------
Transcript (mRNA) file: /workspace/alucinor/examples/Citrus_sinensis-orange1.1g015632m.g.fasta
Data Files:
-----------------
InterProScan data: /workspace/alucinor/dbs/interproscan/interproscan-5.36-75.0/data
Panther data: /workspace/alucinor/dbs/panther/panther
NCBI nr data: /workspace/alucinor/dbs/nr
Uniprot SwissProt data: null
Output Parameters:
------------------
Output directory: /workspace/alucinor/output
WARN: The channel `create` method is deprecated -- it will be removed in a future release
WARN: The channel `create` method is deprecated -- it will be removed in a future release
WARN: The channel `create` method is deprecated -- it will be removed in a future release
[1d/66d9bf] Submitted process > uniprot_sprot_index
[57/216ad9] Submitted process > interproscan (1)
[57/216ad9] NOTE: Process `interproscan (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[c2/4bf64d] Submitted process > interproscan (2)
[c2/4bf64d] NOTE: Process `interproscan (2)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[07/a2081b] Submitted process > interproscan (3)
[07/a2081b] NOTE: Process `interproscan (3)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[cc/46fac4] Submitted process > interproscan (4)
[cc/46fac4] NOTE: Process `interproscan (4)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[37/030adf] Submitted process > interproscan (5)
[1d/66d9bf] NOTE: Process `uniprot_sprot_index` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[ef/8bb7c4] Submitted process > interproscan (6)
[37/030adf] NOTE: Process `interproscan (5)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[5f/7be243] Submitted process > interproscan (8)
[ef/8bb7c4] NOTE: Process `interproscan (6)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[51/2477cc] Submitted process > interproscan (7)
[51/2477cc] NOTE: Process `interproscan (7)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[51/64ec62] Submitted process > interproscan (9)
[51/64ec62] NOTE: Process `interproscan (9)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[87/5b15f3] Re-submitted process > interproscan (1)
[87/5b15f3] NOTE: Process `interproscan (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[36/b3eac3] Re-submitted process > interproscan (2)
[36/b3eac3] NOTE: Process `interproscan (2)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[ba/1fc40b] Re-submitted process > interproscan (3)
[ba/1fc40b] NOTE: Process `interproscan (3)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[b0/eab735] Re-submitted process > interproscan (4)
[b0/eab735] NOTE: Process `interproscan (4)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[ac/c9c649] Re-submitted process > uniprot_sprot_index
[ac/c9c649] NOTE: Process `uniprot_sprot_index` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[94/8ae03b] Re-submitted process > interproscan (5)
[94/8ae03b] NOTE: Process `interproscan (5)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[b3/c6bdca] Re-submitted process > interproscan (6)
[b3/c6bdca] NOTE: Process `interproscan (6)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[9c/6752db] Re-submitted process > interproscan (7)
[9c/6752db] NOTE: Process `interproscan (7)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[93/a6465c] Re-submitted process > interproscan (9)
[93/a6465c] NOTE: Process `interproscan (9)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
[cc/7e335a] Re-submitted process > interproscan (1)
[cc/7e335a] NOTE: Process `interproscan (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[bf/b77aac] Re-submitted process > interproscan (2)
[bf/b77aac] NOTE: Process `interproscan (2)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[a2/beeb6b] Re-submitted process > interproscan (3)
[a2/beeb6b] NOTE: Process `interproscan (3)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[65/b8a814] Re-submitted process > interproscan (4)
[5f/7be243] NOTE: Process `interproscan (8)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
[5b/572060] Re-submitted process > uniprot_sprot_index
[65/b8a814] NOTE: Process `interproscan (4)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[7b/404af2] Re-submitted process > interproscan (5)
[5b/572060] NOTE: Process `uniprot_sprot_index` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[2c/93f504] Re-submitted process > interproscan (6)
[2c/93f504] NOTE: Process `interproscan (6)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[c1/45e520] Re-submitted process > interproscan (7)
[c1/45e520] NOTE: Process `interproscan (7)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[e6/556203] Re-submitted process > interproscan (9)
[e6/556203] NOTE: Process `interproscan (9)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[af/60d69b] Re-submitted process > interproscan (1)
[7b/404af2] NOTE: Process `interproscan (5)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (3)
[4d/04d4bb] Re-submitted process > interproscan (2)
Error executing process > 'interproscan (1)'
Caused by:
Process `interproscan (1)` terminated for an unknown reason -- Likely it has been terminated by the external system
Command executed:
# Call InterProScan on a single sequence.
/usr/local/interproscan/interproscan.sh -f TSV,XML --goterms --input Citrus_sinensis-orange1.1g015632m.g.1.fasta --iprlookup --pathways --seqtype n --cpu 2 --output-dir . --mode standalone --applications TIGRFAM,SFLD,SUPERFAMILY,Gene3D,Hamap,Coils,ProSiteProfiles,SMART,CDD,PRINTS,Pfam,MobiDBLite,PIRSF,PANTHER,ProDom
# Remove the temp directory created by InterProScan
rm -rf ./temp
Command exit status:
-
Command output:
(empty)
Command wrapper:
failed to open log file "/var/log/pods/deepgtex-prp_nf-af60d69b8398944dafaa6751f7bfd421_b4dc3618-aae1-42a6-9cbb-1111d4b8b591/nf-af60d69b8398944dafaa6751f7bfd421/0.log": open /var/log/pods/deepgtex-prp_nf-af60d69b8398944dafaa6751f7bfd421_b4dc3618-aae1-42a6-9cbb-1111d4b8b591/nf-af60d69b8398944dafaa6751f7bfd421/0.log: no such file or directory
Work dir:
/workspace/alucinor/work/af/60d69b8398944dafaa6751f7bfd421
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
WARN: Killing pending tasks (1)
[cd/c3d25f] Re-submitted process > interproscan (3)
Listing the pods shows the Nextflow pods failing like so (edited to show only the relevant ones):
nf-0078adb7054e494d4126beb5997be320 0/1 ContainerCannotRun 0
nf-07a2081bb14b91da55ea20cdfb83f7ce 0/1 ContainerCannotRun 0
nf-16cf64fed2f31a71c5b6a10074adb37d 0/1 ContainerCannotRun 0
nf-1d66d9bfa25327a55b199d486f390157 0/1 ContainerCannotRun 0
nf-1e3d82e0a449ce29490dd222d74bba7f 0/1 ContainerCannotRun 0
nf-1e3f38e6389bfab8b5769a4d320d00f8 0/1 ContainerCannotRun 0
nf-2c93f504a81c1d580efeaa1272b57323 0/1 ContainerCannotRun 0
nf-350f991eeb346fa80f424230df61aeba 0/1 ContainerCannotRun 0
nf-36b3eac3752b61ab4a014f431c374b61 0/1 ContainerCannotRun 0
nf-37030adf4bb3cf15eabcb17bc386c867 0/1 ContainerCannotRun 0
nf-37c39c333f5f8a0619af8524be89eef7 0/1 ContainerCannotRun 0
nf-3d3ca5e92338652b4a527072b913cb9b 0/1 ContainerCannotRun 0
nf-40a98fa862a3671be782de37c69ceb38 0/1 ContainerCannotRun 0
nf-4fbf7849e74c02f2db6116cd25e5fac6 0/1 ContainerCannotRun 0
nf-512477cc0a4695d3d5157ad772c9b5ee 0/1 ContainerCannotRun 0
nf-5164ec624049083330ae3472e66171bb 0/1 ContainerCannotRun 0
nf-57216ad99d5a2598c2d46ab686f42ed0 0/1 ContainerCannotRun 0
nf-5b572060f014127d19ce649de3f5fccd 0/1 ContainerCannotRun 0
nf-5d9129748cb8da1cff983f40ca105613 0/1 ContainerCannotRun 0
nf-5f7be243d47a2f3d423d308505f7033b 0/1 ContainerCannotRun 0
nf-60f900d6c4d4041765523e21e1900602 0/1 ContainerCannotRun 0
nf-65b8a814cfd1f45cd23de76df03e44a2 0/1 ContainerCannotRun 0
nf-74de832c08ae08cf5b54080a567d5d97 0/1 ContainerCannotRun 0
nf-7a8b4b7ae0069a01b833502cd214ad50 0/1 ContainerCannotRun 0
nf-7b404af203b317917cf042b697eac010 0/1 ContainerCannotRun 0
nf-7bdb137a1555f4569c38fea6e9ffa7f7 0/1 ContainerCannotRun 0
nf-835cafb1458000c2569223baa00fc988 0/1 ContainerCannotRun 0
nf-875b15f357a68b66c2433786d4a9ca06 0/1 ContainerCannotRun 0
nf-90f95c296a72831d0bf3293e27881192 0/1 ContainerCannotRun 0
nf-93a6465c0a758758cb6ee389939657e7 0/1 ContainerCannotRun 0
nf-948ae03b8896ed77c249625b45731225 0/1 ContainerCannotRun 0
nf-9653ded58c2ef48c0030bce88bd32737 0/1 ContainerCannotRun 0
nf-9c6752db58268bb36619ab9711d8a2a4 0/1 ContainerCannotRun 0
nf-a2beeb6b27374abeb8f8dfa773a29c95 0/1 ContainerCannotRun 0
nf-acc9c649c27e80707007f8a4252c6db5 0/1 ContainerCannotRun 0
nf-af60d69b8398944dafaa6751f7bfd421 0/1 ContainerCannotRun 0
nf-b0eab7351dcc264ba82358b25a0cae48 0/1 ContainerCannotRun 0
nf-b3c6bdca7c95b5462994caf0a1678443 0/1 ContainerCannotRun 0
nf-ba1fc40bee86602fdb360e253c65e813 0/1 ContainerCannotRun 0
nf-bfb77aac557543fcc057c13d03939a35 0/1 ContainerCannotRun 0
nf-c145e5205552aa9d3ffb03e1b65a21b4 0/1 ContainerCannotRun 0
nf-c24bf64d574036b12d465f5e257c3539 0/1 ContainerCannotRun 0
nf-c4936b7d5aa3ca588f991b61eadb01cd 0/1 ContainerCannotRun 0
nf-c55f1a09ca91e605b2cb7225b3fec9d6 0/1 ContainerCannotRun 0
nf-cc46fac44d29439581b7c628159f21e9 0/1 ContainerCannotRun 0
nf-cc7e335ac8ab3e7926ee0c5e9f0d7130 0/1 ContainerCannotRun 0
nf-d83a66947a8b5467aea005933f2175dc 0/1 ContainerCannotRun 0
nf-da702cde232c10e27e66c6cf8db8d296 0/1 ContainerCannotRun 0
nf-dd5a33d85229589098906ef25c6f25bf 0/1 ContainerCannotRun 0
nf-e64b4e480d77a0f670cbf1a06f83d5a2 0/1 ContainerCannotRun 0
nf-e65562036991862d43d5755bc0d2a7f6 0/1 ContainerCannotRun 0
nf-eb503542269fa919938d75a76b479f58 0/1 ContainerCannotRun 0
nf-edc41c1ce3186f9b953e129c35a68c95 0/1 ContainerCannotRun 0
nf-edf90d5b07682ff32dcd1ada9546d959 0/1 ContainerCannotRun 0
nf-ef8bb7c4338ec8c8b07f074b77d8ee2e 0/1 ContainerCannotRun 0
nf-f44fdf4f347459801c6213acc98cef40 0/1 ContainerCannotRun 0
nf-f85d06a143a58008ba732034f05b2d14 0/1 ContainerCannotRun 0
nf-faed0088f3308a96d0660ba3cfd82626 0/1 ContainerCannotRun 0
I ran it twice and it produced the identical crash both times.
You can debug further by inspecting the output of an individual pod:
kubectl logs <pod-name>
You can also inspect the work directory of the process that failed (at least the one that Nextflow prints when it terminates):
cd /workspace/alucinor/work/af/60d69b8398944dafaa6751f7bfd421
ls -al
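In the log above, the failing task's work directory (`/workspace/alucinor/work/af/60d69b8398944dafaa6751f7bfd421`) and its pod (`nf-af60d69b8398944dafaa6751f7bfd421`) share the same hash, so one can be derived from the other. Here is a minimal sketch, assuming that naming pattern holds for every task. The pattern is inferred from this log, not from Nextflow documentation, and `workdir_to_pod` is a hypothetical helper name:

```shell
# Derive the pod name from a Nextflow work directory path.
# Assumption: pods are named "nf-" followed by the last two path
# components joined together, as observed in the log above.
workdir_to_pod() {
    local dir="${1%/}"               # strip a trailing slash, if any
    local hash_tail="${dir##*/}"     # e.g. 60d69b8398944dafaa6751f7bfd421
    local parent="${dir%/*}"
    local hash_head="${parent##*/}"  # e.g. af
    printf 'nf-%s%s\n' "$hash_head" "$hash_tail"
}
```

The result can then be passed to `kubectl logs` or `kubectl describe pod` for the task whose work directory Nextflow printed.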
Also, @4ctrl-alt-del, make sure you delete any dangling pods that are left over by a workflow. Nextflow should delete them for you when it exits, but sometimes it doesn't clean them up properly. Here's a command you can use to delete all of the "ContainerCannotRun" pods in batch:
kubectl delete pods $(kubectl get pods --no-headers | grep 'ContainerCannotRun' | awk '{ print $1 }')
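One caveat with the one-liner above: when nothing matches, `kubectl delete pods` is called without any names and exits with an error. The following is a sketch of a slightly more defensive variant that matches the STATUS column exactly and skips the delete when the list is empty. The function names are illustrative, and the column position assumes the default `kubectl get pods` layout (NAME, READY, STATUS, ...):

```shell
# Print the names of pods whose STATUS column equals the given status.
# Reads `kubectl get pods --no-headers` output on stdin.
pods_with_status() {
    awk -v status="$1" '$3 == status { print $1 }'
}

# Delete every pod in the current namespace that has the given status.
delete_pods_with_status() {
    local names
    names=$(kubectl get pods --no-headers | pods_with_status "$1")
    if [ -z "$names" ]; then
        echo "no pods with status $1"
        return 0
    fi
    # Word splitting is intended here: one argument per pod name.
    kubectl delete pods $names
}
```

Calling `delete_pods_with_status ContainerCannotRun` would then remove only the pods shown in the listing above and leave running pods alone.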
kubectl logs
failed to open log file "/var/log/pods/deepgtex-prp_nf-f5c1244025c3be17f2b1a93ef8d61276_7bfa933c-1a09-40a5-a4b4-528ac1a0a142/nf-f5c1244025c3be17f2b1a93ef8d61276/0.log": open /var/log/pods/deepgtex-prp_nf-f5c1244025c3be17f2b1a93ef8d61276_7bfa933c-1a09-40a5-a4b4-528ac1a0a142/nf-f5c1244025c3be17f2b1a93ef8d61276/0.log: no such file or directory
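When a container never starts, there is no log file for `kubectl logs` to open, which matches the error above. The reason is usually recorded in the pod status instead. Here is a sketch, assuming the pod has a single container (as the `0/1` READY column suggests) and that Kubernetes recorded it as terminated. `pod_failure_reason` is an illustrative name; the `state.terminated.reason` and `state.terminated.message` fields are part of the standard pod status:

```shell
# Print "<reason>: <message>" for the first container of a pod.
# Assumption: the container state is "terminated"; for a container that
# is still waiting, these fields are empty and only ": " is printed.
pod_failure_reason() {
    kubectl get pod "$1" \
        -o jsonpath='{.status.containerStatuses[0].state.terminated.reason}{": "}{.status.containerStatuses[0].state.terminated.message}{"\n"}'
}
```

`kubectl describe pod <pod-name>` shows the same status fields together with the pod's events, which may be easier to read interactively.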
Looking at /workspace/alucinor/work/...:
total 10
drwxr-xr-x 1 root root 3 Oct 28 20:07 .
drwxr-xr-x 1 root root 2 Oct 28 20:07 ..
-rw-r--r-- 1 root root 342 Oct 28 20:07 .command.log
-rw-r--r-- 1 root root 8606 Oct 28 20:07 .command.run
-rw-r--r-- 1 root root 130 Oct 28 20:07 .command.sh
cat .command.log:
failed to open log file "/var/log/pods/deepgtex-prp_nf-ca10407be0e24fb80121ed298ecb6d23_216a3a19-406f-4ab9-a847-4fbba65abeb6/nf-ca10407be0e24fb80121ed298ecb6d23/0.log": open /var/log/pods/deepgtex-prp_nf-ca10407be0e24fb80121ed298ecb6d23_216a3a19-406f-4ab9-a847-4fbba65abeb6/nf-ca10407be0e24fb80121ed298ecb6d23/0.log: no such file or directory
cat .command.run
#!/bin/bash
# NEXTFLOW TASK: uniprot_sprot_index
set -e
set -u
NXF_DEBUG=${NXF_DEBUG:=0}; [[ $NXF_DEBUG > 1 ]] && set -x
NXF_ENTRY=${1:-nxf_main}
nxf_tree() {
local pid=$1
declare -a ALL_CHILDREN
while read P PP;do
ALL_CHILDREN[$PP]+=" $P"
done < <(ps -e -o pid= -o ppid=)
pstat() {
local x_pid=$1
local STATUS=$(2> /dev/null < /proc/$1/status egrep 'Vm|ctxt')
if [ $? = 0 ]; then
local x_vsz=$(echo "$STATUS" | grep VmSize | awk '{print $2}' || echo -n '0')
local x_rss=$(echo "$STATUS" | grep VmRSS | awk '{print $2}' || echo -n '0')
local x_peak=$(echo "$STATUS" | egrep 'VmPeak|VmHWM' | sed 's/^.*:\s*//' | sed 's/[\sa-zA-Z]*$//' | tr '\n' ' ' || echo -n '0 0')
local x_pmem=$(awk -v rss=$x_rss -v mem_tot=$mem_tot 'BEGIN {printf "%.0f", rss/mem_tot*100*10}' || echo -n '0')
local vol_ctxt=$(echo "$STATUS" | grep '\bvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
local inv_ctxt=$(echo "$STATUS" | grep '\bnonvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
cpu_stat[x_pid]="$x_pid $x_pmem $x_vsz $x_rss $x_peak $vol_ctxt $inv_ctxt"
fi
}
pwalk() {
pstat $1
for i in ${ALL_CHILDREN[$1]:=}; do pwalk $i; done
}
pwalk $1
}
nxf_stat() {
cpu_stat=()
nxf_tree $1
declare -a sum=(0 0 0 0 0 0 0 0)
local pid
local i
for pid in "${!cpu_stat[@]}"; do
local row=(${cpu_stat[pid]})
[ $NXF_DEBUG = 1 ] && echo "++ stat mem=${row[*]}"
for i in "${!row[@]}"; do
if [ $i != 0 ]; then
sum[i]=$((sum[i]+row[i]))
fi
done
done
[ $NXF_DEBUG = 1 ] && echo -e "++ stat SUM=${sum[*]}"
for i in {1..7}; do
if [ ${sum[i]} -lt ${cpu_peak[i]} ]; then
sum[i]=${cpu_peak[i]}
else
cpu_peak[i]=${sum[i]}
fi
done
[ $NXF_DEBUG = 1 ] && echo -e "++ stat PEAK=${sum[*]}\n"
nxf_stat_ret=(${sum[*]})
}
nxf_sleep() {
sleep $1 2>/dev/null || sleep 1;
}
nxf_mem_watch() {
set -o pipefail
local pid=$1
local trace_file=.command.trace
local count=0;
declare -a cpu_stat=(0 0 0 0 0 0 0 0)
declare -a cpu_peak=(0 0 0 0 0 0 0 0)
local mem_tot=$(< /proc/meminfo grep MemTotal | awk '{print $2}')
local timeout
local DONE
local STOP=''
[ $NXF_DEBUG = 1 ] && nxf_sleep 0.2 && ps fx
while true; do
nxf_stat $pid
if [ $count -lt 10 ]; then timeout=1;
elif [ $count -lt 120 ]; then timeout=5;
else timeout=30;
fi
read -t $timeout -r DONE || true
[[ $DONE ]] && break
if [ ! -e /proc/$pid ]; then
[ ! $STOP ] && STOP=$(nxf_date)
[ $(($(nxf_date)-STOP)) -gt 10000 ] && break
fi
count=$((count+1))
done
echo "%mem=${nxf_stat_ret[1]}" >> $trace_file
echo "vmem=${nxf_stat_ret[2]}" >> $trace_file
echo "rss=${nxf_stat_ret[3]}" >> $trace_file
echo "peak_vmem=${nxf_stat_ret[4]}" >> $trace_file
echo "peak_rss=${nxf_stat_ret[5]}" >> $trace_file
echo "vol_ctxt=${nxf_stat_ret[6]}" >> $trace_file
echo "inv_ctxt=${nxf_stat_ret[7]}" >> $trace_file
}
nxf_write_trace() {
echo "nextflow.trace/v2" > $trace_file
echo "realtime=$wall_time" >> $trace_file
echo "%cpu=$ucpu" >> $trace_file
echo "rchar=${io_stat1[0]}" >> $trace_file
echo "wchar=${io_stat1[1]}" >> $trace_file
echo "syscr=${io_stat1[2]}" >> $trace_file
echo "syscw=${io_stat1[3]}" >> $trace_file
echo "read_bytes=${io_stat1[4]}" >> $trace_file
echo "write_bytes=${io_stat1[5]}" >> $trace_file
}
nxf_trace_mac() {
local start_millis=$(nxf_date)
/bin/bash -ue /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/.command.sh
local end_millis=$(nxf_date)
local wall_time=$((end_millis-start_millis))
local ucpu=''
local io_stat1=('' '' '' '' '' '')
nxf_write_trace
}
nxf_trace_linux() {
local pid=$$
local num_cpus=$(< /proc/cpuinfo grep '^processor' -c)
local tot_time0=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
local cpu_time0=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
local io_stat0=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
local start_millis=$(nxf_date)
command -v ps &>/dev/null || { >&2 echo "Command 'ps' required by nextflow to collect task metrics cannot be found"; exit 1; }
/bin/bash -ue /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/.command.sh &
local task=$!
exec 10> >(nxf_mem_watch $task)
local mem_proc=$!
wait $task
local end_millis=$(nxf_date)
local tot_time1=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
local cpu_time1=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
local ucpu=$(awk -v p1=$cpu_time1 -v p0=$cpu_time0 -v t1=$tot_time1 -v t0=$tot_time0 -v n=$num_cpus 'BEGIN { pct=(p1-p0)/(t1-t0)*100*n; printf("%.0f", pct>0 ? pct : 0) }' )
local io_stat1=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
local i
for i in {0..5}; do
io_stat1[i]=$((io_stat1[i]-io_stat0[i]))
done
local wall_time=$((end_millis-start_millis))
[ $NXF_DEBUG = 1 ] && echo "+++ STATS %CPU=$ucpu TIME=$wall_time I/O=${io_stat1[*]}"
echo "nextflow.trace/v2" > $trace_file
echo "realtime=$wall_time" >> $trace_file
echo "%cpu=$ucpu" >> $trace_file
echo "rchar=${io_stat1[0]}" >> $trace_file
echo "wchar=${io_stat1[1]}" >> $trace_file
echo "syscr=${io_stat1[2]}" >> $trace_file
echo "syscw=${io_stat1[3]}" >> $trace_file
echo "read_bytes=${io_stat1[4]}" >> $trace_file
echo "write_bytes=${io_stat1[5]}" >> $trace_file
echo 'DONE' >&10
wait $mem_proc 2>/dev/null || true
while [ -e /proc/$mem_proc ]; do nxf_sleep 0.1; done
[ ${NXF_OWNER:=''} ] && chown -fR --from root $NXF_OWNER /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/{*,.*} || true
}
nxf_trace() {
local trace_file=.command.trace
touch $trace_file
if [[ $(uname) = Darwin ]]; then
nxf_trace_mac
else
nxf_trace_linux
fi
}
nxf_date() {
local ts=$(date +%s%3N); [[ $ts == *3N ]] && date +%s000 || echo $ts
}
nxf_env() {
echo '============= task environment ============='
env | sort | sed "s/\(.*\)AWS\(.*\)=\(.\{6\}\).*/\1AWS\2=\3xxxxxxxxxxxxx/"
echo '============= task output =================='
}
nxf_kill() {
declare -a children
while read P PP;do
children[$PP]+=" $P"
done < <(ps -e -o pid= -o ppid=)
kill_all() {
[[ $1 != $$ ]] && kill $1 2>/dev/null || true
for i in ${children[$1]:=}; do kill_all $i; done
}
kill_all $1
}
nxf_mktemp() {
local base=${1:-/tmp}
if [[ $(uname) = Darwin ]]; then mktemp -d $base/nxf.XXXXXXXXXX
else TMPDIR="$base" mktemp -d -t nxf.XXXXXXXXXX
fi
}
on_exit() {
exit_status=${nxf_main_ret:=$?}
printf $exit_status > /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/.exitcode
set +u
[[ "$tee1" ]] && kill $tee1 2>/dev/null
[[ "$tee2" ]] && kill $tee2 2>/dev/null
[[ "$ctmp" ]] && rm -rf $ctmp || true
exit $exit_status
}
on_term() {
set +e
[[ "$pid" ]] && nxf_kill $pid
}
nxf_launch() {
/bin/bash /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/.command.run nxf_trace
}
nxf_stage() {
true
}
nxf_unstage() {
true
[[ ${nxf_main_ret:=0} != 0 ]] && return
}
nxf_main() {
trap on_exit EXIT
trap on_term TERM INT USR1 USR2
NXF_SCRATCH=''
[[ $NXF_DEBUG > 0 ]] && nxf_env
touch /workspace/alucinor/work/ca/10407be0e24fb80121ed298ecb6d23/.command.begin
set +u
set -u
[[ $NXF_SCRATCH ]] && echo "nxf-scratch-dir $HOSTNAME:$NXF_SCRATCH" && cd $NXF_SCRATCH
nxf_stage
set +e
local ctmp=$(set +u; nxf_mktemp /dev/shm 2>/dev/null || nxf_mktemp $TMPDIR)
local cout=$ctmp/.command.out; mkfifo $cout
local cerr=$ctmp/.command.err; mkfifo $cerr
tee .command.out < $cout &
tee1=$!
tee .command.err < $cerr >&2 &
tee2=$!
( nxf_launch ) >$cout 2>$cerr &
pid=$!
wait $pid || nxf_main_ret=$?
wait $tee1 $tee2
nxf_unstage
}
$NXF_ENTRY
cat .command.sh
diamond makedb --threads 2 --in /annotater/uniprot_sprot/uniprot_sprot.fasta --db uniprot_sprot
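The `.command.run` wrapper above touches `.command.begin` at the start of `nxf_main` and writes `.exitcode` in `on_exit`. The directory listing shows neither file, which suggests the wrapper never ran at all, consistent with the `ContainerCannotRun` pod status. Below is a sketch that reports how far a task got based on those marker files; `task_stage` is a hypothetical helper, and the wording of the stages is mine:

```shell
# Report how far a Nextflow task got, based on marker files in its work dir.
task_stage() {
    local dir="$1"
    if [ ! -e "$dir/.command.run" ]; then
        echo "not staged"                 # Nextflow never wrote the wrapper
    elif [ ! -e "$dir/.command.begin" ]; then
        echo "never started"              # wrapper exists but nxf_main never ran
    elif [ ! -e "$dir/.exitcode" ]; then
        echo "started, no exit code"      # still running, or killed before on_exit
    else
        echo "exited with $(cat "$dir/.exitcode")"
    fi
}
```

For the work directory listed above, this check would fall into the "never started" branch, which points at container creation and not at the `diamond` command itself.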
I am at a complete loss. Am I typing the basic command wrong? This is what I run:
nextflow -C custom_nextflow.conf kuberun SystemsGenetics/AnnoTater -v deepgtex-prp -profile k8s
The containers now at least run on NRP. They immediately enter an error state because the diamond and interproscan programs fail, but they now get past container creation and produce meaningful failure logs, so I am closing this. A new issue has been opened for the new type of AnnoTater failure.
When attempting to run the orange example on NRP it failed after running for about 80 minutes with the following error: