Oshlack / MINTIE

Method for Identifying Novel Transcripts and Isoforms using Equivalence classes, in cancer and rare disease.
MIT License

One or more parallel stages aborted #8

Closed yanlina0205 closed 3 years ago

yanlina0205 commented 3 years ago

I have installed the tools successfully, but I can't run more than one case simultaneously with bpipe. Every time several cases reach the assemble stage at the same time, only the last processed case continues to run, while the others stop with the details below:

====================== Pipeline Failed ===========================

One or more parallel stages aborted. The following messages were reported:

---------------------------------------- assemble  ( 14 )  -----------------------------------------

Command in stage assemble failed with exit status = 137 : 

    rlens=`zcat 14/trim1.fastq.gz 14/trim2.fastq.gz |
        awk -v mrl=76 'BEGIN {minlen = mrl; maxlen = 0} {
            if (NR % 4 == 2) {
                rlen = length($1) ;
                if (rlen > maxlen) {maxlen = rlen}
                if (rlen < minlen) {minlen = rlen}
            }} END {print minlen" "maxlen}'` ;
    min_rlen=${rlens% *} ;
    max_rlen=${rlens#* } ;

    if [ ! -d /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ]; then
        mkdir /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ;
    fi ;
    cd /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ;

    echo "max_rd_len=$max_rlen" > config.config ;
    echo -e "[LIB]\nq1=../../14/trim1.fastq.gz\nq2=../../14/trim2.fastq.gz" >> config.config ;
    if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ;
    for k in 29 49 69 ; do
        if [ $k -gt $min_rlen ]; then
            echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ;
        else
            /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 6 ;
            /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans contig -g outputGraph_$k ;
            cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ;
        fi ;
    done ;

    cd ../../ ;
    /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/dedupe in=14/SOAPassembly/SOAP.fasta out=stdout.fa threads=6 overwrite=true |
        /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/fasta_formatter |
        awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > 14/14_denovo_filt.fasta ;
    if [ ! -s 14/14_denovo_filt.fasta ] ; then
        rm 14/14_denovo_filt.fasta ;
        echo "ERROR: de novo assembled contigs fasta file is empty." ;
        echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ;
        echo "formatter are correct, and their dependencies are installed." ;
    fi ;

----------------------------------------------------------------------------------------------------

Use 'bpipe errors' to see output from failed commands.

The content of the parameter file is:

-p threads=6
-p assembly_mem=100
-p assembler=soap
-p scores=33
-p min_read_length=76
-p min_contig_len=100
-p minQScore=20
-p Ks=29,49,69
-p min_gap=3
-p min_clip=20
-p min_match=30,0.3
-p min_logfc=2
-p min_cpm=0.1
-p fdr=0.05
-p sort_ram=4G
-p gene_filter=
-p var_filter=
-p splice_motif_mismatch=1
-p fastqCaseFormat=cases/%_R*.fastq.gz
-p fastqControlFormat=controls/%_R*.fastq.gz
-p assemblyFasta=
-p run_de_step=true

I don't know why I can't run several cases simultaneously. I look forward to your reply. Thanks.

mcmero commented 3 years ago

It looks like you may be running out of memory when running multiple assembly jobs. Are you running on a dedicated machine or through a cluster system? In general, I would not recommend running multiple cases in parallel unless you are in a cluster environment where you can spawn multiple jobs, as assembly can be quite memory intensive (up to 180G in some cases, depending on the sample). Also, note that the assembly_mem parameter is only used if you're running the Trinity or rnaSPAdes assemblers (soapdenovotrans doesn't let you specify a memory cap).
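
For context on the exit status in the log: shells report 128 + N when a command is killed by signal N, so 137 corresponds to signal 9 (SIGKILL). That is what the Linux out-of-memory killer, or a scheduler enforcing a memory limit, typically sends, which fits the memory explanation. A minimal sketch for decoding such a status follows; the function name `describe_exit_status` is hypothetical and is not part of MINTIE or bpipe.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MINTIE or bpipe): decode a shell exit status.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_exit_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        local sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with error code $status"
    fi
}

describe_exit_status 137
```

A SIGKILL does not prove an out-of-memory condition on its own; the kernel log or the scheduler's job accounting would be the place to confirm it.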

yanlina0205 commented 3 years ago

Thank you. I am running on a cluster system and can spawn multiple jobs, but it seems there is not enough memory available to me. I will reinstall and rerun it with the Trinity or rnaSPAdes assemblers.
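
As a rough way to judge how many assembly jobs a machine can hold at once, the available memory can be divided by a per-job estimate. The sketch below is only an illustration: `max_parallel_jobs` is a hypothetical helper, the 180 GB figure is the worst case mentioned above rather than a measured requirement, and reading `/proc/meminfo` assumes Linux. On a cluster, the memory granted to the job by the scheduler matters more than the node total.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many jobs fit in a memory budget (both values in GB).
max_parallel_jobs() {
    local avail_gb=$1 per_job_gb=$2
    # Integer division rounds down, so a partial fit counts as no fit.
    echo $((avail_gb / per_job_gb))
}

# Linux-only assumption: MemAvailable in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    avail_gb=$(awk '/^MemAvailable:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    # Fall back to 0 if the field is missing (for example on older kernels).
    echo "jobs that fit at 180 GB each: $(max_parallel_jobs "${avail_gb:-0}" 180)"
fi
```

A result of 0 or 1 suggests running the cases one after another rather than in parallel.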