Closed by necrolyte2 9 years ago
It appears this happens when the input does not divide evenly among the requested number of instances (in the log below, a wc_input_length of 29024 across 3 instances leaves a remainder of 2). Instead of placing the remainder sequences into the last split file, the script creates an additional split input file for blast. The pipeline then starts an additional blast process to handle that file.
I'm assuming the logic was that this last file should be very small, so it should not take long to process.
It seems the remainder should just be appended to the last split file instead of incrementing ninst.
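To make the suggestion concrete, here is a minimal sketch of the split arithmetic with the remainder folded into the last file. It is not the code in par_block_blast.pl (which is Perl). The function name `plan_splits` and the `lines_per_record` parameter are hypothetical, and it assumes the FASTA input has exactly two lines per sequence (header plus one sequence line), which happens to reproduce the split length and remainder shown in the log below.

```python
def plan_splits(total_lines, ninst, lines_per_record=2):
    """Return the number of lines for each split file.

    Hypothetical helper: assumes a FASTA file with a fixed number of
    lines per record, so records are never cut in half.
    """
    if ninst < 1:
        raise ValueError("ninst must be at least 1")
    if total_lines % lines_per_record != 0:
        raise ValueError("input is not a whole number of records")

    records = total_lines // lines_per_record
    # Never plan more split files than there are records.
    ninst = max(1, min(ninst, records))

    per_split = records // ninst   # whole records per split file
    leftover = records % ninst     # records that do not divide evenly

    sizes = [per_split * lines_per_record] * ninst
    # Fold the leftover records into the last split file
    # instead of creating an extra file (and an extra blast process).
    sizes[-1] += leftover * lines_per_record
    return sizes


if __name__ == "__main__":
    # Values taken from the log: 29024 input lines, --ninst 3.
    print(plan_splits(29024, 3))
```

With the numbers from the log, this plans three files of 9674, 9674 and 9676 lines, so the number of blast processes stays at the requested 3.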
[cmd] mkdir -p tmp_contig_2
[cmd] /media/VD_Research/Admin/PBS/Software/pathdiscov-4.2.2/pathdiscov/iterative_blast_phylo/par_block_blast.pl --outputdir tmp_contig_2 --inputfasta /home/AMED/tyghe.vallard/Issues/Issue_9777/VGD7/results/iterative_blast_phylo_1/2.contig.fasta --db /media/VD_Research/databases/ncbi/blast/nt/nt --blast_type dc-megablast --task dc-megablast --ninst 3 --outfile /home/AMED/tyghe.vallard/Issues/Issue_9777/VGD7/results/iterative_blast_phylo_1/2.contig.blast --outheader /home/AMED/tyghe.vallard/Issues/Issue_9777/VGD7/results/iterative_blast_phylo_1/blast.header --blast_options "-evalue 1e-4 -word_size 12"
wc_input_length = 29024
instances = 3
wc_split_length = 9674
wc_remainder = 2
remainder nonzero, increment ninst
wc_input_length = 29024
instances = 4
wc_split_length = 9674
wc_remainder = 2
[echo] parent process: submit child PID 29830
[cmd] /media/VD_Research/Admin/PBS/Software/pathdiscov-4.2.2/pathdiscov/iterative_blast_phylo/blast_wrapper.pl --type dc-megablast --query tmpsplit000 --db /media/VD_Research/databases/ncbi/blast/nt/nt --task dc-megablast --out blastout000 --options "-evalue 1e-4 -word_size 12"
[echo] parent process: submit child PID 29831
[cmd] /media/VD_Research/Admin/PBS/Software/pathdiscov-4.2.2/pathdiscov/iterative_blast_phylo/blast_wrapper.pl --type dc-megablast --query tmpsplit001 --db /media/VD_Research/databases/ncbi/blast/nt/nt --task dc-megablast --out blastout001 --options "-evalue 1e-4 -word_size 12"
[echo] parent process: submit child PID 29833
[cmd] /media/VD_Research/Admin/PBS/Software/pathdiscov-4.2.2/pathdiscov/iterative_blast_phylo/blast_wrapper.pl --type dc-megablast --query tmpsplit002 --db /media/VD_Research/databases/ncbi/blast/nt/nt --task dc-megablast --out blastout002 --options "-evalue 1e-4 -word_size 12"
[cmd] /media/VD_Research/Admin/PBS/Software/pathdiscov-4.2.2/pathdiscov/iterative_blast_phylo/blast_wrapper.pl --type dc-megablast --query tmpsplit003 --db /media/VD_Research/databases/ncbi/blast/nt/nt --task dc-megablast --out blastout003 --options "-evalue 1e-4 -word_size 12"
[start]
[cmd] blastn -query tmpsplit000 -db /media/VD_Research/databases/ncbi/blast/nt/nt -task dc-megablast -out blastout000 -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore" -max_target_seqs 10 -evalue 1e-4 -word_size 12
[start]
[cmd] blastn -query tmpsplit002 -db /media/VD_Research/databases/ncbi/blast/nt/nt -task dc-megablast -out blastout002 -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore" -max_target_seqs 10 -evalue 1e-4 -word_size 12
[start]
[cmd] blastn -query tmpsplit003 -db /media/VD_Research/databases/ncbi/blast/nt/nt -task dc-megablast -out blastout003 -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore" -max_target_seqs 10 -evalue 1e-4 -word_size 12
[deltat] 456
[end]
I just started a pipeline run with
-c 3
and param.txt has 3 in it. However, there are 4 blastn processes running instead of 3.