Closed peterthorpe5 closed 4 years ago
Hi Dom,
I have been using blobtools for a number of years now. I am assembling 4 fish genomes, Illumina only, so hundreds of thousands of contigs each, for ~1 Gbp genomes.
blastn -task megablast -query scaffolds.fasta -db nt -outfmt '6 qseqid staxids bitscore std scomnames sscinames sblastnames sskingdoms stitle' -evalue 1e-20 -out n.clc.allfinal.out -num_threads 16
blobtools create -i scaffolds.fasta -s xr_scaff.sam -t n.clc.allfinal.out -o xr_V1.blobplots
This runs for ~1.5 hours, then just stops:
blobtools create -i scaffolds.fasta -s xc.sam -t n.clc.allfinal.out -o test
[+] Parsing FASTA - scaffolds.fasta
[+] names.dmp/nodes.dmp not specified. Retrieving nodesDB from /conda/envs/python27/opt/blobtools-1.0.1/data/nodesDB.txt
[%] 100%
[+] Parsing tax0 - /storage/fish_genomes/xc/n.clc.allfinal.out
Then nothing. The node this is running on has 500 GB RAM, and it isn't running out of RAM.
head n.clc.allfinal.out (Seq, taxid, bit score .. the rest... )
NODE_1_length_118370_cov_12.234714 32473 1284 NODE_1_length_118370_cov_12.234714 XM_028031624.1 79.617 1933 282
NODE_1_length_118370_cov_12.234714 8083 1282
head *.sam
@SQ SN:NODE_1_length_210002_cov_18.361000 LN:210002
@SQ SN:NODE_2_length_168846_cov_18.635259 LN:168846
@SQ SN:NODE_3_length_144837_cov_19.065304 LN:144837
Can you please share some wisdom on how to solve this?
It was a resource limitation problem. I split it into 100 FASTA files and it worked.
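For anyone hitting the same stall, here is a minimal sketch of one way to do the splitting step. The thread does not say how the 100 files were actually produced, so the function name `split_fasta`, the `chunk_NNN.fasta` naming, and the round-robin assignment are all assumptions for illustration. It streams the input, so it never holds the whole assembly in memory.

```python
from pathlib import Path


def split_fasta(fasta_path, n_chunks, out_dir):
    """Deal FASTA records round-robin into n_chunks files (hypothetical helper).

    Returns the list of chunk paths. Lines before the first '>' header
    are skipped, since they do not belong to any record.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: chunk_000.fasta, chunk_001.fasta, ...
    out_paths = [out_dir / f"chunk_{i:03d}.fasta" for i in range(n_chunks)]
    handles = [p.open("w") for p in out_paths]
    try:
        record_index = -1  # no record header seen yet
        with open(fasta_path) as fasta:
            for line in fasta:
                if line.startswith(">"):
                    record_index += 1  # a new record starts here
                if record_index >= 0:
                    # Header and sequence lines of one record go to the same chunk
                    handles[record_index % n_chunks].write(line)
    finally:
        for handle in handles:
            handle.close()
    return out_paths


if __name__ == "__main__":
    import sys

    # Example: python split_fasta.py scaffolds.fasta 100 chunks/
    for path in split_fasta(sys.argv[1], int(sys.argv[2]), sys.argv[3]):
        print(path)
```

Round-robin keeps the chunks roughly even in record count when contigs are sorted by length, as SPAdes-style NODE_ names usually are. Whether the BLAST hits and SAM files also need to be subset to match each chunk is not stated in this thread, so treat that as something to check.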
cheers