bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

About out.xml too big #380

Open sher-l opened 4 years ago

sher-l commented 4 years ago

CMD: /cluster/apps/diamond/diamond blastx -q unigene.fasta -d /database/blastdb/Nr/nr -f 5 -o blast.xml -p 96 --max-target-seqs 1 -e 1e-5 --block-size 50 --long-reads --index-chunks 1

The blast.xml from diamond blastx is 18G, while the blast.xml from regular blastx is only about 4G. How can I make the blast.xml smaller?

bbuchfink commented 4 years ago

I'd recommend against using the XML format, but if you must you can try to compress the file with gzip, other than that I'm not sure how you would get it smaller.
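For reference, the gzip suggestion could look like the sketch below. The helper name `compress_report` is hypothetical and only there for illustration; it wraps a plain `gzip` call and assumes the report has already been written to disk.

```shell
# Hypothetical helper: compress a finished report in place.
# gzip replaces the input file with <file>.gz by default.
compress_report() {
  gzip -9 -- "$1"
}

# Example usage (assumes blast.xml exists in the current directory):
#   compress_report blast.xml
#   gzip -dc blast.xml.gz | head -n 20   # peek at the report without unpacking it
```

This shrinks the file on disk but does not reduce the number of reported alignments.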

sher-l commented 4 years ago

Maybe my problem description was not accurate. I mean, how can I reduce the number of results? blastx has -num_alignments; how can I control the number of alignments that diamond reports? I tried --max-target-seqs 1, but I still get 20 to 30. With blastx, -num_alignments controls the number: with -num_alignments 10, I get 10. Sorry about my poor English.

bbuchfink commented 4 years ago

I see, the --long-reads option is overriding your --max-target-seqs setting here. Don't use it; use --range-culling -F 15 instead. With --range-culling you will still get multiple hits for a query if they span different ranges of the query, so you can remove that option too if you don't want that.
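A sketch of the adjusted command, assuming the same paths and settings as in the original post. As far as I recall from the DIAMOND documentation, --long-reads is shorthand for --range-culling --top 10 -F 15, which would explain why it takes precedence over --max-target-seqs; treat that as an assumption and check it against your version. The snippet only assembles and prints the command so it can be reviewed before running.

```shell
# Same run as in the original post, with --long-reads replaced by
# --range-culling -F 15 so that --max-target-seqs 1 is honoured.
set -- diamond blastx \
  -q unigene.fasta \
  -d /database/blastdb/Nr/nr \
  -f 5 -o blast.xml \
  -p 96 \
  --max-target-seqs 1 \
  -e 1e-5 \
  --block-size 50 \
  --index-chunks 1 \
  --range-culling -F 15

# Keep a printable copy of the command line and show it for review.
cmd_line="$*"
echo "$cmd_line"

# To actually start the search, run the assembled arguments:
#   "$@"
```

Dropping `--range-culling` from the list as well should leave at most one target per query, as described above.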

sher-l commented 4 years ago

Oh, I see, thank you so much. I also used '--top' instead of '--max-target-seqs', and it seems useful.