nigyta / dfast_core

DDBJ Fast Annotation and Submission Tool

Possible for faster speed? #43

Closed Tonny-zhou closed 2 years ago

Tonny-zhou commented 2 years ago

Dear author,

Thanks for this great bacterial annotation tool! When examining the source code of DFAST, I found that the number of threads used for blastp is only 1. Is there any reason for this?

Yours, Tonny

nigyta commented 2 years ago

Thank you for your interest. DFAST splits a query FASTA file into smaller pieces and then invokes multiple BLASTP processes concurrently, with 1 CPU per process. In most cases, this is faster than using multiple CPUs for a single BLASTP process, although it may be less efficient in terms of memory usage.

nigyta
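
For readers who want to see the idea in code, here is a minimal sketch of the split-and-run-concurrently approach described above. It is not DFAST's actual implementation: the function names `split_fasta`, `blastp_command`, and `run_all` are hypothetical, and the round-robin distribution of records is an assumption made for illustration. The `blastp` options shown (`-query`, `-db`, `-out`, `-outfmt`, `-num_threads`) are standard BLAST+ options.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_fasta(fasta_text, n_chunks):
    """Distribute FASTA records round-robin into at most n_chunks strings."""
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records.append([line])        # header line starts a new record
        elif records and line.strip():
            records[-1].append(line)      # sequence line of the current record
    if not records:
        return []
    n = min(n_chunks, len(records))
    chunks = [[] for _ in range(n)]
    for i, record in enumerate(records):
        chunks[i % n].append("\n".join(record) + "\n")
    return ["".join(chunk) for chunk in chunks]


def blastp_command(query_path, db, out_path):
    """Build a single-threaded blastp command line (hypothetical helper)."""
    return [
        "blastp", "-query", query_path, "-db", db,
        "-out", out_path, "-outfmt", "6", "-num_threads", "1",
    ]


def run_all(commands, workers):
    """Run each command as its own process, at most `workers` at a time."""
    def run(cmd):
        return subprocess.run(cmd, check=True).returncode

    # Threads only wait on the child processes, so the real work
    # happens in parallel across the blastp processes themselves.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))
```

In use, each chunk would be written to its own temporary file, one command built per chunk, and `run_all` called with `workers` set to the number of CPUs available. Each process loads its own copy of the database, which is where the extra memory usage comes from.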

Tonny-zhou commented 2 years ago

I'm sorry to bother you again. Looking at the source code, I noticed that the orthosearch step does not split the query FASTA file or the reference FASTA file, and I found it to be the most time-consuming step when I annotated a bacterial genome with a pre-annotated reference file passed to the --references parameter. Is there anything I could do to shorten the annotation time?