drostlab / metablastr

Seamless Integration of BLAST Sequence Searches in R
https://drostlab.github.io/metablastr/
GNU General Public License v2.0

blast_best_reciprocal_hit nucleotide-protein comparison task error #10

Closed wcfung14 closed 1 year ago

wcfung14 commented 2 years ago

I ran the following code:

blast_test_reciprocal <- blast_best_reciprocal_hit(
    query   = 'A.fasta', ##protein sequence
    subject = 'B.fasta', ##nucleotide sequence
    search_type = "protein_to_nucleotide",
    task = "tblastn",
    evalue = 0.000001,
    output.path = tempdir(),
    db.import  = FALSE)

which gives the following result:


Starting 'tblastn -task tblastn' with  query: A.fasta and subject: B.fasta using 1 core(s) ...

BLAST search finished! The BLAST output file was imported into the running R session. The BLAST output file has been stored at: C:/Users/A_B_tblastn_eval_1e-06.blast_tbl
Error: Please choose a nucleotide-protein comparison task that is supported by BLAST: task = 'blastx' or task = 'blastx-fast'.

How can I specify the task for the second (reverse) BLAST search ('blastx') when the first search uses tblastn?

HajkD commented 2 years ago

Hi @wcfung14,

Thank you very much for making me aware of this issue!

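When running blast_best_reciprocal_hit(), users can now specify one task per search direction by passing a two-element vector: task = c("tblastn", "blastx") for search_type = "protein_to_nucleotide", or task = c("blastx", "tblastn") for search_type = "nucleotide_to_protein". The first element is used for the forward search (query against subject) and the second for the reverse search (subject against query).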
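
The pairing of search_type and task described above can be summarized in a small lookup. This is only a sketch in base R: reciprocal_tasks() is a hypothetical helper, not a function exported by metablastr, and it covers only the two search types mentioned in this thread.

```r
# Hypothetical helper (NOT part of metablastr): return the task pair for a
# given search_type, forward search first, reverse search second.
reciprocal_tasks <- function(search_type) {
  switch(
    search_type,
    protein_to_nucleotide = c("tblastn", "blastx"),
    nucleotide_to_protein = c("blastx", "tblastn"),
    # Unnamed last argument is the default branch of switch()
    stop("Unsupported search_type for this sketch: ", search_type)
  )
}

tasks <- reciprocal_tasks("protein_to_nucleotide")
print(tasks)
```

The returned vector could then be passed as the task argument of blast_best_reciprocal_hit(), as in the example that follows.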

Example:

blast_test_reciprocal <- blast_best_reciprocal_hit(
    query   = system.file('seqs/qry_aa.fa', package = 'metablastr'), # protein sequence
    subject = system.file('seqs/sbj_nn_best_hit.fa', package = 'metablastr'), # nucleotide sequence
    search_type = "protein_to_nucleotide",
    task = c("tblastn", "blastx"),
    evalue = 0.000001,
    output.path = tempdir(),
    db.import  = FALSE)
# A tibble: 20 x 21
# Groups:   query_id [20]
   query_id    subject_id perc_identity num_ident_match… alig_length
   <chr>       <chr>              <dbl>            <int>       <int>
 1 311313|PAC… AT1G01210…          95.3              102         107
 2 311315|PAC… AT1G01190…          94.2              502         533
 3 311317|PAC… AT1G01170…          85.6               83          97
 4 333544|PAC… AT1G01110…          93.6              494         528
 5 333551|PAC… AT1G01040…          92.0             1812        1970
 6 333554|PAC… AT1G01010…          72.1              328         455
 7 470155|PAC… AT1G01220…          96.7             1021        1056
 8 470156|PAC… AT1G01200…          95.8              228         238
 9 470161|PAC… AT1G01140…          98.5              446         453
10 470171|PAC… AT1G01090…          95.2              413         434
11 470177|PAC… AT1G01060…          89.2              578         648
12 470180|PAC… AT1G01030…          95.5              343         359
13 470181|PAC… AT1G01020…          91.1              224         246
14 909860|PAC… AT1G01180…          92.6              287         310
15 909871|PAC… AT1G01080…          87.3              262         300
16 909874|PAC… AT1G01050…         100                213         213
17 918854|PAC… AT1G01160…          84.9              152         179
18 918855|PAC… AT1G01150…          72.6              207         285
19 918858|PAC… AT1G01120…          99.2              525         529
20 918864|PAC… AT1G01070…          95.1              348         366
# … with 16 more variables: mismatches <int>, gap_openings <int>,
#   n_gaps <int>, pos_match <int>, ppos <dbl>, q_start <int>,
#   q_end <int>, q_len <int>, qcov <dbl>, qcovhsp <dbl>,
#   s_start <int>, s_end <int>, s_len <int>, evalue <dbl>,
#   bit_score <dbl>, score_raw <dbl>

Please let me know if this works for you after re-installing metablastr.

Best wishes, Hajk