drostlab / metablastr

Seamless Integration of BLAST Sequence Searches in R
https://drostlab.github.io/metablastr/
GNU General Public License v2.0

function blast_protein_to_protein with argument is.subject.db = TRUE doesn't work with nr database #6

Closed cparsania closed 1 year ago

cparsania commented 4 years ago

I have the nr BLAST database downloaded from NCBI, which contains the files shown in the attached snapshot. When I run the command below, it throws the error shown beneath it. What input should I give as a blast-able database?

blast_test <- blast_protein_to_protein(
        query         = "aa_query.fasta",
        subject       = "path/to/nr/db/nr",
        is.subject.db = TRUE,
        output.path   = tempdir(),
        db.import     = FALSE,
        cores         = 4)

Error in .Call2("new_input_filexp", filepath, PACKAGE = "XVector") : 
  cannot open file '/Users/chiragparsania/Documents/Database/nr_protein_db/nr'

[image: snapshot of the files in the nr database directory]

HajkD commented 4 years ago

Hi @cparsania

Many thanks for this input.

There will be functions named blast_protein_to_ncbi_nr(), blast_nucleotide_to_ncbi_nt(), etc., which I haven't fully implemented yet. So this NCBI nr BLAST feature will be available soon.

I intentionally wanted all blast_*_to_*() functions to be between a query fasta against a subject fasta file, to support low-level comparisons, and then to build larger search requests against large sequence databases on top of them.
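Until those dedicated functions exist, one possible stopgap is to call the BLAST+ blastp binary directly from R against the pre-formatted nr database. The sketch below is not part of metablastr: the helper name build_blastp_args() and the output file name are hypothetical, and it assumes BLAST+ is installed and on the PATH. The query and database paths are the ones from the report above.

```r
# Hypothetical helper (not a metablastr function): assemble the arguments
# for a direct blastp call against a pre-formatted database prefix.
build_blastp_args <- function(query, db_prefix, out_file, cores = 1L) {
  c("-query", query,
    "-db", db_prefix,                      # database prefix, e.g. ".../nr"
    "-outfmt", "6",                        # tabular output
    "-num_threads", as.character(cores),
    "-out", out_file)
}

out_file <- file.path(tempdir(), "aa_query_vs_nr.tsv")  # assumed output name
blastp_args <- build_blastp_args(
  query     = "aa_query.fasta",
  db_prefix = "path/to/nr/db/nr",
  out_file  = out_file,
  cores     = 4L
)

# Only run the search when blastp is on the PATH and the query file exists.
if (nzchar(Sys.which("blastp")) && file.exists("aa_query.fasta")) {
  status <- system2("blastp", args = shQuote(blastp_args))
  # Read the hit table only if blastp succeeded and wrote at least one hit.
  if (status == 0 && file.exists(out_file) && file.size(out_file) > 0) {
    hits <- read.delim(out_file, header = FALSE)
  }
}
```

The database is passed as a prefix (the path up to and including nr, without a file extension), which is how BLAST+ addresses a formatted database.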

I hope this helps and I will try to make sure that this feature will be available soon.

Best wishes, Hajk

cparsania commented 4 years ago

> I intentionally wanted all blast_*_to_*() functions to be between a query fasta against a subject fasta file

However, the documentation for the argument subject reads "path to subject file in fasta format or blast-able database", which I interpreted as either a fasta file or a BLAST-format database created with the makeblastdb command. Do you think it is worth mentioning that the subject argument takes fasta only?

--- Edit

Also, there is an argument is.subject.db, documented as: "logical specifying whether or not the subject file is a file in fasta format (is.subject.db = FALSE; default) or a blast-able database that was formatted with makeblastdb (is.subject.db = TRUE)". I read this as meaning that setting it to TRUE allows users to supply the nr db. Is that right?

HajkD commented 4 years ago

Hi @cparsania

Many thanks for coming back to me on this one.

> Also, there is an argument is.subject.db, documented as: "logical specifying whether or not the subject file is a file in fasta format (is.subject.db = FALSE; default) or a blast-able database that was formatted with makeblastdb (is.subject.db = TRUE)". I read this as meaning that setting it to TRUE allows users to supply the nr db. Is that right?

I now see the confusion and I am very sorry for not being more clear here.

What I meant by is.subject.db is that users who have already run makeblastdb to convert their own *.fasta file into a blast-able database can also use that fasta-derived database with these functions.
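A minimal sketch of that workflow, assuming BLAST+ (makeblastdb) is on the PATH. The file name subject_proteins.fasta and the helper build_makeblastdb_args() are hypothetical; the blast_protein_to_protein() arguments are the ones used earlier in this thread. It also assumes the formatted database files sit next to the original fasta file, which is where makeblastdb writes them when no -out option is given.

```r
# Hypothetical helper: arguments for formatting a fasta file with makeblastdb.
build_makeblastdb_args <- function(fasta, dbtype = c("prot", "nucl")) {
  dbtype <- match.arg(dbtype)   # rejects anything other than "prot"/"nucl"
  c("-in", fasta, "-dbtype", dbtype)
}

subject_fasta <- "subject_proteins.fasta"   # hypothetical file name
db_args <- build_makeblastdb_args(subject_fasta, "prot")

# Run only when the tools, the package, and both input files are available.
if (nzchar(Sys.which("makeblastdb")) &&
    file.exists(subject_fasta) &&
    file.exists("aa_query.fasta") &&
    requireNamespace("metablastr", quietly = TRUE)) {

  # Format the fasta file; the database files are written beside it.
  system2("makeblastdb", args = shQuote(db_args))

  # Pass the path of the original fasta file and flag it as already formatted.
  blast_test <- metablastr::blast_protein_to_protein(
    query         = "aa_query.fasta",
    subject       = subject_fasta,
    is.subject.db = TRUE,
    output.path   = tempdir(),
    db.import     = FALSE,
    cores         = 4
  )
}
```

The difference from the nr case is that the fasta file itself still exists at the subject path, whereas a downloaded nr database has no file named nr, only the formatted volumes.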

But now I see that this can be misunderstood, and I will make it clearer in the documentation. Also, once I have implemented the functions to blast against e.g. NCBI nr, I will add a reference to these new functions at this point in the documentation.

Thank you very much for your feedback; it allows me to improve the metablastr package.

Best wishes, Hajk

katievigil commented 1 year ago

Hi, I am getting a similar error:

sealionfeces <- readDNAStringSet(system.file("H:/ONRdolphinsealionpooled/sealionfecespooled/canu/medaka/consensus.fasta",
    package = "rBLAST"))
Error in .Call2("new_input_filexp", filepath, PACKAGE = "XVector") : 
  cannot open file ''

@HajkD

HajkD commented 1 year ago

You need to remove the system.file(...) call around your path. This isn't the same error as the one reported above.
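To illustrate: system.file() looks up files inside an installed package and returns an empty string when nothing matches, which is why the error message reads cannot open file ''. The sketch below uses base R for the check; the helper name resolve_fasta_path() and the placeholder path no/such/consensus.fasta are hypothetical, and the Biostrings call is attempted only if that package is installed and the file exists.

```r
# system.file() searches inside an installed package, so a path that is not
# part of the package yields the empty string "".
missing_path <- system.file("no/such/consensus.fasta", package = "base")

# Hypothetical helper: fail early with a readable message instead of
# passing an empty or wrong path on to the reader function.
resolve_fasta_path <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("FASTA file not found: '", path, "'", call. = FALSE)
  }
  normalizePath(path)
}

# Pass the path on disk directly, without wrapping it in system.file().
fasta_path <- "H:/ONRdolphinsealionpooled/sealionfecespooled/canu/medaka/consensus.fasta"
if (file.exists(fasta_path) && requireNamespace("Biostrings", quietly = TRUE)) {
  sealionfeces <- Biostrings::readDNAStringSet(resolve_fasta_path(fasta_path))
}
```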