Open ireneortega opened 2 years ago
Hi @ireneortega
I am not 100% sure what you are referring to when asking about headings in blastresult{query}.fasta
vs blastresult{subject}.fasta
, since both should have the headers of the respective query-id.
When you run the example:
orthologr::dNdS(query_file = system.file('seqs/ortho_thal_cds.fasta', package = 'orthologr'),
subject_file = system.file('seqs/ortho_lyra_cds.fasta', package = 'orthologr'),
delete_corrupt_cds = TRUE, # coding sequences that cannot be divided by 3 (triplets) will be removed
ortho_detection = "RBH", # perform BLAST best reciprocal hit orthology inference
aa_aln_type = "pairwise", # perform pairwise global alignments of AA seqs
aa_aln_tool = "NW", # using Needleman-Wunsch
codon_aln_tool = "pal2nal", # perform codon alignments using the tool Pal2Nal
dnds_est.method = "Comeron", # use Comeron's method for dN/dS inference
comp_cores = 1 )
And then check the _blast_db
folder in the tempdir()
path, you will see that headers are included.
I hope this helps you to troubleshoot on your side.
Cheers, Hajk
Ok, but I am referring to the heading of the BLAST output files that are generated when running:
blast_rec(...,
save.output = file.path)
The heading only appear if I save the output of the function blast_rec to directory.
Could you construct a small example using the system file datasets system.file('seqs/ortho_lyra_cds.fasta', package = 'orthologr')
? This will make it easier to trouble shoot on my end.
Cheers, Hajk
I run:
genome_ortho <- blast_rec(query_file = "C:/Users/irene/Documents/R/win-library/4.1/orthologr/seqs/ortho_lyra_cds.fasta",
subject_file = "C:/Users/irene/Desktop/prueba/200.fasta",
seq_type = "cds",
format = "fasta",
blast_algorithm = "blastn",
delete_corrupt_cds = FALSE,
save.output = file.path(working_dir)
)
readr::write_csv(genome_ortho, "C:/Users/irene/Desktop/prueba/ortho.tab", col_names = TRUE)
And the output files are:
You can see that headers are only printed in the ortho.tab
file and not in the blastresult_{query}.fasta
and blastresult_{subject}.fasta
files.
Ok, now I understand your question:
I suppose they are the same as output from blast_rec. Right?
Yes, the column names returned by blast_rec()
correspond to the columns in blastresult_{query}.fasta.csv
and blastresult_{subject}.fasta.csv
.
My confusion stems from the fact that you asked about headers in the fasta
file e.g. blastresult_{query}.fasta
, but you were referring to the columns of the corresponding output *.csv
file blastresult_{query}.fasta.csv
(which you also show in the command line screenshot).
I hope this clarifies everything.
Cheers, Hajk
Yes, this is what I meant. Thanks. Maybe it would be a good idea if you add the name of columns in both output files, just in case other user encounters the same issue.
Since the BLAST output format -6 is defined without header, I prefer to leave it this way for the default option, but I could add a new argument that allows you to export the column names as well. In addition, I can also document the columns which are a standard format for all BLAST hit tables.
When using
blast_rec
, headings in blastresult{query}.fasta_ and blastresult{subject}.fasta_ are not included. I suppose they are the same as output from blast_rec. Right?