Closed pcantalupo closed 6 years ago
Adding the `std` keyword fixed the integer value problem in the `desc` field (for BLAST 2.6.0), but then blastentropy.pl encountered an error. This was fixed by using the latest viral.1.1.genomic database, which only contains ACC.VER fasta ids.
When using BLAST+ >= 2.5.0, the Description values in the Report are integer values. When I use BLAST+ <= 2.3.0 (didn't try 2.4), the Descriptions are as I expect (e.g. Tobacco mosaic virus, complete genome).
This stems from a change in how `blastdbcmd` extracts fasta sequences from the blast database. When this fullgi, `gi|9626125|ref|NC_001367.1|`, is used with `blastdbcmd`, it extracts the following:
for BLAST >= 2.5.0
for BLAST <= 2.3.0
Then, when Reann.pm reads the fasta file, it uses the fasta identifier, which is an ACC.VER for BLAST+ >= 2.5, to update the %acc hash. With an ACC.VER identifier, this creates a new hash key whose value is the description. This new key (the ACC.VER) is never accessed again because, later in the code, the fullgi from the Report file (the Accession field) is used to look up what is expected to be the Description. However, the hash value stored under the fullgi is the number of times the fullgi was found in the Report file (due to this line).
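The lookup failure described above can be reproduced with a small model. This is only a sketch in Python, not the Reann.pm code: the dict `acc` stands in for the Perl `%acc` hash, the function names are hypothetical, and the two fasta headers are illustrative assumptions rather than captured `blastdbcmd` output.

```python
def parse_fasta_descriptions(fasta_text):
    """Map each fasta identifier (first token after '>') to the rest of its header."""
    descriptions = {}
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            identifier, _, description = line[1:].partition(" ")
            descriptions[identifier] = description
    return descriptions


def annotate(report_accessions, fasta_text):
    """Return the value looked up for each Report accession (hypothetical model)."""
    # Step 1: count how many times each accession occurs in the Report.
    acc = {}
    for accession in report_accessions:
        acc[accession] = acc.get(accession, 0) + 1
    # Step 2: store each description under the identifier found in the fasta file.
    for identifier, description in parse_fasta_descriptions(fasta_text).items():
        acc[identifier] = description
    # Step 3: look up the description using the accession from the Report.
    return [acc[accession] for accession in report_accessions]


fullgi = "gi|9626125|ref|NC_001367.1|"
# Assumed headers for illustration only.
fullgi_fasta = ">gi|9626125|ref|NC_001367.1| Tobacco mosaic virus, complete genome\nACGT\n"
accver_fasta = ">NC_001367.1 Tobacco mosaic virus, complete genome\nACGT\n"

print(annotate([fullgi, fullgi], fullgi_fasta))  # descriptions
print(annotate([fullgi, fullgi], accver_fasta))  # counts, because the keys never match
```

When the fasta identifier matches the fullgi, step 2 overwrites the count with the description. When it is an ACC.VER, step 2 adds an unrelated key and step 3 returns the count left over from step 1.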
To test, use sewageseqs16.fa with:
BLAST+ 2.3.0 (Descriptions are OK)
BLAST+ 2.6.0 (Descriptions are not OK; integer values)
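A quick way to check a Report for the symptom is to flag rows whose description is purely numeric. The sketch below assumes the Report is tab-delimited with a header row and a column named `desc`; both the layout and the function name are assumptions, so adjust them to the actual Report format.

```python
import csv
import io


def rows_with_integer_desc(report_text, desc_column="desc"):
    """Return Report rows whose description consists only of digits."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    return [
        row for row in reader
        if (row.get(desc_column) or "").strip().isdigit()
    ]


# Hypothetical two-row Report: one good description, one integer value.
sample_report = (
    "accession\tdesc\n"
    "gi|9626125|ref|NC_001367.1|\tTobacco mosaic virus, complete genome\n"
    "gi|9626125|ref|NC_001367.1|\t2\n"
)

bad_rows = rows_with_integer_desc(sample_report)
print(len(bad_rows), "row(s) with an integer description")
```

An empty result on the BLAST+ 2.3.0 Report and a non-empty one on the 2.6.0 Report would match the behavior reported here.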