Closed kysrpex closed 1 year ago
@bgruening This is the BLAST issue I commented on this morning.
I assume #146 is related.
Currently the wrapper specifies BLAST+ version 2.10.1 here https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_macros.xml
If we have reason to believe that version of BLAST+ can't cope with the latest NCBI DB, then updating ought to solve this - and touch wood ought not to be too complicated (assuming no changes to the command line etc). i.e. Issue #146.
Has anyone tried to reproduce this at the command line outside of Galaxy? I can probably do that locally with a recent copy of NCBI NT from August/September 2023...
Confirming with our local copy of NT on Linux, BLAST 2.14.1 (current latest on bioconda) worked fine with the above command giving 500 hits (default limit), but after downgrading to BLAST 2.10.1 it crashes:
Error: NCBI C++ Exception:
T0 "/opt/conda/conda-bld/blast_1607337341665/work/blast/c++/src/serial/objistrasnb.cpp", line 499: Error: (CSerialException::eOverflow) byte 83: overflow error ( at [].[].gi)
T0 "/opt/conda/conda-bld/blast_1607337341665/work/blast/c++/src/serial/member.cpp", line 768: Error: (CSerialException::eOverflow) ncbi::CMemberInfoFunctions::ReadWithSetFlagMember() - error while reading seqid ( at Blast-def-line-set.[].[].seqid.[].[].gi)
Either version takes nearly an hour with 8 cores and 100GB allocated on our cluster:
time blastn -db $BLASTDB/nt -query query.fasta -task 'blastn' -evalue '0.001' -outfmt '6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen salltitles' -out query_shared.tsv -num_threads 8
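Since a full run takes close to an hour, it may help to check which BLAST+ version is on the PATH before launching one. Below is a rough Python sketch of such a check, not part of the wrapper. The two reference versions are only the ones tested in this thread (2.10.1 crashed, 2.14.1 worked). Treating anything older than 2.10.1 as affected and anything newer than 2.14.1 as fine is an assumption, and versions in between are reported as untested. The helper names (`parse_blast_version`, `classify`, `installed_blastn_version`) are made up for illustration.

```python
import re
import subprocess

# The only two data points from this thread; everything else is a guess.
KNOWN_BAD = (2, 10, 1)   # crashed with the CSerialException::eOverflow error
KNOWN_GOOD = (2, 14, 1)  # completed and returned 500 hits


def parse_blast_version(text):
    """Extract (major, minor, patch) from `blastn -version` style output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def classify(version):
    """Compare a version tuple against the two versions tested in this thread."""
    if version <= KNOWN_BAD:
        return "likely affected"   # assumption: older releases fail the same way
    if version >= KNOWN_GOOD:
        return "likely fine"       # assumption: newer releases keep working
    return "untested"              # nobody in this thread tried these


def installed_blastn_version():
    """Ask the blastn found on PATH for its version (requires BLAST+ installed)."""
    result = subprocess.run(
        ["blastn", "-version"], capture_output=True, text=True, check=True
    )
    return parse_blast_version(result.stdout)
```

With BLAST+ installed, `classify(installed_blastn_version())` gives a quick answer before committing cluster time to the full search.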
This is a strong reason to push a BLAST update for the wrappers.
Updated wrappers released via #157, this should be resolved now - closing issue.
Thanks!
The latest version of NCBI BLAST+ blastn available in this repository seems to be incompatible with the NCBI NT database from September 1, 2023. Below you may find the outputs of a job I launched myself on UseGalaxy.eu to reproduce the issue.
Command Line
Tool Standard Error
Tool Exit Code
The bug can be reproduced on UseGalaxy.eu using the following input [1],
and choosing "blastn" as "Type of BLAST".
You may import NCBI-BLAST-blastn-overflow-error-with-NCBI-NT-2023-09-01-Nucleotide-BLAST-database.rocrate.zip to save yourself the hassle of setting up the job.
According to a Stack Overflow post mentioning the same issue [1], the solution may be to update NCBI BLAST+ blastn.
[1] - https://stackoverflow.com/questions/70370949/local-blast-ncbi-c-exception