bio-raum / FooDMe2

A nextflow pipeline for the identification of species from mixed samples based on mitochondrial amplicons
https://bio-raum.github.io/FooDMe2/
GNU General Public License v3.0

[Improvement] Bump BLAST+ to latest version and use new feature of non-leaf taxid search to simplify workflow #41

Closed gregdenay closed 3 months ago

gregdenay commented 3 months ago

BLAST+ 2.15.0: October 31, 2023 New features

You can limit your search by a non-leaf taxID (e.g., all bacteria) without running a helper script.

This means we could simplify database filtering by using the node filter directly in the BLAST CLI.
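A rough sketch of what such a call could look like, assuming BLAST+ 2.15.0 or later and a database built with taxonomy information. The helper name `blastn_taxid_cmd`, the file and database names, and the taxid (40674, Mammalia) are placeholders for illustration, not values taken from the pipeline:

```python
import shlex


def blastn_taxid_cmd(query, db, taxids, out="hits.tsv"):
    """Build a blastn argv restricted to the given taxids and their descendants.

    Hypothetical helper: only assembles the command, it does not run BLAST.
    """
    if not taxids:
        raise ValueError("at least one taxid is required")
    return [
        "blastn",
        "-query", query,
        "-db", db,
        # Multiple IDs are comma-delimited, as stated in the blastn help text.
        "-taxids", ",".join(str(t) for t in taxids),
        "-outfmt", "6",
        "-out", out,
    ]


# Placeholder inputs; a non-leaf taxid is passed directly, with no helper script.
cmd = blastn_taxid_cmd("amplicons.fasta", "mito_db", [40674])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` or pasted into a process script once the placeholder names are replaced.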

Performance drop in v2.16?

gregdenay commented 3 months ago

Actually, skipping taxonomy simplification is a bad idea: the resulting JSON is used several times downstream, and keeping it bloated hurts performance. Also:

 -taxids <String>
   Restrict search of database to include only the specified taxonomy IDs and
   their descendants (multiple IDs delimited by ',')
    * Incompatible with:  gilist, seqidlist, taxidlist, negative_gilist,
   negative_seqidlist, negative_taxids, negative_taxidlist, remote, subject,
   subject_loc

which means no taxid-based blocklist...
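To make the restriction concrete, here is a minimal sketch of a pre-flight check that flags options that cannot be combined with `-taxids`. The option list is copied from the help text above; the function name `conflicts_with_taxids` and the example taxids are hypothetical and not part of BLAST+ or the pipeline:

```python
# Options the blastn help text lists as incompatible with -taxids.
INCOMPATIBLE_WITH_TAXIDS = {
    "gilist", "seqidlist", "taxidlist", "negative_gilist",
    "negative_seqidlist", "negative_taxids", "negative_taxidlist",
    "remote", "subject", "subject_loc",
}


def conflicts_with_taxids(argv):
    """Return the sorted options in argv that cannot be combined with -taxids."""
    # Treat every token starting with "-" as an option name.
    opts = {arg.lstrip("-") for arg in argv if arg.startswith("-")}
    if "taxids" not in opts:
        return []
    return sorted(opts & INCOMPATIBLE_WITH_TAXIDS)


# An allowlist by taxid plus a blocklist by taxid is rejected by blastn.
bad = ["blastn", "-db", "mito_db", "-taxids", "40674", "-negative_taxids", "9606"]
print(conflicts_with_taxids(bad))
```

Any non-empty result means the allowlist and blocklist would have to be merged into a single list before the call, which is what the current filtering step already does.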

The current system should stay in place. It is still possible to bump BLAST to 2.16.0: I measured no change in performance on the test set.