Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
This PR seeks to address two issues:

1. Reduce memory overhead during the creation of database.kraken
The change avoids the potential memory overhead incurred when the find command concatenates the FASTA files and feeds the result to the classifier via process substitution.
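As a rough illustration of the idea, the sketch below visits FASTA files one at a time instead of concatenating them into a single stream. It is not the code from this change-set: `process_fasta` is a hypothetical stand-in for the classifier, `process_library` is a made-up helper name, and the file extensions are assumed. Whether this lowers memory use in practice depends on how the real classifier reads its input.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the classifier: counts the sequence
# header lines (those starting with '>') in a single FASTA file.
process_fasta() {
    grep -c '^>' -- "$1"
}

# Hand each FASTA file under a library directory to the consumer
# individually, rather than concatenating all files first.
# The extensions matched here are an assumption for illustration.
process_library() {
    local library_dir="$1"
    find -L "$library_dir" -type f \
        \( -name '*.fna' -o -name '*.fa' -o -name '*.fasta' \) -print0 |
        while IFS= read -r -d '' fasta; do
            process_fasta "$fasta"
        done
}
```

Using `-print0` with `read -d ''` keeps file names containing spaces or newlines intact.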
2. Exit early if database.kraken fails to build
The change-set alters the build script so that the database is given its final name only after the build process has completed successfully. This avoids false positives when running bracken-build in a loop, as demonstrated below:
for i in {50,75,100}; do
    bracken-build -d . -t 12 -l "$i"
done
Currently, if bracken-build fails for read length 50, the loop still continues, because database.kraken exists whether or not the previous invocation succeeded. As a result, read lengths 75 and 100 are processed against a potentially truncated database.
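The "finalize the name only on success" behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual diff: `finalize_on_success` is a hypothetical helper, the `.tmp` suffix is made up, and the build command is assumed to write the database to standard output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command whose stdout is the database,
# writing to a temporary name first. The final name appears only if
# the command exits with status 0.
finalize_on_success() {
    local final="$1"
    shift
    local tmp="${final}.tmp"   # assumed temporary name
    local status=0

    "$@" > "$tmp" || status=$?
    if [ "$status" -eq 0 ]; then
        # Success: publish the output under its final name.
        mv -- "$tmp" "$final"
    else
        # Failure: discard the partial output and report the status.
        rm -f -- "$tmp"
        return "$status"
    fi
}
```

With this pattern, a later check for the existence of database.kraken only passes when an earlier build actually finished, so a failed build cannot leave behind a truncated file under the final name.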