mSWEEP: High-resolution sweep metagenomics using fast probabilistic inference
MIT License
mSWEEP-v1.4.0 (10 March 2020)
#5 Closed
tmaklin closed 4 years ago
tmaklin commented 4 years ago
mSWEEP-v1.4.0 (10 March 2020)
Beware the clichés of software naming edition.
New features
Support parallel processing through the '-t' flag, which scales well in larger problems.
Add the option to match the input grouping indicators to the fasta file through the '--fasta' and '--groups-list' options.
Add the '--bootstrap-count' option which allows resampling fewer input alignments than the original sample contains.
Add the '--seed' option for specifying the initial random seed used in bootstrapping.
Support reading in files compressed with bz2 or lzma if compiled on a machine that supports them.
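To illustrate how '--bootstrap-count' and '--seed' could interact, here is a minimal sketch of seeded resampling with replacement. It is not taken from the mSWEEP source: the function name `resample_indices` and its parameters are hypothetical, and the real implementation may weight or group alignments differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of mSWEEP): draw `bootstrap_count` alignment
// indices with replacement from a sample containing `n_alignments` alignments.
// `bootstrap_count` may be smaller than `n_alignments`, and a fixed `seed`
// makes the draw reproducible on the same standard library.
std::vector<std::size_t> resample_indices(std::size_t n_alignments,
                                          std::size_t bootstrap_count,
                                          std::uint32_t seed) {
    std::vector<std::size_t> picks;
    if (n_alignments == 0) {
        return picks;  // nothing to resample from
    }
    picks.reserve(bootstrap_count);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> dist(0, n_alignments - 1);
    for (std::size_t i = 0; i < bootstrap_count; ++i) {
        picks.push_back(dist(gen));
    }
    return picks;
}
```

Calling `resample_indices(1000, 100, 42)` twice returns the same 100 indices, which is the property a user-specified seed is meant to provide.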
Better error checking
Validate that all input and output files exist and are accessible.
Add the option to validate the input grouping indicators when using Themisto pseudoalignments (resolves #4).
Catch errors in several places that escaped in earlier versions.
More informative error messages in the above-mentioned cases.
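The kind of up-front file check described above might look roughly like the following sketch. The function `require_readable` and the wording of its message are assumptions made for illustration, not mSWEEP's actual code or output.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of mSWEEP): fail early, with a message that
// names both the role of the file and its path, if the file cannot be opened
// for reading.
void require_readable(const std::string &path, const std::string &role) {
    std::ifstream in(path);
    if (!in.good()) {
        throw std::runtime_error("Cannot read " + role + " file: " + path);
    }
}
```

Running such a check on every input before any computation starts means that a mistyped path is reported immediately rather than after a long run.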
More efficient resource usage
Parallel processing in the RCG optimization using OpenMP.
Memory usage reduced by ~40% in large problems.
Single core performance increased by ~10% in large problems.
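As a rough picture of the OpenMP pattern involved, the sketch below parallelizes a simple reduction and still compiles when OpenMP is unavailable. It is not the RCG optimization itself: `sum_of_squares` and its `n_threads` parameter are hypothetical stand-ins for a loop inside the optimizer and for the value passed through '-t'.

```cpp
#include <cstddef>
#include <vector>

#ifdef _OPENMP
#include <omp.h>  // only available when the compiler enables OpenMP
#endif

// Hypothetical example (not part of mSWEEP): sum the squares of `x`, using
// up to `n_threads` threads when OpenMP is enabled and a single thread
// otherwise.
double sum_of_squares(const std::vector<double> &x, int n_threads) {
#ifdef _OPENMP
    omp_set_num_threads(n_threads > 0 ? n_threads : 1);
#else
    (void)n_threads;  // unused in a build without OpenMP
#endif
    double total = 0.0;
    const long n = static_cast<long>(x.size());
    // Without OpenMP the pragma is ignored and the loop runs serially.
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += x[i] * x[i];
    }
    return total;
}
```

Guarding the header and the runtime call with `_OPENMP` lets the same source build on compilers that lack OpenMP support.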
Better build pipeline
Download dependencies when running cmake.
Build without OpenMP if it is not supported.
More aggressive compiler optimization flags.
Support build and optimization with the Intel C compiler.
Internal changes
Improve code structure and legibility.
Use an external library (telescope) to read in pseudoalignments from either kallisto or Themisto.
Better internal storage for the pseudoalignments.
Change the (rareish) reset step in the RCG optimization to be computationally more expensive but consume significantly less memory.
Separate bootstrap and regular sample processing classes.