bioforensics / yeat

YEAT: Your Everyday Assembly Tool

Adding support for MEGAHIT #10

Closed danejo3 closed 2 years ago

danejo3 commented 2 years ago

MEGAHIT is an ultra-fast and memory-efficient (meta-)genome assembler.

Adding it would give users a more memory-efficient option for assembling their reads.

A review paper comparing several metagenome assemblers reports:

Overall, SPAdes, metaSPAdes, IDBA-UD and MEGAHIT displayed the best performances in assembling this metagenome of intermediate size and complexity, as they produced very high N50 values, a high proportion of long contigs and the widest assembly spans. While SPAdes was the best assembler overall, MEGAHIT was the most memory efficient, as it produced an assembly comparable to the best performing assemblers while using only a fraction of computational resources.

When choosing an assembler based on the total RAM available, MEGAHIT is the suggested choice if less than 500 GB of RAM is available.
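
The RAM-based rule above could be sketched as a small helper. This is only an illustration, not YEAT's actual code: the function name `choose_assembler`, the constant name, and the fallback to SPAdes above the threshold are assumptions made for the example. Only the 500 GB cutoff comes from this issue.

```python
# Hypothetical helper, not part of YEAT: pick an assembler from available RAM.

# Cutoff quoted in this issue; below it, MEGAHIT is the suggested assembler.
MEGAHIT_RAM_THRESHOLD_GB = 500


def choose_assembler(available_ram_gb: float) -> str:
    """Return "megahit" below the RAM threshold, otherwise "spades".

    Falling back to SPAdes at or above the threshold is an assumption,
    based on the review describing SPAdes as the best assembler overall.
    """
    if available_ram_gb < 0:
        raise ValueError("available_ram_gb must be non-negative")
    if available_ram_gb < MEGAHIT_RAM_THRESHOLD_GB:
        return "megahit"
    return "spades"
```

For example, `choose_assembler(64)` selects MEGAHIT, while a machine with 1024 GB of RAM would fall through to SPAdes under this sketch.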

Beyond memory efficiency, MEGAHIT is also several-fold faster than metaSPAdes and SPAdes.

When time is a constraint, MEGAHIT is a good alternative to SPAdes, with minimal loss in overall assembly quality.