Closed by antonkulaga 6 years ago
Yes. Megahit is suitable for mammalian de novo assembly, though it generates only contigs.
@aquaskyline I can do scaffolding with Sealer or something similar. I am choosing a de novo assembler for the project now. In the benchmark done by your competitor, Abyss (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411771/), I noticed that SoapDeNovo2 eats a huge amount of RAM but is more accurate than the others. I am curious whether Megahit can be treated as "SoapDeNovo2 that eats less memory and assembles mammalian genomes with the same or better accuracy", or is that not the case yet?
Although Megahit consumes much less memory than SOAPdenovo2, you cannot safely use Megahit as a substitute for SOAPdenovo2. Both Megahit and SOAPdenovo2 are suitable for mammalian de novo assembly, but the results can be vastly different because of their disparate design rationales. SOAPdenovo2 tends to create more conservative, and thus shorter, contigs to maximize the performance of scaffolding (because short contigs create less contiguity in scaffolding). Megahit makes contigs its final output, and so creates much longer contigs than SOAPdenovo2. How the two tools really perform depends on the species and dataset, but more often than not SOAPdenovo2 generates a longer scaffold N50 than "Megahit + another scaffolder".
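For anyone comparing the two approaches on their own data, scaffold N50 is easy to compute from a list of scaffold lengths. Below is a minimal Python sketch, assuming the common definition of N50 (the length of the shortest sequence among the longest ones that together cover at least half of the total assembly length). The function name and the example lengths are made up for illustration and do not come from either tool.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to shortest,
    accumulate them, and report the length at which the running total
    first reaches half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input: no sequences, so no N50


# Hypothetical scaffold lengths (in bp), for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))  # prints 8
```

Running the same function over the scaffolds from SOAPdenovo2 and from "Megahit + another scaffolder" gives a like-for-like comparison, though N50 alone says nothing about misassemblies.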
By the way, SOAPdenovo2 was known to consume as much memory as Abyss v1, or less. You might want to try the option -a to fix a memory size. If you know the peak memory consumption when -a is not used, the best value for -a would be peak * 0.66. This option helps you further decrease the memory needed. Note that there is some risk that peak * 0.66 is not enough and SOAPdenovo2 runs into an Out Of Memory error.
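As a worked example of the peak * 0.66 rule, here is a small Python sketch. It assumes the peak was measured in GB and that -a takes a whole number of GB (my understanding of the SOAPdenovo2 usage text, so please check it against your version). The helper name is hypothetical, and rounding up is my own choice to lean toward the safe side.

```python
import math


def suggested_a_value(peak_gb, fraction=0.66):
    """Suggest a value for SOAPdenovo2's -a option from a measured peak.

    peak_gb  -- peak memory (GB) observed in a run without -a
    fraction -- the 0.66 factor suggested in this thread

    Assumption: -a expects an integer number of GB, so the result is
    rounded up rather than down to reduce the Out Of Memory risk.
    """
    if peak_gb <= 0:
        raise ValueError("peak_gb must be positive")
    return math.ceil(peak_gb * fraction)


# Hypothetical peak of 128 GB: 128 * 0.66 = 84.48, rounded up to 85.
print(suggested_a_value(128))  # prints 85
```

If the run still fails with an Out Of Memory error at this value, raise it back toward the measured peak.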
Could you clarify whether Megahit is also usable for mammalian de novo assembly?