voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

Better results with lower sequencing depth!? #321

Open soungalo opened 2 years ago

soungalo commented 2 years ago

I ran MEGAHIT on a 50x WGS data set (plant genome), and then on the same data set subsampled to 20x. Surprisingly, I got slightly higher N50 and larger assembly size with the 20x data set. BUSCO scores are very similar and very high for both. The same happened for several other similar data sets.
Any ideas or explanations for this?
I ran MEGAHIT like this:
megahit -1 reads_1.fq.gz -2 reads_2.fq.gz -r reads_merged.fq.gz,reads_SE.fq.gz -t 30 --min-contig-len 1
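
For anyone wanting to reproduce the comparison, here is a minimal sketch of how N50 and total assembly size can be computed from a list of contig lengths. It is not MEGAHIT's own code; the function names (`n50`, `assembly_stats`) and the `min_len` parameter are made up for illustration, and the lengths in the usage lines are toy numbers, not values from these assemblies. Because the run above uses `--min-contig-len 1`, very short contigs are kept in the output, so it may be worth computing both metrics at the same length cutoff for the 50x and 20x assemblies before comparing them.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def assembly_stats(lengths, min_len=1):
    """Summarise an assembly, ignoring contigs shorter than min_len.

    min_len is a hypothetical post-hoc filter used here only to show
    how a length cutoff changes the reported metrics.
    """
    kept = [n for n in lengths if n >= min_len]
    return {"contigs": len(kept), "total_bp": sum(kept), "n50": n50(kept)}


# Toy contig lengths (illustrative only)
toy = [500, 300, 200] + [100] * 8
print(assembly_stats(toy))               # all contigs kept
print(assembly_stats(toy, min_len=200))  # short contigs dropped
```

In this toy example, dropping the short contigs lowers the total size but raises the N50, which is why the two metrics are only comparable between runs when the same cutoff is applied to both.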