Closed stachyris closed 3 months ago
Hi Vinay,
Although this seems surprising, there may be various reasons for it. Longer reads often have better quality as well, and shorter reads are more likely to contain contamination, if there is any. Are the sizes of the assemblies any different? N50 is biased by assembly size, so NG50 would be a better metric to use. Finally, contiguity is not the only metric of assembly quality.
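To illustrate the N50 vs. NG50 point, here is a minimal sketch in Python. The function name `nx50` and the toy contig lengths are made up for illustration and are not taken from Flye or any other tool. N50 uses the total assembly length as the denominator, while NG50 uses an estimated genome size, so NG50 stays comparable between assemblies of different total size.

```python
def nx50(lengths, total=None):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of `total`.

    total=None      -> use the sum of the contig lengths (N50)
    total=<genome>  -> use an estimated genome size (NG50)
    """
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # the contigs cover less than half of `total`


# Toy contig lengths (hypothetical values, arbitrary units).
contigs = [50, 40, 30, 20, 10]

n50 = nx50(contigs)               # denominator: assembly size (150)
ng50 = nx50(contigs, total=200)   # denominator: assumed genome size (200)
print(n50, ng50)
```

In this toy case the assembly is smaller than the assumed genome size, so NG50 comes out lower than N50. For two assemblies of the same total size, as a comparison of N50 values assumes, the two metrics rank them the same way.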
Hi,
Thank you for the reply. Actually, no, the genome size is the same: 1.056 Gb (Meryl + GenomeScope estimated 1.15 Gb). Got it. I will do some more inspection once we finish polishing and the other steps in the pipeline.
Thank you. Best,
Hi there!
We recently assembled a bird genome from about 80 GB of raw PromethION data (read N50 of ~11 kb). With the full dataset (quality-trimmed), we got a pretty good assembly with an N50 of 13 Mb and ~1500 contigs. However, when the same dataset was assembled on a different machine with --asm-coverage 40 (due to a RAM bottleneck on that machine), Flye (v2.9.3) produced a better assembly with an N50 of 19 Mb and ~1200 contigs.
I looked through the literature but could not figure out why. I understand that --asm-coverage uses only part of the dataset for the initial steps and the complete dataset for the later steps, but this jump from 13 Mb to 19 Mb still seemed like a substantial improvement to us, so we wanted to ask whether this has been observed before and what the reason might be.
Thank you, Vinay
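
As a rough illustration of the coverage-capping idea discussed in this thread, the Python sketch below selects reads longest-first until a target coverage is reached. It assumes the subset is chosen by read length, as Flye's documentation describes for the initial disjointig assembly. The function names and toy numbers are hypothetical, and this is not Flye's actual implementation.

```python
def select_longest_reads(read_lengths, genome_size, target_coverage):
    """Pick reads longest-first until their total length reaches
    genome_size * target_coverage (a simplified, assumed model)."""
    target_bases = genome_size * target_coverage
    selected, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= target_bases:
            break
        selected.append(length)
        total += length
    return selected


def read_n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    total, running = sum(lengths), 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy data (hypothetical): a few long reads and many short ones.
reads = [10, 10, 5, 5, 5, 5] + [1] * 20
subset = select_longest_reads(reads, genome_size=10, target_coverage=3)

print(read_n50(reads), read_n50(subset))
```

In this toy example the selected subset has a higher read N50 than the full dataset, which fits the suggestion above that the initial assembly steps may benefit from working with longer reads only. Whether that explains the observed jump in contig N50 would still need to be checked on the real data.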