I have ~21X PacBio CCS reads and have produced multiple assemblies using hifiasm, flye, and wtdbg2. The number of contigs in each assembly ranges from 4,965 (hifiasm) to 20,315 (flye), but wtdbg2 produces the best overall assembly (5,004 contigs; N50: 2,788,165 bp; largest contig: 14,599,089 bp). However, I would like to improve this assembly and would appreciate advice on parameters.
The species is a hammerhead shark with genome size ~2.7 Gbp; sharks have very repetitive genomes. I have used wtdbg2 presets 1, 3, and 4, and preset 4 produced the assembly with the fewest contigs. Adding -L 5000 to -x ccs marginally reduced the number of contigs. My code is below.
Please advise on which parameters to tweak and whether I should polish between the steps above.
Thanks!