rrwick / Unicycler

hybrid assembly pipeline for bacterial genomes
GNU General Public License v3.0

Correct Existing Canu Assembly #125

Open Adamtaranto opened 6 years ago

Adamtaranto commented 6 years ago

Hi Ryan,

I'm trying to finish some mitochondrial genomes that have initially been assembled from PacBio reads with Canu. The Canu assemblies give me a single contig, but almost always fail to circularise correctly, leaving large duplicated flanks.

I've tried re-assembling the reads that map to the Canu contig using Unicycler, but for some reason I get worse results.

Is it possible to have Unicycler skip the raw read assembly step but still correct the duplicated flanks before polishing with racon? I see that the "--existing_long_read_assembly" option skips the racon polishing step.

Alternatively, if I manually clip the duplicated region out of the Canu assembly, can I just use Unicycler to do the rotate/polish steps?
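For illustration, the manual clip-and-rotate idea could be sketched roughly as below. This is not how Unicycler or Canu handle circularisation; it is a minimal sketch that assumes the duplicated flank can be found by locating the contig's first `seed_len` bases again near its end and then comparing the two copies position by position (substitutions tolerated, indels not). The function names, `seed_len`, and `min_identity` are all hypothetical. Uncorrected long-read flanks often contain indels, so in practice an aligner-based check would likely be more robust.

```python
def find_terminal_overlap(seq, seed_len=20, min_identity=0.9):
    """Return the length of a suffix that repeats the start of seq, or 0.

    Assumption: the overlap is located via an exact match of the first
    seed_len bases, then scored by ungapped identity (no indel handling).
    """
    s = seq.upper()
    seed = s[:seed_len]
    # Look for the last copy of the seed, skipping its trivial hit at position 0.
    pos = s.rfind(seed, 1)
    if pos == -1:
        return 0
    overlap = len(s) - pos
    # An "overlap" longer than half the contig is more likely an internal repeat.
    if overlap > len(s) // 2:
        return 0
    matches = sum(a == b for a, b in zip(s[:overlap], s[pos:]))
    return overlap if matches / overlap >= min_identity else 0


def trim_terminal_overlap(seq, seed_len=20, min_identity=0.9):
    """Drop the duplicated suffix (the start copy is kept; either end would do)."""
    overlap = find_terminal_overlap(seq, seed_len, min_identity)
    return seq[:len(seq) - overlap] if overlap else seq


def rotate_to(seq, start):
    """Rotate a circular sequence so that position `start` becomes position 0."""
    start %= len(seq)
    return seq[start:] + seq[:start]
```

The trimmed contig could then be rotated to a chosen start position with `rotate_to` before polishing. It is worth checking the reported overlap length against read coverage before trusting the clip.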

Thanks!

sjackman commented 6 years ago

I'm also assembling mitochondrial genomes!

You may want to try the Falcon assembler, which also yields a blunt 0-overlap circular genome.

I haven't tried it, but there's also Circlator for circularizing. https://github.com/sanger-pathogens/circlator

Try out unicycler_polish for polishing: https://github.com/rrwick/Unicycler/blob/master/docs/unicycler-polish.md