The default parameters work fine. The indels should be corrected by polishing the final assembly with Arrow (or Quiver, depending on how old your RS II data is).
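For illustration, a polishing pass could look roughly like the sketch below. It is not taken from this thread: it assumes the original RS II bax.h5 files are still available (Arrow works from aligned BAM, not from the subreads FASTQ), that bax2bam, pbalign, samtools, and GenomicConsensus (variantCaller) are installed, and that MOVIE stands in for the real movie name. The script only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only. MOVIE and the output names are placeholders (assumptions),
# not values confirmed anywhere in this thread.
ASM="canu1.9/h11.contigs.fasta"   # Canu writes <prefix>.contigs.fasta into the -d directory
MOVIE="MOVIE"                     # hypothetical movie prefix; RS II writes three bax.h5 files per movie
OUT="h11"                         # prefix for the intermediate and polished files
ALGO="arrow"                      # switch to "quiver" for chemistries older than P6-C4

# Print the four steps: convert bax.h5 to BAM, align subreads to the
# assembly, index the assembly, then call the polished consensus.
print_polish_cmds() {
  cat <<EOF
bax2bam -o $OUT $MOVIE.1.bax.h5 $MOVIE.2.bax.h5 $MOVIE.3.bax.h5
pbalign $OUT.subreads.bam $ASM $OUT.aligned.bam
samtools faidx $ASM
variantCaller --algorithm=$ALGO $OUT.aligned.bam -r $ASM -o $OUT.polished.fasta
EOF
}

print_polish_cmds
```

After polishing, re-running the annotation on the polished FASTA should show whether the split genes have merged back into single open reading frames. A second polishing round is sometimes worth trying if some remain split.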
I have RS II data and was wondering whether there are parameter recommendations for bacterial genomes. I see recommendations for Sequel V2 in the FAQ but not for RS II.
I ask because I ran Canu 1.9 with the default parameters on a 1.8 Mb genome with 956x coverage:
canu -p h11 -d canu1.9 genomeSize=1.8m -pacbio-raw ~/pacbio/h11/m170721_234925_42146_c101206462550000001823287110171766_s1_p0.*subreads.fastq
A single contig results, but after annotation many genes are split into multiple segments. For example, gene1_a, gene1_b, and gene1_c overlap in one region of the genome as three fragments of 500 bp each, rather than the single 1500 bp gene1 found in their relatives. I assume this is due to indels and similar errors introduced during assembly.
Attached is the log. Thanks!