Closed farhan-phd closed 3 years ago
It's likely you're getting haplotypes separated in your assembly. I'd suggest using purge_dups as listed on the FAQ to remove the redundancy in the assembly. You can then use tools like KAT or Merqury to check if the assembly is indeed capturing both haplotypes and/or if purge_dups is working correctly.
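As a rough sanity check on the haplotype explanation, the two sizes quoted in the question (1.7 Gb assembled, 0.9 Gb estimated by GenomeScope) can be compared directly. The sketch below is only illustrative: the function name and the 1.5 to 2.2 window are hypothetical rules of thumb chosen here, not thresholds used by Canu, purge_dups, KAT, or Merqury, and it assumes the GenomeScope figure is a haploid size estimate.

```python
# Sizes quoted in the question, in gigabases.
assembled_gb = 1.7  # total assembly length reported by QUAST
expected_gb = 0.9   # GenomeScope estimate (assumed to be the haploid size)

ratio = assembled_gb / expected_gb
excess_gb = assembled_gb - expected_gb


def looks_like_separated_haplotypes(size_ratio, low=1.5, high=2.2):
    """Hypothetical rule of thumb: a ratio near 2 is consistent with
    both haplotypes being assembled separately. The bounds are
    illustrative assumptions, not values from any tool."""
    return low <= size_ratio <= high


print(f"assembly / expected = {ratio:.2f}")
print(f"excess sequence     = {excess_gb:.1f} Gb")
print("consistent with separated haplotypes:",
      looks_like_separated_haplotypes(ratio))
```

A ratio close to 2 fits the suggestion above, but it does not prove it; k-mer based checks with KAT or Merqury remain the way to confirm what the extra sequence actually is.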
Dear Canu developer,
I just finished my first draft genome assembly with Canu, but the assembly is almost double the size (1.7 Gb, as evaluated by QUAST) of what was expected from GenomeScope (0.9 Gb). Could you please have a look at my command and parameters and tell me what the major problem could be? Please see the details below:
Assembly command and parameters: canu-2.1/bin/canu -p Asta-latifasciata -d canu-01-assembly genomeSize=1g -pacbio-raw /00-raw-data_pacbio/ala-1b-Ge-Mus-M_PacBio.fastq correctedErrorRate=0.035 utgOvlErrorRate=0.065 trimReadsCoverage=2 trimReadsOverlap=500 > std.error 2> std.out &
The Canu report (corrected and trimmed reads) and the GenomeScope output are attached below:
Asta.report.txt GenomeScope_statistic.docx
Note: The assembled genome is from a cichlid fish with an extra (B) chromosome, whose sequences are mainly duplicated from autosomes. Is it possible that this extra chromosome caused the genome size discrepancy? Many thanks in advance, Best regards, Farhan