marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Assembly autopolyploid genome #1440

Closed Xuelei-Dai closed 5 years ago

Xuelei-Dai commented 5 years ago

Hello, can canu assemble an autopolyploid (homologous polyploid) genome into contigs? If so, how should the parameters be set? Best wishes

skoren commented 5 years ago

Sure, but it depends on the polyploidy. Often the genomes are diverged enough that you get a larger genome (e.g. wheat is 3 genomes, but assemblies come out close to 3 times the size since most of the loci are diverged enough to be separated). This typically works with more than about 2-3% divergence. In this case you just need sufficient coverage for each of the genomes, so compute coverage and provide the genome size to canu as the full size (e.g. for wheat it would be 16 Gb). If it is less diverged but you have access to the ancestor species, you could potentially use the trio approach but with the ancestors as the "parents".
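As a rough illustration of the coverage arithmetic, here is a minimal Python sketch. It assumes coverage is computed against the full combined size of all subgenomes, and the 480 Gbp read total is a made-up figure. The helper names `coverage` and `genome_size_option` are hypothetical and not part of canu; only the `genomeSize` option name comes from canu itself.

```python
def coverage(total_read_bases: int, full_genome_size: int) -> float:
    """Average read depth over the full (all-subgenome) assembly size."""
    if full_genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_read_bases / full_genome_size


def genome_size_option(full_genome_size: int) -> str:
    """Format a size in bases as a genomeSize=<number><k|m|g> string."""
    for suffix, scale in (("g", 10**9), ("m", 10**6), ("k", 10**3)):
        if full_genome_size >= scale:
            return f"genomeSize={full_genome_size / scale:g}{suffix}"
    return f"genomeSize={full_genome_size}"


# Hypothetical inputs: 480 Gbp of reads, 16 Gbp full (wheat-sized) genome.
reads_bp = 480 * 10**9
size_bp = 16 * 10**9

print(coverage(reads_bp, size_bp))   # depth across the full 16 Gbp target
print(genome_size_option(size_bp))   # string to pass on the canu command line
```

Dividing by the full size rather than a single subgenome's size gives the depth each separated copy actually receives, which is the number that needs to be sufficient.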

The second question would be the heterozygosity between haplotypes. If that's also relatively high (over 0.5%), then use the heterozygous parameters from the FAQ or, even better, a trio. You could combine this with the above ancestor approach and get partitions both by haplotype and genome.

Xuelei-Dai commented 5 years ago

Thank you very much!