Closed rpwang closed 5 years ago
Hi @rpwang, this problem is quite difficult to fix, especially when assembling a highly heterozygous genome. The reason is that the contig-level assembly contains a large proportion of chimeric contigs and collapsed regions (see our paper in Nature Plants, Supplementary Figure 30). One possible solution is to perform a synteny analysis between the first partition and a reference genome. Chimeric scaffolds should show up in that comparison, and you can then manually correct the large group. If you have parental DNA sequences, the best way is to phase the PacBio reads using Canu trio-binning and assemble the two genomes separately. Assembling a heterozygous diploid genome is still a big challenge. We are still developing new phasing methods that combine mapping-based and assembly-based strategies to phase diploid genomes.
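As a rough illustration of the synteny check suggested above, here is a minimal Python sketch. It assumes the contigs have already been aligned to a reference genome and that the alignments are available as PAF lines (the standard 12-column layout, with the alignment block length in column 11). The `contig_to_group` mapping and both function names are hypothetical; how you export group membership from your ALLHiC run is left to you. A group whose contigs point to many different reference chromosomes is a candidate for manual splitting.

```python
from collections import defaultdict


def best_chromosome_per_contig(paf_lines):
    """For each contig, return the reference sequence with the most aligned bases."""
    aligned = defaultdict(lambda: defaultdict(int))
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated records
        contig, target = fields[0], fields[5]
        aligned[contig][target] += int(fields[10])  # column 11: alignment block length
    return {contig: max(hits, key=hits.get) for contig, hits in aligned.items()}


def chromosome_mix_per_group(contig_to_group, contig_to_chrom):
    """Count, per group, how many contigs map best to each reference chromosome."""
    mix = defaultdict(lambda: defaultdict(int))
    for contig, group in contig_to_group.items():
        chrom = contig_to_chrom.get(contig)
        if chrom is not None:  # contigs without alignments are ignored
            mix[group][chrom] += 1
    return {group: dict(counts) for group, counts in mix.items()}


# Toy data for illustration only (not real results).
paf = [
    "ctg1\t1000\t0\t900\t+\tchr1\t50000\t100\t1000\t850\t900\t60",
    "ctg1\t1000\t0\t200\t+\tchr2\t40000\t0\t200\t150\t200\t10",
    "ctg2\t800\t0\t700\t-\tchr2\t40000\t500\t1200\t650\t700\t60",
    "ctg3\t500\t0\t400\t+\tchr1\t50000\t2000\t2400\t390\t400\t60",
]
# Hypothetical contig-to-group assignment taken from the partition step.
contig_to_group = {"ctg1": "group1", "ctg2": "group1", "ctg3": "group2"}

best = best_chromosome_per_contig(paf)
mix = chromosome_mix_per_group(contig_to_group, best)
for group, counts in sorted(mix.items()):
    print(group, counts)
```

In this toy input, `group1` contains contigs that map best to two different reference chromosomes, which is the kind of pattern worth inspecting by hand. Counting contigs is the simplest choice; weighting by aligned length may be more informative when contig sizes vary a lot.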
Hi @tangerzhang ,
Thank you for your reply! I will incorporate your suggestions into my next attempt.
Hi @tangerzhang ,
I am working on a highly heterozygous diploid genome. We used long reads for the assembly, and the assembly has been corrected for misassemblies. We mapped the Hi-C reads, and the ALLHiC results showed that more than 40% of all contigs were placed in the first partition. Is there somewhere in the log or intermediate files where I can see why this is happening? Do you have any suggestions on how to fix it?