chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

Inconsistency between Hi-C reads and HiFi assembly #160

Open bilibilij opened 2 years ago

bilibilij commented 2 years ago

Hi, we have assembled a plant genome (~400 Mb) using the typical Hi-C mode command in hifiasm, and then scaffolded the hap1 and hap2 contigs separately using 3D-DNA. Based on our prior knowledge, the region marked by the blue line in the first picture is a misassembly, and the Hi-C reads seem to support placing this region at the end of the chromosome, as in the second picture. Is it a misassembly, or a region to which Hi-C reads cannot be mapped uniquely? [image: edd88c5c7d434d28f1a30bbb8aca9e1] [image: 0862f3a8e2c7567808e7dde5017bf46]

chhylp123 commented 2 years ago

If you scaffold hap1 and hap2 together, what does the result look like?

bilibilij commented 2 years ago

Thank you for your reply, I will give it a try!

bilibilij commented 2 years ago

Hi @chhylp123, I have scaffolded hap1 and hap2 together using 3D-DNA. The first picture is the initial plot of 0.hic and 0.assembly. [image: 336320fa3e476751119c5b820e4b56a] The second is the plot after we moved the region that we believe (based on our prior knowledge) is misassembled to the right position. [image: 9c1100dec46dd16bed1f56bcb35a086] Which picture is correct?

chhylp123 commented 2 years ago

It's weird. Is part of your sample polyploid?

bilibilij commented 2 years ago

Our sample is a diploid from the genus Populus. Our previous study shows that the genome has about 40% repetitive sequence and 1.8% heterozygosity. About 50% of the chromosomes show this kind of ambiguity. Do you recommend aligning the HiFi reads to this region using minimap to see which placement is right?

chhylp123 commented 2 years ago

It would be good to check for misassemblies using HiFi read alignment. From the Hi-C heatmap alone, it is hard to say.
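One way to act on this suggestion is to look for windows where HiFi read depth drops sharply, since a misjoin often shows up as a coverage gap at the junction. The sketch below is not from the thread: it assumes the alignments on one contig have already been reduced to 0-based, half-open `(start, end)` intervals, and the function names, the 10 kb window, and the 25% of median threshold are all illustrative choices rather than recommended settings.

```python
from statistics import median


def window_depth(alignments, contig_len, window=10_000):
    """Mean read depth per fixed-size window.

    `alignments` is an iterable of 0-based, half-open (start, end) intervals
    on a single contig (a hypothetical, pre-parsed input).
    """
    n_win = (contig_len + window - 1) // window
    covered = [0] * n_win  # aligned bases falling inside each window
    for start, end in alignments:
        start, end = max(0, start), min(contig_len, end)
        pos = start
        while pos < end:
            w = pos // window
            nxt = min(end, (w + 1) * window)  # stop at the window boundary
            covered[w] += nxt - pos
            pos = nxt
    # The last window may be shorter than `window`.
    sizes = [min(window, contig_len - i * window) for i in range(n_win)]
    return [c / s for c, s in zip(covered, sizes)]


def low_depth_windows(depths, min_fraction=0.25):
    """Indices of windows whose depth is below `min_fraction` of the median."""
    med = median(depths)
    return [i for i, d in enumerate(depths) if d < min_fraction * med]


# Toy example: four reads span each flank, one short read enters the middle.
reads = [(0, 20_000)] * 4 + [(30_000, 40_000)] * 4 + [(20_000, 22_000)]
depths = window_depth(reads, contig_len=40_000)
suspects = low_depth_windows(depths)
```

A flagged window is only a candidate: low depth can also come from sequence that is hard to sequence or from collapsed or filtered alignments, so it would still need to be inspected in a genome browser.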

bilibilij commented 2 years ago

Hi @chhylp123, we have checked HiFi read support and collinearity with Populus trichocarpa, which was assembled with PacBio CLR and is the gold standard in our genus. [image: a7e8228280642391ba56d4a4ec9598c] The blue arrows point to the regions where we checked HiFi read support. [image: fc71c1b9f5abe6b3675e1e5d4e664a7]

[image: c3d10c02f97f0bad710892290b244cc] [image: 9652d0e1040759c4a98db887b11f516]

[image: 994a627cfe803c59bdef38399a2f3be]

The HiFi reads do not show a coverage breakpoint, but some regions are covered by multi-mapped reads. Is it a misassembly?
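To put a number on "covered by multi-mapped reads", one option is to compute, per window, the fraction of primary alignments with very low mapping quality. The sketch below is not from the thread: it reads plain SAM text lines and treats MAPQ 0 as a proxy for ambiguous placement, which is a common convention but depends on the aligner. The function name, the 10 kb window, and the cutoff are illustrative assumptions.

```python
from collections import defaultdict


def multimap_fraction(sam_lines, window=10_000, mapq_cutoff=0):
    """Fraction of primary alignments with MAPQ <= cutoff, per (contig, window).

    Windows are assigned by the 1-based leftmost position (SAM POS field).
    """
    totals = defaultdict(int)
    ambiguous = defaultdict(int)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        contig, pos, mapq = fields[2], int(fields[3]), int(fields[4])
        key = (contig, (pos - 1) // window)
        totals[key] += 1
        if mapq <= mapq_cutoff:
            ambiguous[key] += 1
    return {key: ambiguous[key] / totals[key] for key in totals}


# Toy SAM records (only the first five fields matter here).
toy_sam = [
    "@SQ\tSN:ctg1\tLN:40000",
    "r1\t0\tctg1\t100\t60\t*\t*\t0\t0\t*\t*",
    "r2\t0\tctg1\t500\t0\t*\t*\t0\t0\t*\t*",
    "r3\t16\tctg1\t10001\t0\t*\t*\t0\t0\t*\t*",
    "r4\t256\tctg1\t200\t0\t*\t*\t0\t0\t*\t*",
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
fractions = multimap_fraction(toy_sam)
```

A window with a high fraction is more likely a repeat that reads cannot place uniquely than direct evidence of a misjoin, so this complements a depth check rather than replacing it.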