tangerzhang / ALLHiC

ALLHiC: phasing and scaffolding polyploid genomes based on Hi-C data

Is the ALLHiC_corrector step necessary? #123

Open zhangwenda0518 opened 2 years ago

zhangwenda0518 commented 2 years ago

Excuse me, is the ALLHiC_corrector step necessary?

Dear Professor, thank you for your excellent tool. For my simple diploid genome (2n=42), I got better results with ALLHiC. The genome was assembled from HiFi data: about 500 contigs, with an N50 of about 1 Mb. I have since tested two ways of anchoring the contigs with Hi-C data: ALLHiC_pip.sh with ALLHiC_corrector error correction, and ALLHiC_pip.sh without ALLHiC_corrector (no error correction).

The results (k=21):

| Assembler | Run | Size | Number | N50 | N90 |
|---|---|---|---|---|---|
| flye | correct | 228M | 108 | 10940958 | 8609932 |
| flye | no-correct | 228M | 35 | 11089944 | 9326830 |
| hifiasm | correct | 218M | 115 | 10707401 | 8165142 |
| hifiasm | no-correct | 218M | 47 | 11446960 | 6934410 |
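For anyone who wants to recompute figures like the sequence count, N50, and N90 reported here, the following is a minimal Python sketch. It is not part of ALLHiC; the function names `fasta_lengths`, `nx`, and `summarize`, and the file name in the usage note, are hypothetical and chosen only for illustration.

```python
def fasta_lengths(path):
    """Return the length of every sequence in a FASTA file."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx(lengths, percent):
    """Return Nx: the length of the sequence at which the running total,
    taken from longest to shortest, first reaches `percent` of the assembly."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length


def summarize(lengths):
    """Collect the statistics quoted in this thread into one dictionary."""
    return {
        "size": sum(lengths),
        "number": len(lengths),
        "N50": nx(lengths, 50),
        "N90": nx(lengths, 90),
    }
```

Assuming the final scaffolds are in a file such as `groups.asm.fasta` (a placeholder name, substitute your own output), `summarize(fasta_lengths("groups.asm.fasta"))` gives the size, number, N50, and N90 for one run, so the correct and no-correct runs can be compared in the same way.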

In the final result, the run without error correction seems to give an assembly closer to chromosome level (k=21), and its heat map looks better than the one with error correction. The heat map with error correction shows more than 22 groups, even though k=21.

I have been stuck on this step for a long time. Could you please give me some advice on which result to choose, or on how to evaluate the final result? Also, how should we deal with the unanchored contigs?

Thanks for your reply. Wish you all the best.

correct: (two images)

no-correct: (two images)

wangyibin commented 2 years ago

Hi, ALLHiC_corrector was developed to correct chimeric contigs, which would otherwise lead to misassemblies in ALLHiC. I think a simple diploid genome assembled with hifiasm can skip ALLHiC_corrector. You can then import the no-correct result into Juicebox to curate any misassemblies.