zengxiaofei / HapHiC

HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
https://www.nature.com/articles/s41477-024-01755-3
BSD 3-Clause "New" or "Revised" License

Omni-C support? #10

Closed amvarani closed 9 months ago

amvarani commented 9 months ago

Is Omni-C data supported by HapHiC?

zengxiaofei commented 9 months ago

Hi @amvarani,

Although I have not yet tested Omni-C data for HapHiC scaffolding, I think it should be compatible. Using the default --RE GATC is fine, since this parameter has only a small impact on the anchoring rate.

Best regards, Xiaofei

amvarani commented 9 months ago

Thanks @zengxiaofei

I have been testing HapHiC on a hexaploid plant genome. K-mer analysis indicates a pattern resembling AAAAAB in an assembled genome of 9.8 Gb (N50 of 22 Mb; largest contig 146 Mb). This means that I expect about 1.6 Gb per haplotype.
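
The per-haplotype estimate is the assembly size divided by the ploidy. A minimal sketch of that arithmetic, assuming all six haplotypes are fully represented in the assembly and are of similar size (the function name is hypothetical, for illustration only):

```python
def expected_haplotype_size_gb(assembly_size_gb: float, ploidy: int) -> float:
    """Rough per-haplotype size, assuming every haplotype is assembled
    separately and all haplotypes are of similar length (an assumption)."""
    if ploidy <= 0:
        raise ValueError("ploidy must be a positive integer")
    return assembly_size_gb / ploidy


# Values quoted in this thread: a 9.8 Gb assembly of a hexaploid genome.
assembly_size_gb = 9.8
ploidy = 6

per_haplotype_gb = expected_haplotype_size_gb(assembly_size_gb, ploidy)
print(f"Expected size per haplotype: {per_haplotype_gb:.2f} Gb")
```

If some haplotypes are collapsed in the assembly, this simple division will misestimate the true haplotype size.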

Do you have any tips for running HapHiC?

zengxiaofei commented 9 months ago

The number of valid Hi-C reads remaining after MAPQ filtering decreases as the ploidy level increases, which makes scaffolding such genomes challenging. Currently, there are no specific tips for HapHiC. You could begin with the default parameters as a first step.