zengxiaofei / HapHiC

HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
https://www.nature.com/articles/s41477-024-01755-3
BSD 3-Clause "New" or "Revised" License

OverflowError: signed integer is greater than maximum #81

Open lstxmu opened 1 month ago

lstxmu commented 1 month ago

1cluster_run.log

Hi, I am assembling an insect genome of about 11 Gb and ran into this error. Can you help me fix it? Thank you.

zengxiaofei commented 1 month ago

A similar issue: https://github.com/zengxiaofei/HapHiC/issues/73

lstxmu commented 1 month ago

What can I do if I don't want to split the contigs above 1 Gb? Is there any other way to fix this?

zengxiaofei commented 1 month ago

If you rejoin the split contigs after scaffolding, don't worry: splitting will not affect the contiguity of your contigs. However, if you simply find splitting and rejoining contigs tedious, or don't know how to modify the Python code of HapHiC, then the answer is currently no. I'm busy these days and unable to make these modifications or run tests. Sorry!
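For readers who want to try the split step, here is a minimal sketch in Python. It is not part of HapHiC and is not the maintainer's procedure. The 1 Gb threshold, the `_partN` naming scheme, and the helper names `read_fasta` and `split_contig` are all assumptions made for illustration. It loads one contig at a time into memory, so a contig above 1 Gb needs a few gigabytes of RAM.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed threshold (1 Gb); the thread only says contigs "above 1 Gb" are affected.
MAX_LEN = 1_000_000_000


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_contig(name: str, seq: str, max_len: int = MAX_LEN) -> List[Tuple[str, str]]:
    """Split a contig into near-equal parts no longer than max_len.

    Contigs at or below max_len are returned unchanged. Parts are named
    <name>_part1, <name>_part2, ... (a hypothetical scheme) so the original
    order can be recovered when the parts are rejoined later.
    """
    if len(seq) <= max_len:
        return [(name, seq)]
    n_parts = -(-len(seq) // max_len)   # ceiling division
    part_len = -(-len(seq) // n_parts)  # near-equal part size, never above max_len
    return [
        (f"{name}_part{i + 1}", seq[start:start + part_len])
        for i, start in enumerate(range(0, len(seq), part_len))
    ]


def split_fasta(in_path: str, out_path: str, max_len: int = MAX_LEN) -> None:
    """Write a copy of the FASTA file with over-long contigs split."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for name, seq in read_fasta(fin):
            for part_name, part_seq in split_contig(name, seq, max_len):
                fout.write(f">{part_name}\n")
                # Wrap sequence lines at 80 characters.
                for i in range(0, len(part_seq), 80):
                    fout.write(part_seq[i:i + 80] + "\n")
```

A call such as `split_fasta("asm.fa", "asm.split.fa")` (file names hypothetical) would write the split assembly. Because the part names keep the original contig name and order, the parts can be matched up again when rejoining after scaffolding.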

lstxmu commented 1 month ago

OK. Another question: if I split the contigs above 1 Gb, does the Hi-C data mapping workflow need to be rerun?

zengxiaofei commented 1 month ago

Yes.

zengxiaofei commented 1 month ago

I have added the enhancement label. Although a contig longer than 1.07 Gb is not common, I will try to fix this problem when I have the time.
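To check whether an assembly contains contigs in this size range before running HapHiC, a short script can scan the FASTA file. This is a hedged sketch, not a HapHiC utility: the limit of 1.07 Gb is taken from the comment above and the exact cutoff inside HapHiC is not stated in this thread, and the function names `contig_lengths` and `over_limit` are hypothetical. It streams the file line by line, so it does not hold any sequence in memory.

```python
from typing import Dict, Iterable, List

# Assumed limit, taken from the "1.07 Gb" figure quoted in the thread.
LIMIT = 1_070_000_000


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig name: length} from FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def over_limit(lengths: Dict[str, int], limit: int = LIMIT) -> List[str]:
    """Return the names of contigs longer than limit, longest first."""
    long_names = [n for n, length in lengths.items() if length > limit]
    return sorted(long_names, key=lambda n: lengths[n], reverse=True)
```

With a real assembly, the lines would come from an open file handle, for example `contig_lengths(open("asm.fa"))` (file name hypothetical). An empty result from `over_limit` suggests the assembly has no contigs above the assumed limit.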

zengxiaofei commented 1 month ago

Additionally, considering that you are scaffolding a large genome, I would recommend upgrading HapHiC to the latest version (1.0.6), as this version has better compatibility with Juicebox for large genomes.