tangerzhang / ALLHiC

ALLHiC: phasing and scaffolding polyploid genomes based on Hi-C data

How to speed up the "Optimize" steps #32

timmy304681 closed 4 years ago

timmy304681 commented 4 years ago

Hello, thanks for this great Hi-C pipeline. The "Optimize" step is taking far too long for me. My target genome is about 500 Mb with 11 chromosomes. For a first attempt I ran the partition with K=16; so far the job has used 13 days (331 hours) of real time and 384 hours of CPU time, and it is still running. I am using Intel E7-8870 2.40GHz processors with 80 CPU cores and 2 TB of memory, and I submit jobs through PBS.

#!/bin/bash
#PBS -l nodes=1:ppn=40

So I wonder whether ALLHiC is limited to a single core, or whether there is anything I can do to speed up the "Optimize" step. After searching the earlier issues, I learned that K=11 is the expected setting in my case, so I will rerun with that. If the information I have provided is not enough, just let me know.

Thank you
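One rough way to judge whether a job is effectively single-threaded is to divide its CPU time by its wall-clock time. The sketch below uses only the two figures reported in the question. The variable names are illustrative, and the result says nothing about how ALLHiC schedules work internally.

```shell
#!/bin/bash
# Figures reported in the question (hours).
cpu_hours=384
wall_hours=331

# Plain bash arithmetic is integer-only, so awk does the division.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
avg_cores=$(LC_ALL=C awk -v c="$cpu_hours" -v w="$wall_hours" \
    'BEGIN { printf "%.2f", c / w }')

echo "average cores in use: $avg_cores"
```

With these numbers the ratio is about 1.16. A value this close to 1 on a 40-core allocation would suggest that the step is mostly running on one core, whatever `ppn` requests.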

tangerzhang commented 4 years ago

Hi Yang-Tui, scaffolding a 500 Mb genome should be rather fast; the optimize step is expected to take 1-2 hours. I would suggest checking the data format, the command lines, and the log file for any reported errors.
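A quick first pass over the log file can be done with `grep`. This is only a sketch: the function name `scan_log`, the file name in the usage comment, and the keyword list are assumptions for illustration, not messages ALLHiC is known to print.

```shell
#!/bin/bash
# scan_log FILE: print lines that mention common failure keywords,
# each prefixed with its line number in FILE.
scan_log() {
    local log_file="$1"
    if [ ! -s "$log_file" ]; then
        echo "log file missing or empty: $log_file" >&2
        return 2
    fi
    # -n: show line numbers, -i: ignore case, -E: extended regex.
    # The keyword list is a generic guess; adjust it to the tool's real output.
    grep -n -i -E 'error|exception|fail|killed|segmentation fault|out of memory' "$log_file"
}

# Usage (hypothetical file name):
#   scan_log optimize.log
```

The function returns grep's exit status, so 0 means at least one suspicious line was found and 1 means none matched. An empty result does not prove that the run is healthy, only that none of these keywords appear.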