deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Runtime questions #670

Closed gdolsten closed 3 years ago

gdolsten commented 3 years ago

In the "Loop Detection using..." paper, it says that HiCExplorer runs in ~3 minutes. However, when I call hicDetectLoops it takes ~2 hours to run. What is the source of this discrepancy? Am I using it incorrectly?

joachimwolff commented 3 years ago

Hi,

First of all, please keep in mind that the paper you mention is only a pre-print and has not been peer-reviewed. Second, the results in the paper reflect the implementation we had in HiCExplorer versions < 3.5. With version 3.5 we introduced many changes to the algorithm to increase its precision; however, we have not updated the pre-print so far.

Possible causes: First, the paper mentions the hardware. It was a computer with 2x XEON E5-2630 v4 @ 2.20GHz, i.e. we had 20 cores and 40 hardware threads available, and we computed the result using 16 * 12 = 192 threads in parallel. If you don't have this much parallel computing power and the required main memory, the execution time will be longer. Second, we used a maximum genomic distance of 2 million bases; if you have selected a different (larger) value, it will take longer. Third, the result was computed using interaction matrices in the cooler file format. If you use the native HiCExplorer format h5, execution will be significantly slower, because cooler has better support for parallel loading.
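To make the three points above concrete, here is a minimal sketch of how one might assemble a comparable call. It only builds and prints the command line; it does not run anything. The option names (`--matrix`, `--outFileName`, `--maxLoopDistance`, `--threads`, `--threadsPerChromosome`) are given as I understand them from the hicDetectLoops help text, so please check them against `hicDetectLoops --help` for your installed version. The file names and the helper `build_detect_loops_cmd` are hypothetical, and how the 16 * 12 split from the paper maps onto the two thread options is not stated in this thread.

```python
import os
import shlex


def build_detect_loops_cmd(matrix, out_file, max_loop_distance=2_000_000,
                           chromosome_workers=None, threads_per_chromosome=1):
    """Assemble a hicDetectLoops command line as a list of arguments.

    Hypothetical helper: option names are assumptions to verify
    against `hicDetectLoops --help`.
    """
    available = os.cpu_count() or 1
    if chromosome_workers is None:
        # Keep the total thread count within what this machine offers.
        chromosome_workers = max(1, available // threads_per_chromosome)
    return [
        "hicDetectLoops",
        "--matrix", matrix,                    # cooler input, as used in the paper
        "--outFileName", out_file,
        "--maxLoopDistance", str(max_loop_distance),  # 2 million bases
        "--threads", str(chromosome_workers),
        "--threadsPerChromosome", str(threads_per_chromosome),
    ]


# Placeholder file names; replace with your own matrix and output path.
cmd = build_detect_loops_cmd("matrix_10kb.cool", "loops.bedgraph",
                             chromosome_workers=4, threads_per_chromosome=2)
print(shlex.join(cmd))
```

If the printed command looks right, it can be pasted into a shell. Requesting many more threads than the machine provides is unlikely to reproduce the runtimes from the paper.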

Best,

Joachim

joachimwolff commented 3 years ago

I have just read this issue: https://github.com/deeptools/HiCExplorer/issues/676. The paper states that it uses 10 kb matrices, not 800 bp, in case your 2 h runtime question comes from using that matrix.
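As a rough back-of-the-envelope illustration of why the resolution matters, the sketch below counts how many bins fall within the 2 million base maximum distance at each bin size. It is plain arithmetic, not a model of the actual algorithm, and the assumption that the work grows with the number of matrix entries inside that distance band is mine; real runtimes need not scale linearly with it.

```python
def bins_within_distance(max_distance_bp, bin_size_bp):
    """Number of bins that fit into the maximum genomic distance."""
    return max_distance_bp // bin_size_bp


MAX_DISTANCE = 2_000_000  # maximum genomic distance mentioned above

coarse = bins_within_distance(MAX_DISTANCE, 10_000)  # 10 kb matrix
fine = bins_within_distance(MAX_DISTANCE, 800)       # 800 bp matrix

# Each row of the 800 bp matrix has this many times more entries
# inside the distance band than a row of the 10 kb matrix.
per_row_factor = fine / coarse

# The 800 bp matrix also has proportionally more rows per chromosome,
# so the number of entries inside the band grows by the square.
rows_factor = 10_000 / 800
total_factor = per_row_factor * rows_factor

print(coarse, fine, per_row_factor, total_factor)
```

Under this assumption an 800 bp matrix holds on the order of a hundred times more candidate entries than a 10 kb one, so a much longer runtime would not be surprising.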

joachimwolff commented 3 years ago

No user response. Closing.