deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Question about --range argument for hicCorrelate #348

Closed kalavattam closed 5 years ago

kalavattam commented 5 years ago

Hi all,

Can you provide a little more background info for the hicCorrelate optional parameter --range?

So, for example, let's say a user wants to plot the correlation between two technical replicates from some human samples. Each rep is from a 4-cutter and was sequenced on an Illumina HiSeq 2500 lane, maybe 75-bp PE reads, and has been aligned and processed but not yet normalized via ICE, KR, etc. Each rep comprises ~175M valid Hi-C contacts and has an approximate max resolution of ~25-40 kb (i.e., binning such that all bins have Hi-C contacts). What do you advise setting --range to (if you advise setting it at all)?

Thanks, Kris

gtrichard commented 5 years ago

--range might be useful after comparing the hicPlotDistVsCounts curves from multiple samples, to determine a suitable range of genomic distances over which to compare the contact distributions.
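To make the idea of a distance-vs-counts curve concrete, here is a minimal pure-Python sketch. It assumes the curve is the mean contact count at each genomic distance (each diagonal of the binned matrix). The function name `mean_counts_by_distance` and the toy matrix are hypothetical, and this is not HiCExplorer's actual implementation.

```python
def mean_counts_by_distance(matrix, bin_size):
    """Return {distance_in_bp: mean count} for each diagonal of a square matrix.

    `matrix` is a dense, symmetric list of lists of contact counts and
    `bin_size` is the bin width in bp (both are illustrative assumptions).
    """
    n = len(matrix)
    curve = {}
    for offset in range(n):
        # All bin pairs (i, i + offset) are separated by offset * bin_size bp.
        diagonal = [matrix[i][i + offset] for i in range(n - offset)]
        curve[offset * bin_size] = sum(diagonal) / len(diagonal)
    return curve


# Toy 4-bin matrix at a hypothetical 10 kb resolution.
toy = [
    [10, 5, 2, 1],
    [5, 10, 5, 2],
    [2, 5, 10, 5],
    [1, 2, 5, 10],
]

for distance, mean_count in mean_counts_by_distance(toy, 10_000).items():
    print(distance, mean_count)
```

Overlaying such curves from several samples shows the distances at which they agree or diverge, and those distances are candidates for --range.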

hicCorrelate on the whole range might yield not-so-informative correlations, like you would get from comparing genome-wide ChIP-seq bedgraphs instead of peaks, for example.

This is why it might be better to identify interesting ranges from hicPlotDistVsCounts and run hicCorrelate several times, once per selected range of interest (ranges where the samples look similar or dissimilar).
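As a rough illustration of what restricting a correlation to a distance range means, the sketch below keeps only bin pairs whose separation lies between a lower and an upper bound in bp, then computes a Pearson correlation on those counts. It is a conceptual sketch, not hicCorrelate's code: the helper names, the toy matrices, and the choice of inclusive bounds are all assumptions.

```python
from math import sqrt


def values_in_range(matrix, bin_size, low_bp, high_bp):
    """Collect upper-triangle counts whose bin separation is within [low_bp, high_bp]."""
    n = len(matrix)
    values = []
    for i in range(n):
        for j in range(i, n):
            if low_bp <= (j - i) * bin_size <= high_bp:
                values.append(matrix[i][j])
    return values


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlate_in_range(matrix_a, matrix_b, bin_size, low_bp, high_bp):
    """Pearson correlation of two matrices using only contacts in the given range."""
    a = values_in_range(matrix_a, bin_size, low_bp, high_bp)
    b = values_in_range(matrix_b, bin_size, low_bp, high_bp)
    return pearson(a, b)


# Two toy "replicates" at a hypothetical 10 kb resolution.
rep1 = [[10, 5, 2, 1], [5, 10, 5, 2], [2, 5, 10, 5], [1, 2, 5, 10]]
rep2 = [[12, 6, 2, 0], [6, 11, 4, 3], [2, 4, 9, 5], [0, 3, 5, 13]]

# Compare the whole matrix with a 10-20 kb window only.
print(correlate_in_range(rep1, rep2, 10_000, 0, 30_000))
print(correlate_in_range(rep1, rep2, 10_000, 10_000, 20_000))
```

Running the same comparison over several windows picked from the distance-vs-counts curves shows whether the replicates agree at short, intermediate, or long distances, rather than collapsing everything into a single genome-wide number.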

Generally speaking, hicPlotDistVsCounts gives a much better understanding of how similar Hi-C samples are. But its output can be tricky to interpret when many samples need to be compared at once; this is when hicCorrelate comes in handy.

kalavattam commented 5 years ago

Thanks so much, Richard! Your explanation helps to contextualize these QC analyses.

Also, HiCExplorer is great--keep up the great work.