deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

How to exclude trans interaction from ICE correction #739

Closed cgirardot closed 3 years ago

cgirardot commented 3 years ago

Hi @joachimwolff

this is more of a question/check: in hicCorrectMatrix correct, I'd like to make sure the trans reads are ignored. It seems I can do this by setting --transCutoff 1. Did I get this correctly?

PS: the background is that I am trying to normalize & correct with cis interactions only. I got the normalization already (https://github.com/deeptools/HiCExplorer/issues/715)

joachimwolff commented 3 years ago

Hi @cgirardot,

With the 3.7 release you can remove the trans contacts with hicAdjustMatrix.

The --transCutoff parameter computes the value at a given percentile and sets all higher values to this maximum value. The input is divided by 100, so in your case it would be 1 / 100 --> 0.01 https://github.com/deeptools/HiCExplorer/blob/master/hicexplorer/hicCorrectMatrix.py#L685 This leads to the 99.99 percentile (100 - 0.01):

high=0.01
max_inter = np.percentile(mat.data[dist_list == -1], (100 - high)) 

https://github.com/deeptools/HiCMatrix/blob/master/hicmatrix/HiCMatrix.py#L914

Your idea could work, but only if a percentile close to 0 gave you 0.0 as a result. To make a long story short: it does not work. We expect a threshold between 0 and 100 as input and divide this value by 100. In the next step, we compute the percentile by subtracting the result from 100. The percentile is therefore always 100 - (x / 100), where x is the value passed to --transCutoff, so it always lies between 99 and 100.
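The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as stated in this thread (divide by 100, then subtract from 100); the function name `trans_cutoff_to_percentile` is made up for illustration and is not part of HiCExplorer.

```python
def trans_cutoff_to_percentile(trans_cutoff: float) -> float:
    """Map a --transCutoff value (0..100) to the percentile described in the thread.

    Hypothetical helper for illustration only, not a HiCExplorer function.
    """
    if not 0 <= trans_cutoff <= 100:
        raise ValueError("expected a value between 0 and 100")
    high = trans_cutoff / 100  # e.g. 1 -> 0.01
    return 100 - high          # e.g. 0.01 -> 99.99


# Print the resulting percentile for a few inputs, rounded for readability.
for cutoff in (0, 1, 50, 99, 100):
    print(cutoff, round(trans_cutoff_to_percentile(cutoff), 4))
```

Whatever value between 0 and 100 is passed in, the result stays between 99 and 100, so no input can select a low percentile.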

Best,

Joachim

cgirardot commented 3 years ago

Hi @joachimwolff thank you for the answer. I am not so familiar with python code. Just to double-check I got your answer correctly, the setting --transCutoff 1 will result in ignoring most trans contacts during ICE? Thx again

joachimwolff commented 3 years ago

No, I wrote the exact opposite. --transCutoff 1 sets all values greater than the value of the 99.99 percentile to exactly that value. It will not remove or ignore most trans contacts. That is not possible, as I explained in my previous comment.
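To see why capping is different from removing, here is a minimal NumPy sketch on synthetic data. It assumes the cap is applied as "take the percentile, then clip everything above it", as described in this thread; the array `trans_counts` is invented example data, and this is not the HiCExplorer implementation.

```python
import numpy as np

# Synthetic "trans" counts: 10,000 positive values plus one extreme outlier.
trans_counts = np.arange(1, 10_001, dtype=float)
trans_counts[-1] = 1_000_000.0

high = 1 / 100                                       # --transCutoff 1 -> 0.01
max_inter = np.percentile(trans_counts, 100 - high)  # value at the 99.99 percentile

# Clip values above the cap; everything at or below it is left untouched.
capped = np.minimum(trans_counts, max_inter)

print("cap value:", max_inter)
print("largest value before/after:", trans_counts.max(), capped.max())
print("non-zero entries before/after:",
      np.count_nonzero(trans_counts), np.count_nonzero(capped))
```

Only the outlier is lowered; the number of non-zero entries is unchanged, so the contacts still take part in the correction.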

cgirardot commented 3 years ago

Ah. So I should set --transCutoff 99 to get all trans contacts set to a value close to 0 (the value will be that of the 1 percentile). Correct ?

joachimwolff commented 3 years ago

No, because a value of 99 is converted to percentile = 100 - (99 / 100) = 100 - 0.99 = 99.01, which is the 99.01 percentile and not the 1 percentile. As written in the previous posts, only percentiles between 99 and 100 can be set.

cgirardot commented 3 years ago

ah ok so I can't do it this way

cgirardot commented 3 years ago

ah but I did not realize that 3.7 was already out. I am gonna play with the new hicAdjustMatrix then!

cgirardot commented 3 years ago

sorry, me again: do I read the docs right that, to remove the trans counts, the command line would look like this? hicAdjustMatrix -a remove --interIntraHandling inter -m in.h5 -o out.h5

joachimwolff commented 3 years ago

You need to list the names of the chromosomes that should be either kept or removed in general. The interIntraHandling parameter is independent of this. For example, if you want to remove the inter-data and keep the first four chromosomes, the command is as follows:

hicAdjustMatrix -a keep --interIntraHandling inter -m in.h5 -o out.h5 --chromosomes chr1 chr2 chr3 chr4

To keep all chromosomes, you have to list them all.
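Since every chromosome has to be listed explicitly, it can be convenient to assemble the command from a list. The sketch below only builds and prints the command line using the flags quoted in this thread; the helper name `build_keep_intra_command` is hypothetical, and the file names are the placeholders from the example above.

```python
import shlex


def build_keep_intra_command(matrix_in, matrix_out, chromosomes):
    """Build a hicAdjustMatrix call that keeps the listed chromosomes
    and drops inter-chromosomal data (flags as quoted in this thread).

    Hypothetical helper for illustration; it does not run the tool.
    """
    if not chromosomes:
        raise ValueError("list at least one chromosome to keep")
    return [
        "hicAdjustMatrix",
        "-a", "keep",
        "--interIntraHandling", "inter",
        "-m", matrix_in,
        "-o", matrix_out,
        "--chromosomes", *chromosomes,
    ]


cmd = build_keep_intra_command("in.h5", "out.h5", ["chr1", "chr2", "chr3", "chr4"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the chromosome names match those stored in the matrix.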

cgirardot commented 3 years ago

ok great !