deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Using external TAD definitions for hicDifferentialTAD #722

Closed. dawe closed this issue 3 years ago

dawe commented 3 years ago

Welcome to the HiCExplorer GitHub repository! Before opening the issue please check that the following requirements are met :

$ hicInfo --version
hicInfo 3.6
$ python --version
Python 3.8.10

Retry your command, is it solved now? If not please continue with the following:

$  hicDifferentialTAD -tm 164_normalized_10000_KR.cool -cm 168_normalized_10000_KR.cool -td tmp -o test -t 1 -m intra-TAD 
ERROR:hicmatrix.HiCMatrix:Index error
Traceback (most recent call last):
  File "/home/cittaro.davide/miniforge3/envs/hic/lib/python3.8/site-packages/hicmatrix/HiCMatrix.py", line 269, in getRegionBinRange
    startbin = sorted(self.interval_trees[chrname][startpos:startpos + 1])[0].data
IndexError: list index out of range
ERROR:hicexplorer.hicDifferentialTAD:'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "/home/cittaro.davide/miniforge3/envs/hic/lib/python3.8/site-packages/hicexplorer/hicDifferentialTAD.py", line 193, in computeDifferentialTADs
    right_boundary_index_target = hic_matrix_target_inter_tad.getRegionBinRange(str(chromosom), row[2], row[2])[0]
TypeError: 'NoneType' object is not subscriptable

Hi all, I have been trying to use hicDifferentialTAD on some data of mine. First of all, it works fine when I feed it the domains.bed produced by hicFindTADs, so I guess it is not an issue with the data themselves. I'd like to test TADs produced by some collaborators (defined with Arrowhead). To do so, I take their file and produce a 6-column file to feed to hicDifferentialTAD, and I get the error pasted above. I tried to isolate the offending line and I found that, for example,

I tried to use pdb to debug this, but I'm having some issues with queues. Any idea how to handle this? Side question: is it possible to use any set of given intervals as input for hicDifferentialTAD (assuming the intervals make sense, of course)?

joachimwolff commented 3 years ago

Hi,

you get this error if you try to access a region that is not in the matrix. So your statement that the interval at line n+1 is a perfectly defined interval on the chromosome makes me wonder. Can you validate this for both matrices, especially since the regions come from a different algorithm? Was the identical interaction matrix used to compute them? I believe Arrowhead is part of the Juicer software, which uses the hic format rather than cool. Maybe the data was mapped to a different reference genome, leading to issues especially at the end of a chromosome?
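For anyone wanting to run this check, here is a minimal sketch (not part of HiCExplorer) that flags intervals which cannot exist in a matrix. The function name `find_unmappable_intervals` and the `chrom_sizes` dictionary are hypothetical; the sizes would have to be read from each matrix, for example through the `chromsizes` attribute of a `cooler.Cooler` object for .cool files. It also catches chromosome naming mismatches such as "1" versus "chr1".

```python
def find_unmappable_intervals(intervals, chrom_sizes):
    """Return intervals that cannot be located in a matrix.

    intervals   -- iterable of (chrom, start, end) tuples from the TAD file
    chrom_sizes -- dict mapping chromosome name to length in bp
                   (assumed to be taken from the matrix itself)
    """
    problems = []
    for chrom, start, end in intervals:
        if chrom not in chrom_sizes:
            # Typical cause: "1" in the TAD file but "chr1" in the matrix
            problems.append((chrom, start, end, "unknown chromosome"))
        elif start < 0 or start >= end or end > chrom_sizes[chrom]:
            # Typical cause: a different reference genome or assembly
            problems.append((chrom, start, end, "outside chromosome bounds"))
    return problems


# Illustrative values only; the chromosome length here is an assumption
example_sizes = {"1": 248956422}
example_tads = [
    ("1", 39310000, 39550000),
    ("chr1", 0, 10000),
    ("1", 248950000, 249000000),
]
for problem in find_unmappable_intervals(example_tads, example_sizes):
    print(problem)
```

Running it once per matrix shows whether the two matrices disagree on which intervals are valid.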

Best,

Joachim

dawe commented 3 years ago

Indeed those regions were found with Juicer on a .hic file. I converted the .hic to .cool, plus the offending lines aren’t at the end of the chromosome. Also, running only on the offending line does not raise the error.

joachimwolff commented 3 years ago

I see. Is that a public dataset? Or is there a chance you could provide me with a sample where this happens? I could investigate in detail what is going on.

dawe commented 3 years ago

I wish I could share these data with you but I don't own them. I guess I'll try to debug this and I'll come back when (and if) I can solve this issue. Just to be clear, hicDifferentialTAD only reads the given coordinates in both matrix files, and does not expect anything but coordinates, right? Is there a chance that columns 4, 5 and 6 are used?

dawe commented 3 years ago

I think I understand what's happening here. TAD definitions from Juicer/Arrowhead include some overlapping entries, such as:

1       39310000        39550000  
1       39320000        39450000  

Here the second interval is contained in the first one. So, when hicDifferentialTAD takes the coordinates of the second one to test the inter-TAD region on its right, the coordinates are out of frame.
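To spot such entries before running hicDifferentialTAD, a small script along these lines may help. It is only a sketch under my own assumptions: `find_nested_intervals` is a hypothetical helper, not a HiCExplorer function, and it expects the first three columns of the TAD file already parsed into (chrom, start, end) tuples.

```python
def find_nested_intervals(intervals):
    """Return (outer, inner) pairs where inner lies fully inside outer.

    intervals -- iterable of (chrom, start, end) tuples
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    nested = []
    for chrom, spans in by_chrom.items():
        # Sort by start ascending, then end descending, so that a
        # containing interval always precedes the ones nested in it.
        spans.sort(key=lambda span: (span[0], -span[1]))
        for i, (start, end) in enumerate(spans):
            for inner_start, inner_end in spans[i + 1:]:
                if inner_start >= end:
                    break  # later spans start even further right
                if inner_end <= end:
                    nested.append(((chrom, start, end),
                                   (chrom, inner_start, inner_end)))
    return nested


# The two entries quoted above, plus a non-overlapping neighbour
# (the third interval is made up for illustration)
tads = [
    ("1", 39310000, 39550000),
    ("1", 39320000, 39450000),
    ("1", 39550000, 39700000),
]
for outer, inner in find_nested_intervals(tads):
    print("nested:", inner, "inside", outer)
```

Whether to drop the inner or the outer interval is a separate decision that depends on the analysis; the script only reports the pairs.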

joachimwolff commented 3 years ago

@dawe Can you test if it is fixed in the current develop branch?

dawe commented 3 years ago

@joachimwolff Sorry, I was out of office. I'm back now and testing the fix; it seems to be working smoothly.