vaquerizaslab / tadtool

TADtool is an interactive tool for the identification of meaningful parameters in TAD-calling algorithms for Hi-C data.
MIT License

Index out of bounds for axis #17

Closed jennifm closed 6 years ago

jennifm commented 6 years ago

Hello,

I just installed TADtool and ran into an issue when trying to plot my data. The following is the message I get:

2018-06-12 12:20:56,665 INFO Loading regions...
2018-06-12 12:20:58,281 INFO Checking plotting region in matrix...
2018-06-12 12:20:58,281 INFO Loading matrix...
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/tadtool", line 4, in <module>
    __import__('pkg_resources').run_script('tadtool==0.77', 'tadtool')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 743, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1505, in run_script
    exec(script_code, namespace, namespace)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tadtool-0.77-py2.7.egg/EGG-INFO/scripts/tadtool", line 429, in <module>

  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tadtool-0.77-py2.7.egg/EGG-INFO/scripts/tadtool", line 41, in __init__

  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tadtool-0.77-py2.7.egg/EGG-INFO/scripts/tadtool", line 139, in plot

  File "build/bdist.macosx-10.6-intel/egg/tadtool/tad.py", line 272, in load_matrix
IndexError: index 620000 is out of bounds for axis 1 with size 619146

I found some other issues that were fixed by updating TADtool, but since I only just downloaded it, I doubt an outdated version is the problem here.

Thanks in advance for your help!

vaquerizaslab-old commented 6 years ago

Hi there,

could you give us a little bit more information? Specifically:

Hopefully that will help us narrow down your issue. Thanks!

jennifm commented 6 years ago

Hello,

Happy to provide some more information! Thank you so much for your help.

  1. The data I'm using was extracted from a .hic file using the juicer tools dump command (https://github.com/theaidenlab/juicer/wiki/Data-Extraction) and is in a dense matrix format.

  2. The regions file is of the entire chromosome, binned in 5kb chunks. The first ten lines are:

     chr1 0 5000
     chr1 5000 10000
     chr1 10000 15000
     chr1 15000 20000
     chr1 20000 25000
     chr1 25000 30000
     chr1 30000 35000
     chr1 35000 40000
     chr1 40000 45000
     chr1 45000 50000

     and the last ten lines are:

     chr1 59325000 59330000
     chr1 59330000 59335000
     chr1 59335000 59340000
     chr1 59340000 59345000
     chr1 59345000 59350000
     chr1 59350000 59355000
     chr1 59355000 59360000
     chr1 59360000 59365000
     chr1 59365000 59370000
     chr1 59370000 59373566

  3. The first ten lines of the matrix file are:

     10000 10000 1347.0302
     10000 15000 2621.1165
     15000 15000 2988.0515
     10000 20000 270.09296
     20000 20000 4765.757
     30000 30000 4961.506
     10000 45000 32.332844
     15000 45000 41.943203
     30000 45000 411.61227
     35000 45000 NaN

     and the last ten lines are:

     249190000 249235000 9.788078
     249195000 249235000 21.313084
     249200000 249235000 24.077797
     249205000 249235000 27.383415
     249210000 249235000 31.512163
     249215000 249235000 34.50513
     249220000 249235000 24.16354
     249225000 249235000 27.140318
     249230000 249235000 146.62665
     249235000 249235000 2454.9463

I get a similar error for other chromosomes as well.

Thanks in advance for your help!

vaquerizaslab-old commented 6 years ago

Hi again,

thank you for providing this information. It looks like your input matrix is not in a format TADtool understands. According to the documentation, a sparse matrix has to be a tab-delimited file in which each line has three columns: <row index> <column index> <value>. Your matrix does not use matrix indices in the first two columns, but the genomic positions of the regions. TADtool then reads a position such as 620000 as a bin index, which is larger than the 619146 bins available and would explain the IndexError. For example,

10000   10000   1347.0302
10000   15000   2621.1165
15000   15000   2988.0515 
...

should become

2   2   1347.0302
2   3   2621.1165
3   3   2988.0515 
...
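A conversion along these lines could be sketched in Python as follows. This is only a rough sketch under a few assumptions: that every bin has the same size (5000 bp here, taken from your regions file), that the first two columns hold bin start positions, and that indices are 0-based as in the example above. The function name `coords_to_indices` is made up for illustration and is not part of TADtool.

```python
def coords_to_indices(lines, bin_size=5000):
    """Convert 'start1 start2 value' lines into tab-delimited 'row col value' lines.

    Assumes fixed-size bins, so a bin's index is its start position
    divided by bin_size (0-based). Hypothetical helper, not TADtool API.
    """
    converted = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        start1, start2 = int(fields[0]), int(fields[1])
        if start1 % bin_size or start2 % bin_size:
            raise ValueError("position is not a multiple of bin_size: " + line)
        row, col = start1 // bin_size, start2 // bin_size
        # The value column is passed through unchanged (including NaN entries).
        converted.append("{}\t{}\t{}".format(row, col, fields[2]))
    return converted


example = [
    "10000\t10000\t1347.0302",
    "10000\t15000\t2621.1165",
    "15000\t15000\t2988.0515",
]
for converted_line in coords_to_indices(example):
    print(converted_line)
```

To convert a whole file, the lines can be read from the dump output and the result written to a new file that is then passed to TADtool. If the bins are not all the same size, a lookup from each region's start position to its line number in the regions file would be a safer way to obtain the indices.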

Please let us know if that fixes the problem!

vaquerizaslab-old commented 6 years ago

Closing due to inactivity. We assume our suggestion fixed the issue.