vaquerizaslab / tadtool

TADtool is an interactive tool for the identification of meaningful parameters in TAD-calling algorithms for Hi-C data.
MIT License

Unable to run example data on Ubuntu #3

Closed mthimma closed 8 years ago

mthimma commented 8 years ago

Hi, I would like to use your tool to plot TADs in our sample.

To do that, I installed everything on my Ubuntu 12.04 workstation.

When I run the example data you provide, it throws an error.

chr12_20-35Mb.matrix.txt
-rw-r--r-- 1 thimmamp kw-users 14424 Aug 18 17:11 chr12_20-35Mb_regions.bed
-rwxr--r-- 1 thimmamp kw-users 443 Aug 18 17:17 chr12_plot.py
drwxr-xr-x 2 thimmamp kw-users 4096 Aug 18 17:17 .
thimmamp@kw12556:~/RNAiHiCAnalysis/sampledataTADtool$ ./chr12_plot.py
Traceback (most recent call last):
  File "./chr12_plot.py", line 3, in <module>
    import tadtool.plot as tp
  File "/usr/local/lib/python2.7/dist-packages/tadtool/plot.py", line 8, in <module>
    import matplotlib.pyplot as plt
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 114, in <module>
    _backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/__init__.py", line 32, in pylab_setup
    globals(),locals(),[backend_name],0)
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_tkagg.py", line 13, in <module>
    import matplotlib.backends.tkagg as tkagg
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/tkagg.py", line 9, in <module>
    from matplotlib.backends import _tkagg
ImportError: cannot import name _tkagg

I tried updating the tk and tk-dev packages and reinstalling matplotlib as well, but the problem persists.

Can you suggest a workaround for this?

vaquerizaslab-old commented 8 years ago

This appears to be a known matplotlib issue on Ubuntu. Have you followed the different suggestions here?

http://stackoverflow.com/questions/32188180/from-matplotlib-backends-import-tkagg-importerror-cannot-import-name-tkagg

Otherwise I'm afraid I can't be of any more help, since this does not appear to be a TADtool problem.

mthimma commented 8 years ago

I moved the testing to a Mac, and the tk issue seems to be fixed there.

But there seems to be another issue, this time with numpy.

python chr12_plot.py
20% ( 20 of 100) |############################# | Elapsed Time: 0:00:00 ETA: 0:00:03
Traceback (most recent call last):
  File "chr12_plot.py", line 12, in <module>
    tad_plot = tp.TADtoolPlot(matrix, regions, norm='lin', max_dist=1000000, algorithm='insulation')
  File "/Library/Python/2.7/site-packages/tadtool/plot.py", line 536, in __init__
    tad_method=self.tad_algorithm, window_sizes=window_sizes)
  File "/Library/Python/2.7/site-packages/tadtool/tad.py", line 646, in data_array
    da.append(tad_method(hic_matrix, regions, window_size, **kwargs))
  File "/Library/Python/2.7/site-packages/tadtool/tad.py", line 604, in insulation_index
    ins_matrix = np.array(list(itertools.chain.from_iterable(ins_by_chromosome)))
ValueError: setting an array element with a sequence.

vaquerizaslab-old commented 8 years ago

Without seeing the complete code that you use to call TADtoolPlot, I cannot debug your issue. I suspect, however, that there is a mismatch between the type of parameters you are passing to the class and the type it expects.

Did you have a look at the code in the executable to see how it is intended to be used?

mthimma commented 8 years ago

Sorry...

Here is the code I used.

#!/usr/bin/python

import tadtool.tad as tad
import tadtool.plot as tp

# load regions data set
regions = tad.HicRegionFileReader().regions("chr12_20-35Mb_regions.bed")

# load matrix
matrix = tad.HicMatrixFileReader().matrix("chr12_20-35Mb.matrix.txt")

# prepare plot
tad_plot = tp.TADtoolPlot(matrix, regions, norm='lin', max_dist=1000000, algorithm='insulation')
fig, axes = tad_plot.plot('chr12:31000000-34000000')

# show plot
fig.show()

vaquerizaslab-old commented 8 years ago

Thanks - I just copy-pasted your code into my Python 2.7.10 console and it worked without any issues. Can you provide me with your Python and Numpy versions? I am running numpy version 1.10.1 on this machine.

mthimma commented 8 years ago

Here it is.

python
Python 2.7.5 (default, Mar 9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.6.2'

vaquerizaslab-old commented 8 years ago

Thanks. I am also on a Mac (albeit with a Homebrew Python installation). Try upgrading your Python to 2.7.10 or higher and your Numpy to the latest compatible version (ideally >=1.10.1). Also make sure you are running the latest TADtool version (0.61). If that fixes things, I will specify minimum version requirements in the installation file, so other users don't run into the same issue.
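A minimal sketch of how such a minimum-version check could look. The helper names `version_tuple` and `meets_minimum` are hypothetical and not part of TADtool; the 1.10.1 threshold is simply the Numpy version mentioned above as known to work. Versions are compared as integer tuples, because plain string comparison ranks '1.6.2' above '1.10.1'.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.10.1' into a tuple of ints, (1, 10, 1).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


# The Numpy version reported in this thread against the suggested minimum.
print(meets_minimum("1.6.2", "1.10.1"))   # False: 1.6.2 is older than 1.10.1
print(meets_minimum("1.10.1", "1.10.1"))  # True
print("1.6.2" >= "1.10.1")                # True: string comparison is misleading here
```

With Numpy installed, the same check would be `meets_minimum(numpy.__version__, "1.10.1")`.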

mthimma commented 8 years ago

I am about to download and install Python 2.7.12 for Mac. Will that cause any problems for TADtool?

vaquerizaslab-old commented 8 years ago

If anything, it should improve matters.

mthimma commented 8 years ago

Hi, Python is now upgraded to 2.7.12 and the relevant modules are installed.

When I try the example data, it still fails:

tadtool plot chr12_20-35Mb.matrix.txt chr12_20-35Mb_regions.bed chr12:31000000-33000000
Traceback (most recent call last):
  File "/usr/local/bin/tadtool", line 280, in <module>
    TADtool()
  File "/usr/local/bin/tadtool", line 40, in __init__
    getattr(self, args.command)()
  File "/usr/local/bin/tadtool", line 115, in plot
    import tadtool.tad as tad
  File "/Library/Python/2.7/site-packages/tadtool/tad.py", line 493, in <module>
    aggr_func=scipy.stats.nanmean, impute_missing=False, normalize=False,
AttributeError: 'module' object has no attribute 'stats'
kl-10415:sampledataTADtool thimmamp$ python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 26 2016, 12:10:39)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.__version__
'0.18.0'

vaquerizaslab-old commented 8 years ago

The line 493 mentioned in the error message:

aggr_func=scipy.stats.nanmean, impute_missing=False, normalize=False,

is not originally in the code. Either this is an old version - but I do not remember ever using scipy for this particular project and can't find it in the commit history - or the file was modified. You can see the original line here and you will see that it uses Numpy.

Please install the latest, unmodified version via pip

pip install --upgrade tadtool

I cannot support modified code.

vaquerizaslab-old commented 8 years ago

Due to inactivity, I am assuming this has fixed the issue. Let me know if there are further problems.

mthimma commented 8 years ago

Sorry for the delay in getting back to you... I upgraded tadtool and reran the example code.

python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 26 2016, 12:10:39)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tadtool.tad as tad
>>> import tadtool.plot as tp
>>> regions = tad.HicRegionFileReader().regions("chr12_20-35Mb_regions.bed")
>>> matrix = tad.HicMatrixFileReader().matrix("chr12_20-35Mb.matrix.txt")
>>> tad_plot = tp.TADtoolPlot(matrix, regions, norm='lin', max_dist=1000000, algorithm='insulation')
100% (100 of 100) |####################################################################################################################################################| Elapsed Time: 0:00:04 Time: 0:00:04
>>> fig, axes = tad_plot.plot('chr12:31000000-34000000')
>>> fig.show()

But I am unable to see any output after running the code. All I can see is an empty screen with the Python IDE logo!

[Screenshot: 2016-09-25, 10:43:46 am]

Would you please help me sort this out?

vaquerizaslab-old commented 8 years ago

Please try running the same code in a Terminal and not within an IDE. On OS X, open the "Terminal" application, type 'python', hit Enter, and paste in the above code. Unless your system's Python installation or configuration is broken, that should open the TADtool window.

mthimma commented 8 years ago

Thanks for the reply.

I did as shown below.

cat chr12_plot.py
#!/usr/bin/python

import tadtool.tad as tad
import tadtool.plot as tp

# load regions data set
regions = tad.HicRegionFileReader().regions("chr12_20-35Mb_regions.bed")

# load matrix
matrix = tad.HicMatrixFileReader().matrix("chr12_20-35Mb.matrix.txt")

# prepare plot
tad_plot = tp.TADtoolPlot(matrix, regions, norm='lin', max_dist=1000000, algorithm='insulation')
fig, axes = tad_plot.plot('chr12:31000000-34000000')

# show plot
fig.show()
kl-10415:sampledataTADtool thimmamp$ python chr12_plot.py
100% (100 of 100) |####################################################################################################################################################| Elapsed Time: 0:00:04 Time: 0:00:04

But I do not see the TADtool window open.

vaquerizaslab-old commented 8 years ago

You can try one of two things:

  1. Do as I instructed above and paste the code into the Python console directly.
  2. Replace the last line (fig.show()) with:

     import matplotlib.pyplot as plt
     plt.show()

Explanation: fig.show() does not block the main thread progression. When run from a script, the plotting window may briefly flash and immediately be closed, because the script has reached its end. plt.show() blocks until the plot is dismissed (see here).

If that doesn't help, check out this post.

mthimma commented 8 years ago

Hi,

I did exactly that:

TADtool thimmamp$ python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 26 2016, 12:10:39)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tadtool.tad as tad
>>> import tadtool.plot as tp
>>> import matplotlib.pyplot as plt
>>> regions = tad.HicRegionFileReader().regions("chr12_20-35Mb_regions.bed")
>>> matrix = tad.HicMatrixFileReader().matrix("chr12_20-35Mb.matrix.txt")
>>> tad_plot = tp.TADtoolPlot(matrix, regions, norm='lin', max_dist=1000000, algorithm='insulation')
100% (100 of 100) |####################################################################################################################################################| Elapsed Time: 0:00:04 Time: 0:00:04
>>> fig, axes = tad_plot.plot('chr12:31000000-34000000')
>>> plt.show()

But no plot is shown!

BTW, I am using Mac OS X.

vaquerizaslab-old commented 8 years ago

Can you please try plotting to file to narrow down the issue? So instead of plt.show(), do plt.savefig('test.pdf'), and let me know if you have any output in test.pdf.
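This diagnostic step can be sketched on its own, without TADtool. The example below is an illustration under stated assumptions: it draws a stand-in line plot rather than a TADtool figure, it selects the file-only 'Agg' backend explicitly (the step above does not do that), and `save_test_figure` is a hypothetical helper name. If a file is written while no window ever appears, the drawing code works and the problem lies in the interactive backend.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: file-only backend, so no GUI toolkit is involved
import matplotlib.pyplot as plt


def save_test_figure(path):
    """Draw a small stand-in figure, write it to `path`, and return the file size."""
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    fig.savefig(path)  # output format is inferred from the file extension
    plt.close(fig)
    return os.path.getsize(path)


# Write into a temporary directory so no existing test.pdf is overwritten.
out_path = os.path.join(tempfile.mkdtemp(), "test.pdf")
size = save_test_figure(out_path)
print("wrote %d bytes to %s" % (size, out_path))
```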

mthimma commented 8 years ago

Yes! It worked!

[Screenshot: 2016-09-27, 10:51:25 am]

BTW, I have matrices generated by the HiCPro tool for our samples.

Is there a help file or thread that discusses using HiCPro output to get TADs with your tool?

vaquerizaslab-old commented 8 years ago

The fact that you are seeing PDF output probably means that your Matplotlib backend is not configured properly for interactive plotting.

In a python console, what is the output of

import matplotlib
matplotlib.get_backend()

?

To solve the interactive plotting issue, try this at the very top of your code:

import matplotlib
matplotlib.use('TkAgg')

followed by the code you had above.
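The position at the top matters because older Matplotlib releases expected the backend to be chosen before pyplot was first imported; newer releases are more forgiving. A minimal sketch of that ordering follows. `load_pyplot` is a hypothetical helper, not part of TADtool or Matplotlib.

```python
import matplotlib


def load_pyplot(backend="TkAgg"):
    """Select a Matplotlib backend first, then import pyplot and return it."""
    matplotlib.use(backend)          # choose the backend before pyplot is loaded
    import matplotlib.pyplot as plt  # pyplot now initialises with that backend
    return plt


# Intended use in a script (left as comments because it opens a GUI window):
#   plt = load_pyplot("TkAgg")
#   ... build the TADtool plot as in the code above ...
#   plt.show()
```

Calling `matplotlib.get_backend()` after `load_pyplot(...)` shows which backend is actually active.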

Regarding HiCPro: I am not a regular user of HiCPro and don't have any pointers on how to convert their output to a txt file or numpy matrix. You should probably ask the authors of that package directly.

mthimma commented 8 years ago

>>> import matplotlib
>>> matplotlib.get_backend()
u'MacOSX'

After including matplotlib.use('TkAgg'), I could see the plot on the screen!

Thanks very much!

I will check with the HiCPro team about converting the output into a TADtool-compatible format.
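As a possible starting point for that conversion, here is a minimal sketch. It assumes the matrix is stored as sparse triplets (row bin, column bin, count) with 1-based bin indices and only one triangle of the symmetric matrix written out. That layout is an assumption to verify against the HiCPro documentation, and `sparse_to_dense` is a hypothetical helper, not part of TADtool or HiCPro.

```python
def sparse_to_dense(triplet_lines, n_bins):
    """Expand 'i j value' triplets (1-based bins) into a symmetric dense matrix.

    Returns a list of n_bins rows, each a list of n_bins floats.
    Bin pairs that never appear in the input stay at 0.0.
    """
    dense = [[0.0] * n_bins for _ in range(n_bins)]
    for line in triplet_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        i = int(fields[0]) - 1  # convert to 0-based indices
        j = int(fields[1]) - 1
        value = float(fields[2])
        dense[i][j] = value
        dense[j][i] = value  # mirror the value into the other triangle
    return dense


# Tiny made-up example with three bins.
example = ["1 1 5", "1 2 3", "2 3 1.5"]
for row in sparse_to_dense(example, 3):
    print("\t".join(str(v) for v in row))
```

The printed rows form a tab-delimited square matrix; whether TADtool reads that layout directly should be checked against its documentation.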