Closed aj03 closed 7 years ago
I have tested HiCExplorer with pysam 0.8.3.
I will check what the problem is with pysam 0.9.1.4.
On Thu, Nov 10, 2016 at 5:05 AM, aj03 notifications@github.com wrote:
I am facing this problem when running HiCExplorer software:
command: hicBuildMatrix -s mapping/SRR1956527_1.bam mapping/SRR1956527_2.bam -rs dpnII_positions_GRCm38.bed -seq GATC -b hiCmatrix/SRR1956527_ref.bam -o hiCmatrix/SRR1956527.matrix
reading mapping/SRR1956527_1.bam and mapping/SRR1956527_2.bam to build hic_matrix
Minimum distance considered between restriction sites is 300
Max distance: 800
Matrix size: 2666241
dangling sequences to check are {'pat_forw': 'ATC', 'pat_rev': 'GAT'}
Traceback (most recent call last):
  File "/usr/local/bin/hicBuildMatrix", line 7, in
    main()
  File "/usr/local/lib/python2.7/dist-packages/hicexplorer/hicBuildMatrix.py", line 644, in main
    mate1_supplementary_list = get_supplementary_alignment(mate1, str1)
  File "/usr/local/lib/python2.7/dist-packages/hicexplorer/hicBuildMatrix.py", line 410, in get_supplementary_alignment
    if read.has_tag('SA'):
AttributeError: 'csamtools.AlignedRead' object has no attribute 'has_tag'
pysam version:
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
import pkg_resources
pkg_resources.get_distribution("pysam").version
'0.9.1.4'
Can you please help me solve it? Thanks in advance.
Fidel Ramirez
I just checked and the has_tag attribute was added in pysam version 0.8.2.
I tested the code with pysam 0.9.1.4 and didn't have any trouble. I added commit 9ad2a3efabb6207991d283f08b2edb9f163b9336 to test for the pysam version. This will tell you which pysam version the code is using.
I would guess that in your case you have more than one pysam version installed and hicexplorer is using an older one.
Try running the code as /path/to/python hicBuildMatrix -s .....
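A quick way to see which pysam a given interpreter actually picks up is a small check like the one below. This is only a sketch, not the check added in the commit mentioned above: the helper names (parse_version, is_at_least, describe_module) are made up for illustration, and the 0.8.3 minimum is an assumption based on the version mentioned earlier in this thread.

```python
import importlib
import re


def parse_version(version):
    """Turn '0.9.1.4' into (0, 9, 1, 4) so versions compare numerically."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)  # leading digits only, e.g. '1rc2' -> 1
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_at_least(installed, required):
    """True if `installed` is the same as or newer than `required`."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.8.3' and '0.8.3.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def describe_module(name):
    """Return (version, path) of the module this interpreter imports, or None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "unknown")
    return version, path


if __name__ == "__main__":
    MIN_PYSAM = "0.8.3"  # assumption: minimum taken from this thread
    info = describe_module("pysam")
    if info is None:
        print("pysam is not importable from this interpreter")
    else:
        version, path = info
        print("pysam %s loaded from %s" % (version, path))
        if not is_at_least(version, MIN_PYSAM):
            print("older than %s; another installed copy may be shadowing it" % MIN_PYSAM)
```

Run it with the same interpreter that launches hicBuildMatrix; the printed path shows which installation wins when several are present. Comparing the versions as tuples of integers rather than as strings matters because, as strings, "0.10" would sort before "0.8".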
Thanks fidelram.. It worked :)
Actually, the problem was that I had more than one pysam version installed (0.9.1.4 and 0.6) and hicexplorer was using the older one (0.6):
hicBuildMatrix -s ../mapping/SRR1956527_1.bam ../mapping/SRR1956527_2.bam -rs ../dpnII_positions_GRCm38.bed -seq GATC -b SRR1956527_ref.bam -o SRR1956527.matrix
ERROR Version of pysam has to be higher than 0.8.3. Current installed version is 0.6
The has_tag pysam error is resolved now.
Thanks once again :)
I think that this function was added to scipy recently. I am using scipy 0.17. Can you update your scipy version?
Meanwhile I will update the versions in the setup.py and add further version tests to avoid this issue in the future.
On Thu, Nov 10, 2016 at 12:04 PM, aj03 notifications@github.com wrote:
The has_tag pysam error is resolved now, but it is showing a new error:
hicBuildMatrix -s ../mapping/SRR1956527_1.bam ../mapping/SRR1956527_2.bam -rs ../dpnII_positions_GRCm38.bed -seq GATC -b SRR1956527_ref.bam -o SRR1956527.matrix
reading ../mapping/SRR1956527_1.bam and ../mapping/SRR1956527_2.bam to build hic_matrix
Minimum distance considered between restriction sites is 300
Max distance: 800
Matrix size: 2666241
dangling sequences to check are {'pat_forw': 'ATC', 'pat_rev': 'GAT'}
processing 1000000 lines took 26.08 secs (38347.3 lines per second)
244810 (24.48%) valid pairs added to matrix
processing 2000000 lines took 52.42 secs (38154.5 lines per second)
481367 (24.07%) valid pairs added to matrix
processing 3000000 lines took 79.80 secs (37592.9 lines per second)
712799 (23.76%) valid pairs added to matrix
processing 4000000 lines took 104.52 secs (38270.7 lines per second)
942385 (23.56%) valid pairs added to matrix
processing 5000000 lines took 129.21 secs (38695.9 lines per second)
1176656 (23.53%) valid pairs added to matrix
processing 6000000 lines took 155.77 secs (38517.8 lines per second)
1410029 (23.50%) valid pairs added to matrix
processing 7000000 lines took 180.14 secs (38858.2 lines per second)
1645092 (23.50%) valid pairs added to matrix
processing 8000000 lines took 204.88 secs (39047.4 lines per second)
1889359 (23.62%) valid pairs added to matrix
processing 9000000 lines took 230.03 secs (39125.0 lines per second)
2132262 (23.69%) valid pairs added to matrix
processing 10000000 lines took 254.96 secs (39221.8 lines per second)
2373752 (23.74%) valid pairs added to matrix
Traceback (most recent call last):
  File "/usr/local/bin/hicBuildMatrix", line 7, in
    main()
  File "/usr/local/lib/python2.7/dist-packages/hicexplorer/hicBuildMatrix.py", line 881, in main
    hic_matrix += coo_matrix((data, (row, col)), shape=(matrix_size, matrix_size))
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 387, in __iadd__
    raise NotImplementedError
NotImplementedError
Fidel Ramirez
@aj03 Did it work after updating scipy?
Hi. I updated the scipy version and even setup.py, but ended up with this error:
processing 47000000 lines took 1442.25 secs (32587.9 lines per second)
10874271 (23.14%) valid pairs added to matrix
processing 48000000 lines took 1471.86 secs (32611.8 lines per second)
11105341 (23.14%) valid pairs added to matrix
Traceback (most recent call last):
File "/usr/local/bin/hicBuildMatrix", line 5, in
I tried to solve it but wasn't able to.
scipy version:
pkg_resources.get_distribution("scipy").version
'0.18.1'
I am afraid that you may be running out of memory. Processing of Hi-C data unfortunately requires quite some memory and a 64-bit system.
The particular part that is failing for you is the detection of duplicated reads.
I could think of several ways to save memory but none seems optimal:
How much memory do you have? Do you have a 64-bit system?
Yes, the OS is 64-bit and memory is 19.5 GB. The disk is 309.5 GB and my working directory has 134 GB of available space.
Seems like enough memory, although we normally work with 300 GB.
I will add a branch with the option to skip the check for duplicated reads, but you may want to run hicBuildMatrix with the option --doTestRun, which will give you a glimpse of the duplication rate. This option only considers 1 million reads and makes a matrix but, importantly, reports the QC values.
I added the branch: https://github.com/maxplanck-ie/HiCExplorer/tree/skip_duplication_check
you need to run hicBuildMatrix with the option --skipDuplicationCheck
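To illustrate the trade-off behind these two options, here is a toy sketch of duplicate detection. It is not the HiCExplorer implementation: the function count_pairs, its parameters, and the representation of a read pair as a tuple of (chromosome, position) mates are all assumptions made for the example. The idea is that remembering every pair seen so far costs memory that grows with the number of distinct pairs, that skipping the check avoids that cost, and that a capped run estimates the duplication rate from a sample.

```python
from itertools import islice


def count_pairs(pairs, skip_duplication_check=False, max_pairs=None):
    """Count kept and duplicated read pairs.

    pairs: iterable of hashable pair descriptions, for example
           (("chr1", 100), ("chr1", 5000)). Hypothetical format.
    skip_duplication_check: if True, keep every pair and remember nothing.
    max_pairs: if set, only look at the first `max_pairs` pairs (a test run).
    """
    if max_pairs is not None:
        pairs = islice(pairs, max_pairs)

    seen = set()  # grows with every distinct pair: this is the memory cost
    kept = 0
    duplicated = 0
    for pair in pairs:
        if not skip_duplication_check:
            if pair in seen:
                duplicated += 1
                continue
            seen.add(pair)
        kept += 1

    total = kept + duplicated
    rate = float(duplicated) / total if total else 0.0
    return {
        "kept": kept,
        "duplicated": duplicated,
        "duplication_rate": rate,
        "pairs_remembered": len(seen),
    }


if __name__ == "__main__":
    example = [
        (("chr1", 100), ("chr1", 5000)),
        (("chr1", 100), ("chr1", 5000)),  # exact duplicate of the first pair
        (("chr2", 300), ("chr3", 700)),
        (("chr2", 900), ("chr2", 1200)),
    ]
    print(count_pairs(example))                               # full check
    print(count_pairs(example, skip_duplication_check=True))  # no memory used for `seen`
    print(count_pairs(example, max_pairs=2))                  # sampled estimate
```

With the check skipped, duplicated pairs are counted as valid, so it is worth looking at the duplication rate from a test run first to judge how much that would inflate the matrix.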
Recently, I added snakemake rules to download and process Hi-C data. This workflow may be helpful in your case (but add the --skipDuplicationCheck option to the rules). You can find the rules in the /scripts folder (https://github.com/maxplanck-ie/HiCExplorer/tree/master/scripts).