kundajelab / 3DChromatin_ReplicateQC

Software to compute reproducibility and quality scores for Hi-C data
MIT License

ValueError when running 3DChromatin_ReplicateQC concordance #14

Open Spiridempt opened 5 years ago

Spiridempt commented 5 years ago

When running 3DChromatin_ReplicateQC step by step, I came across the following problem.

Both 3DChromatin_ReplicateQC preprocess and 3DChromatin_ReplicateQC qc generate reasonable results, but 3DChromatin_ReplicateQC concordance fails with a ValueError (and, in one case, an IndexError):

Step: concordance | Fri Nov  1 15:24:48 2019 | computing concordance between Control_Rep1_1000_10kb and Control_Rep2_1000_10kb
GenomeDISCO | Fri Nov  1 15:25:00 2019 | :::::::::: Starting reproducibility analysis
GenomeDISCO | Fri Nov  1 15:25:00 2019 | processing: Loading genomic regions from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/nodes/nodes.chr1.gz
GenomeDISCO | Fri Nov  1 15:25:00 2019 | Loading contact maps
GenomeDISCO | Fri Nov  1 15:25:00 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep1_1000_10kb/Control_Rep1_1000_10kb.chr1.gz
GenomeDISCO | Fri Nov  1 15:25:06 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep2_1000_10kb/Control_Rep2_1000_10kb.chr1.gz
GenomeDISCO | Fri Nov  1 15:25:15 2019 | Subsampling depth = 7517561.0
GenomeDISCO | Fri Nov  1 15:25:25 2019 | Normalizing with sqrtvc
GenomeDISCO | Fri Nov  1 15:25:26 2019 | Distance dependence analysis
GenomeDISCO | Fri Nov  1 15:25:26 2019 | Computing reproducibility score
GenomeDISCO | Fri Nov  1 15:25:27 2019 | done t=1 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:25:54 2019 | done t=2 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:27:03 2019 | done t=3 | score=0.938
GenomeDISCO | Fri Nov  1 15:27:09 2019 | :::::::::: Starting reproducibility analysis
GenomeDISCO | Fri Nov  1 15:27:09 2019 | processing: Loading genomic regions from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/nodes/nodes.chr2.gz
GenomeDISCO | Fri Nov  1 15:27:09 2019 | Loading contact maps
GenomeDISCO | Fri Nov  1 15:27:09 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep1_1000_10kb/Control_Rep1_1000_10kb.chr2.gz
GenomeDISCO | Fri Nov  1 15:27:12 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep2_1000_10kb/Control_Rep2_1000_10kb.chr2.gz
GenomeDISCO | Fri Nov  1 15:27:16 2019 | Subsampling depth = 6256585.0
GenomeDISCO | Fri Nov  1 15:27:21 2019 | Normalizing with sqrtvc
GenomeDISCO | Fri Nov  1 15:27:22 2019 | Distance dependence analysis
GenomeDISCO | Fri Nov  1 15:27:22 2019 | Computing reproducibility score
GenomeDISCO | Fri Nov  1 15:27:22 2019 | done t=1 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:27:33 2019 | done t=2 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:27:52 2019 | done t=3 | score=0.947
GenomeDISCO | Fri Nov  1 15:27:55 2019 | :::::::::: Starting reproducibility analysis
GenomeDISCO | Fri Nov  1 15:27:55 2019 | processing: Loading genomic regions from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/nodes/nodes.chr3.gz
GenomeDISCO | Fri Nov  1 15:27:55 2019 | Loading contact maps
GenomeDISCO | Fri Nov  1 15:27:55 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep1_1000_10kb/Control_Rep1_1000_10kb.chr3.gz
GenomeDISCO | Fri Nov  1 15:27:59 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep2_1000_10kb/Control_Rep2_1000_10kb.chr3.gz
GenomeDISCO | Fri Nov  1 15:28:05 2019 | Subsampling depth = 5682122.0
GenomeDISCO | Fri Nov  1 15:28:11 2019 | Normalizing with sqrtvc
GenomeDISCO | Fri Nov  1 15:28:12 2019 | Distance dependence analysis
GenomeDISCO | Fri Nov  1 15:28:12 2019 | Computing reproducibility score
GenomeDISCO | Fri Nov  1 15:28:12 2019 | done t=1 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:28:26 2019 | done t=2 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:28:57 2019 | done t=3 | score=0.946
GenomeDISCO | Fri Nov  1 15:29:01 2019 | :::::::::: Starting reproducibility analysis
GenomeDISCO | Fri NTraceback (most recent call last):
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 209, in <module>
    main()
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 202, in main
    get_reproducibility(M1,M2,int(sys.argv[6]))
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 157, in get_reproducibility
    a1, b1=eigsh(M1b_L,k=num_evec,which="SM")
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1585, in eigsh
    return eigh(A, b=M, eigvals_only=not return_eigenvectors)
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/linalg/decomp.py", line 432, in eigh
    iu=a1.shape[0], overwrite_a=overwrite_a)
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)
Traceback (most recent call last):
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 209, in <module>
    main()
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 202, in main
    get_reproducibility(M1,M2,int(sys.argv[6]))
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 157, in get_reproducibility
    a1, b1=eigsh(M1b_L,k=num_evec,which="SM")
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1585, in eigsh
    return eigh(A, b=M, eigvals_only=not return_eigenvectors)
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/linalg/decomp.py", line 432, in eigh
    iu=a1.shape[0], overwrite_a=overwrite_a)
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)
Traceback (most recent call last):
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 209, in <module>
    main()
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 202, in main
    get_reproducibility(M1,M2,int(sys.argv[6]))
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 157, in get_reproducibility
    a1, b1=eigsh(M1b_L,k=num_evec,which="SM")
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1585, in eigsh
    return eigh(A, b=M, eigvals_only=not return_eigenvectors)
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/linalg/decomp.py", line 432, in eigh
    iu=a1.shape[0], overwrite_a=overwrite_a)
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)
Traceback (most recent call last):
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 209, in <module>
    main()
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 202, in main
    get_reproducibility(M1,M2,int(sys.argv[6]))
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 157, in get_reproducibility
    a1, b1=eigsh(M1b_L,k=num_evec,which="SM")
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1585, in eigsh
    return eigh(A, b=M, eigvals_only=not return_eigenvectors)
  File "/home/liuyi/.python2.7.15/lib/python2.7/site-packages/scipy/linalg/decomp.py", line 432, in eigh
    iu=a1.shape[0], overwrite_a=overwrite_a)
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)
Traceback (most recent call last):
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 209, in <module>
    main()
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 202, in main
    get_reproducibility(M1,M2,int(sys.argv[6]))
  File "/data2/usr/liuy/software/3DChromatin_ReplicateQC/wrappers/HiC-Spector/run_reproducibility_v1.py", line 163, in get_reproducibility
    b1_extend[i_nz1,i]=b1[:,i]
IndexError: index 2 is out of bounds for axis 1 with size 2
Loading required package: hicrep
Loading required package: reshape2
Loading required package: hicrep
Loading required package: reshape2
Loading required package: hicrep
Loading required package: reshape2
Loading required package: hicrep
Loading required package: reshape2
Loading required package: hicrep
Loading required package: reshape2
ov  1 15:29:01 2019 | processing: Loading genomic regions from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/nodes/nodes.chr4.gz
GenomeDISCO | Fri Nov  1 15:29:01 2019 | Loading contact maps
GenomeDISCO | Fri Nov  1 15:29:01 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep1_1000_10kb/Control_Rep1_1000_10kb.chr4.gz
GenomeDISCO | Fri Nov  1 15:29:04 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep2_1000_10kb/Control_Rep2_1000_10kb.chr4.gz
GenomeDISCO | Fri Nov  1 15:29:07 2019 | Subsampling depth = 4125211.0
GenomeDISCO | Fri Nov  1 15:29:12 2019 | Normalizing with sqrtvc
GenomeDISCO | Fri Nov  1 15:29:13 2019 | Distance dependence analysis
GenomeDISCO | Fri Nov  1 15:29:13 2019 | Computing reproducibility score
GenomeDISCO | Fri Nov  1 15:29:13 2019 | done t=1 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:29:22 2019 | done t=2 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:29:37 2019 | done t=3 | score=0.945
GenomeDISCO | Fri Nov  1 15:29:40 2019 | :::::::::: Starting reproducibility analysis
GenomeDISCO | Fri Nov  1 15:29:40 2019 | processing: Loading genomic regions from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/nodes/nodes.chr5.gz
GenomeDISCO | Fri Nov  1 15:29:41 2019 | Loading contact maps
GenomeDISCO | Fri Nov  1 15:29:41 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep1_1000_10kb/Control_Rep1_1000_10kb.chr5.gz
GenomeDISCO | Fri Nov  1 15:29:45 2019 | processing: Loading interaction data from /data2/usr/liuy/Project/hicQC/hicQCOUTPUT/Control_Rep1VSControl_Rep2.10kb/data/edges/Control_Rep2_1000_10kb/Control_Rep2_1000_10kb.chr5.gz
GenomeDISCO | Fri Nov  1 15:29:53 2019 | Subsampling depth = 6519023.0
GenomeDISCO | Fri Nov  1 15:30:01 2019 | Normalizing with sqrtvc
GenomeDISCO | Fri Nov  1 15:30:01 2019 | Distance dependence analysis
GenomeDISCO | Fri Nov  1 15:30:01 2019 | Computing reproducibility score
GenomeDISCO | Fri Nov  1 15:30:02 2019 | done t=1 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:30:21 2019 | done t=2 (not included in score calculation)
GenomeDISCO | Fri Nov  1 15:31:06 2019 | done t=3 | score=0.942

I only specified the --metadata_pairs and --outdir options in this command. It seems that something about my data is unusual and triggers errors in scipy.

Do you know how to deal with this problem? Thanks!
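
For anyone reading the traceback: the failing call is in the HiC-Spector wrapper, not in GenomeDISCO. The frame in arpack.py that returns eigh(...) is the dense fallback that eigsh takes when k is not smaller than the matrix size. The "(0,)" dimension and the "axis 1 with size 2" message both suggest that the matrix passed to eigsh had fewer rows than num_evec, possibly because a chromosome had almost no non-zero bins. This is an inference from the log, not a confirmed diagnosis. Below is a minimal sketch of a guard, assuming that skipping such chromosomes is acceptable. The function name smallest_eigenpairs and the test matrices are hypothetical and are not part of the wrapper.

```python
import numpy as np
from scipy.sparse.linalg import eigsh


def smallest_eigenpairs(laplacian, num_evec):
    """Return (eigenvalues, eigenvectors), or None if the matrix is too small.

    Hypothetical helper, not part of 3DChromatin_ReplicateQC or HiC-Spector.
    """
    n = laplacian.shape[0]
    # ARPACK needs k < n. With k >= n, eigsh falls back to a dense solver
    # (the eigh frame in the traceback), which yields at most n eigenvectors
    # and cannot handle an empty matrix at all.
    if n <= num_evec:
        return None
    return eigsh(laplacian, k=num_evec, which="SM")


# Made-up inputs that mimic the two failure modes in the log.
empty = np.zeros((0, 0))            # no non-zero bins at all
tiny = np.diag([1.0, 2.0])          # only two usable bins
ok = np.diag(np.arange(1.0, 11.0))  # ten usable bins

for name, mat in [("empty", empty), ("tiny", tiny), ("ok", ok)]:
    result = smallest_eigenpairs(mat, 3)
    if result is None:
        print(name, "skipped: fewer usable bins than requested eigenvectors")
    else:
        values, vectors = result
        print(name, "eigenvector matrix shape:", vectors.shape)
```

A caller would then have to decide how to report a skipped chromosome, for example by leaving it out of the score. It is also worth checking whether the edge files for the affected chromosomes are nearly empty for one of the samples.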

chesi commented 4 years ago

Hi, I have the same issue and would appreciate any insight. Thanks

nponts commented 1 year ago

Hi, did you ever find a solution to this problem? I got it too... Thank you!

N.