kundajelab / 3DChromatin_ReplicateQC

Software to compute reproducibility and quality scores for Hi-C data
MIT License

KeyError: "Unable to open object (Object 'dist.1.1000000' doesn't exist)" #3

Closed jstansfield0 closed 6 years ago

jstansfield0 commented 7 years ago

I was able to install the program and ran it on the example data, and it worked fine. Now I am trying to run it on my own Hi-C data. I get the following output and errors when I run the script:

john@john-VirtualBox:~/3DChromatin_ReplicateQC$ python 3DChromatin_ReplicateQC.py run_all --metadata_samples brain/metadata.samples --metadata_pairs brain/metadata.pairs --bins brain/dplfc_1000000_abs.bed.gz --outdir output
4161718 validly-mapped reads pairs loaded.        
4161718 total validly-mapped read pairs loaded. 325 valid fend pairs
Parsing fend pairs... Done  221904 cis reads, 3939814 trans reads
Filtering fends... Removed 3078 of 3103 bins
Traceback (most recent call last):                                                                                      
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 68, in <module>
    main()
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 47, in main
    data1 = load_data(infile1, chroms, resolutions)
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 24, in load_data
    dist = infile['dist.%s.%i' % (chrom, res)][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/john/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 169, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (Object 'dist.1.1000000' doesn't exist)"
4235673 validly-mapped reads pairs loaded.        
4235673 total validly-mapped read pairs loaded. 325 valid fend pairs
Parsing fend pairs... Done  222852 cis reads, 4012821 trans reads
Filtering fends... Removed 3078 of 3103 bins
Traceback (most recent call last):                                                                                      
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 68, in <module>
    main()
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 47, in main
    data1 = load_data(infile1, chroms, resolutions)
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/plot_quasar_transform.py", line 24, in load_data
    dist = infile['dist.%s.%i' % (chrom, res)][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/john/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 169, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (Object 'dist.1.1000000' doesn't exist)"
3DChromatin_ReplicateQC | Tue Oct 10 13:15:56 2017 | Splitting nodes chr1
3DChromatin_ReplicateQC | Tue Oct 10 13:15:56 2017 | Splitting dplfc_1_1mb.txt chr1
3DChromatin_ReplicateQC | Tue Oct 10 13:16:01 2017 | Splitting dplfc_2_1mb.txt chr1
3DChromatin_ReplicateQC | Tue Oct 10 13:16:07 2017 | Splitting nodes chr10
3DChromatin_ReplicateQC | Tue Oct 10 13:16:07 2017 | Splitting dplfc_1_1mb.txt chr10
3DChromatin_ReplicateQC | Tue Oct 10 13:16:12 2017 | Splitting dplfc_2_1mb.txt chr10
3DChromatin_ReplicateQC | Tue Oct 10 13:16:17 2017 | Splitting nodes chr11
3DChromatin_ReplicateQC | Tue Oct 10 13:16:17 2017 | Splitting dplfc_1_1mb.txt chr11
3DChromatin_ReplicateQC | Tue Oct 10 13:16:22 2017 | Splitting dplfc_2_1mb.txt chr11
3DChromatin_ReplicateQC | Tue Oct 10 13:16:27 2017 | Splitting nodes chr12
3DChromatin_ReplicateQC | Tue Oct 10 13:16:27 2017 | Splitting dplfc_1_1mb.txt chr12
3DChromatin_ReplicateQC | Tue Oct 10 13:16:32 2017 | Splitting dplfc_2_1mb.txt chr12
3DChromatin_ReplicateQC | Tue Oct 10 13:16:37 2017 | Splitting nodes chr13
3DChromatin_ReplicateQC | Tue Oct 10 13:16:37 2017 | Splitting dplfc_1_1mb.txt chr13
3DChromatin_ReplicateQC | Tue Oct 10 13:16:43 2017 | Splitting dplfc_2_1mb.txt chr13
3DChromatin_ReplicateQC | Tue Oct 10 13:16:48 2017 | Splitting nodes chr14
3DChromatin_ReplicateQC | Tue Oct 10 13:16:48 2017 | Splitting dplfc_1_1mb.txt chr14
3DChromatin_ReplicateQC | Tue Oct 10 13:16:53 2017 | Splitting dplfc_2_1mb.txt chr14
3DChromatin_ReplicateQC | Tue Oct 10 13:16:58 2017 | Splitting nodes chr15
3DChromatin_ReplicateQC | Tue Oct 10 13:16:58 2017 | Splitting dplfc_1_1mb.txt chr15
3DChromatin_ReplicateQC | Tue Oct 10 13:17:03 2017 | Splitting dplfc_2_1mb.txt chr15
3DChromatin_ReplicateQC | Tue Oct 10 13:17:08 2017 | Splitting nodes chr16
3DChromatin_ReplicateQC | Tue Oct 10 13:17:08 2017 | Splitting dplfc_1_1mb.txt chr16
3DChromatin_ReplicateQC | Tue Oct 10 13:17:14 2017 | Splitting dplfc_2_1mb.txt chr16
3DChromatin_ReplicateQC | Tue Oct 10 13:17:19 2017 | Splitting nodes chr17
3DChromatin_ReplicateQC | Tue Oct 10 13:17:19 2017 | Splitting dplfc_1_1mb.txt chr17
3DChromatin_ReplicateQC | Tue Oct 10 13:17:24 2017 | Splitting dplfc_2_1mb.txt chr17
3DChromatin_ReplicateQC | Tue Oct 10 13:17:30 2017 | Splitting nodes chr18
3DChromatin_ReplicateQC | Tue Oct 10 13:17:30 2017 | Splitting dplfc_1_1mb.txt chr18
3DChromatin_ReplicateQC | Tue Oct 10 13:17:34 2017 | Splitting dplfc_2_1mb.txt chr18
3DChromatin_ReplicateQC | Tue Oct 10 13:17:40 2017 | Splitting nodes chr19
3DChromatin_ReplicateQC | Tue Oct 10 13:17:40 2017 | Splitting dplfc_1_1mb.txt chr19
3DChromatin_ReplicateQC | Tue Oct 10 13:17:45 2017 | Splitting dplfc_2_1mb.txt chr19
3DChromatin_ReplicateQC | Tue Oct 10 13:17:50 2017 | Splitting nodes chr2
3DChromatin_ReplicateQC | Tue Oct 10 13:17:50 2017 | Splitting dplfc_1_1mb.txt chr2
3DChromatin_ReplicateQC | Tue Oct 10 13:17:55 2017 | Splitting dplfc_2_1mb.txt chr2
3DChromatin_ReplicateQC | Tue Oct 10 13:18:01 2017 | Splitting nodes chr20
3DChromatin_ReplicateQC | Tue Oct 10 13:18:01 2017 | Splitting dplfc_1_1mb.txt chr20
3DChromatin_ReplicateQC | Tue Oct 10 13:18:05 2017 | Splitting dplfc_2_1mb.txt chr20
3DChromatin_ReplicateQC | Tue Oct 10 13:18:11 2017 | Splitting nodes chr21
3DChromatin_ReplicateQC | Tue Oct 10 13:18:11 2017 | Splitting dplfc_1_1mb.txt chr21
3DChromatin_ReplicateQC | Tue Oct 10 13:18:16 2017 | Splitting dplfc_2_1mb.txt chr21
3DChromatin_ReplicateQC | Tue Oct 10 13:18:21 2017 | Splitting nodes chr22
3DChromatin_ReplicateQC | Tue Oct 10 13:18:21 2017 | Splitting dplfc_1_1mb.txt chr22
3DChromatin_ReplicateQC | Tue Oct 10 13:18:26 2017 | Splitting dplfc_2_1mb.txt chr22
3DChromatin_ReplicateQC | Tue Oct 10 13:18:31 2017 | Splitting nodes chr3
3DChromatin_ReplicateQC | Tue Oct 10 13:18:31 2017 | Splitting dplfc_1_1mb.txt chr3
3DChromatin_ReplicateQC | Tue Oct 10 13:18:36 2017 | Splitting dplfc_2_1mb.txt chr3
3DChromatin_ReplicateQC | Tue Oct 10 13:18:42 2017 | Splitting nodes chr4
3DChromatin_ReplicateQC | Tue Oct 10 13:18:42 2017 | Splitting dplfc_1_1mb.txt chr4
3DChromatin_ReplicateQC | Tue Oct 10 13:18:47 2017 | Splitting dplfc_2_1mb.txt chr4
3DChromatin_ReplicateQC | Tue Oct 10 13:18:52 2017 | Splitting nodes chr5
3DChromatin_ReplicateQC | Tue Oct 10 13:18:52 2017 | Splitting dplfc_1_1mb.txt chr5
3DChromatin_ReplicateQC | Tue Oct 10 13:18:57 2017 | Splitting dplfc_2_1mb.txt chr5
3DChromatin_ReplicateQC | Tue Oct 10 13:19:03 2017 | Splitting nodes chr6
3DChromatin_ReplicateQC | Tue Oct 10 13:19:03 2017 | Splitting dplfc_1_1mb.txt chr6
3DChromatin_ReplicateQC | Tue Oct 10 13:19:08 2017 | Splitting dplfc_2_1mb.txt chr6
3DChromatin_ReplicateQC | Tue Oct 10 13:19:14 2017 | Splitting nodes chr7
3DChromatin_ReplicateQC | Tue Oct 10 13:19:14 2017 | Splitting dplfc_1_1mb.txt chr7
3DChromatin_ReplicateQC | Tue Oct 10 13:19:19 2017 | Splitting dplfc_2_1mb.txt chr7
3DChromatin_ReplicateQC | Tue Oct 10 13:19:25 2017 | Splitting nodes chr8
3DChromatin_ReplicateQC | Tue Oct 10 13:19:25 2017 | Splitting dplfc_1_1mb.txt chr8
3DChromatin_ReplicateQC | Tue Oct 10 13:19:29 2017 | Splitting dplfc_2_1mb.txt chr8
3DChromatin_ReplicateQC | Tue Oct 10 13:19:37 2017 | Splitting nodes chr9
3DChromatin_ReplicateQC | Tue Oct 10 13:19:37 2017 | Splitting dplfc_1_1mb.txt chr9
3DChromatin_ReplicateQC | Tue Oct 10 13:19:42 2017 | Splitting dplfc_2_1mb.txt chr9
3DChromatin_ReplicateQC | Tue Oct 10 13:19:48 2017 | Splitting nodes chrM
3DChromatin_ReplicateQC | Tue Oct 10 13:19:48 2017 | Splitting dplfc_1_1mb.txt chrM
3DChromatin_ReplicateQC | Tue Oct 10 13:19:53 2017 | Splitting dplfc_2_1mb.txt chrM
3DChromatin_ReplicateQC | Tue Oct 10 13:19:58 2017 | Splitting nodes chrX
3DChromatin_ReplicateQC | Tue Oct 10 13:19:58 2017 | Splitting dplfc_1_1mb.txt chrX
3DChromatin_ReplicateQC | Tue Oct 10 13:20:03 2017 | Splitting dplfc_2_1mb.txt chrX
3DChromatin_ReplicateQC | Tue Oct 10 13:20:08 2017 | Splitting nodes chrY
3DChromatin_ReplicateQC | Tue Oct 10 13:20:08 2017 | Splitting dplfc_1_1mb.txt chrY
3DChromatin_ReplicateQC | Tue Oct 10 13:20:14 2017 | Splitting dplfc_2_1mb.txt chrY
/home/john/3DChromatin_ReplicateQC/software/hifive/bin/find_quasar_quality_score:68: RuntimeWarning: invalid value encountered in double_scalars
  results[i, -1] = temp[0] / temp[1] - temp[2] / temp[3]
Traceback (most recent call last):
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/quasar_split_by_chromosomes_qc.py", line 27, in <module>
    main()
  File "/home/john/3DChromatin_ReplicateQC/software/genomedisco/reproducibility_analysis/quasar_split_by_chromosomes_qc.py", line 23, in main
    outfile.write(samplename+'\t'+scorelist[d[chromo]]+'\n')
IndexError: list index out of range
Traceback (most recent call last):
  File "3DChromatin_ReplicateQC.py", line 712, in <module>
    main()
  File "3DChromatin_ReplicateQC.py", line 708, in main
    command_methods[command](**args)
  File "3DChromatin_ReplicateQC.py", line 687, in run_all
    get_qc(metadata_samples,methods,parameters_file,outdir,running_mode,concise_analysis,subset_chromosomes)
  File "3DChromatin_ReplicateQC.py", line 390, in get_qc
    quasar_qc_wrapper(outdir,None,samplename,running_mode)
  File "3DChromatin_ReplicateQC.py", line 279, in quasar_qc_wrapper
    run_script(script_comparison_file,running_mode)
  File "3DChromatin_ReplicateQC.py", line 222, in run_script
    output=subp.check_output(['bash','-c',script_name])
  File "/home/john/anaconda2/lib/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['bash', '-c', 'output/scripts/QuASAR-QC/dplfc_1_1mb.txt/dplfc_1_1mb.txt.QuASAR-QC.sh']' returned non-zero exit status 1

I think I have all the inputs in the correct format, but they are available here if you would like to compare: https://github.com/jstansfield0/brain/tree/master/brain

Do you know what is causing these errors?

msauria commented 7 years ago

These errors are occurring because there is an error in the code for loading data in HiFive through the analysis script. It should be fixed with this pull request: https://github.com/kundajelab/genomedisco/pull/2
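
For anyone hitting the same KeyError before updating, a minimal sketch of a more informative lookup may help show which datasets the file actually contains. It assumes only that the opened file behaves like a dictionary (h5py file objects support `in` and `.keys()`); the function name `load_dist` and the stand-in dictionary are hypothetical and are not part of HiFive or genomedisco.

```python
from collections.abc import Mapping


def load_dist(infile: Mapping, chrom: str, res: int):
    """Return the 'dist.<chrom>.<res>' entry, or raise a KeyError listing what exists."""
    # Same key format as the failing line in the traceback
    name = 'dist.%s.%i' % (chrom, res)
    if name not in infile:
        # Collect the 'dist.*' keys that are present so a mismatch is visible
        available = sorted(k for k in infile.keys() if k.startswith('dist.'))
        raise KeyError('%s not found; available dist datasets: %s' % (name, available))
    return infile[name]


# Stand-in for an opened HDF5 file: a plain dict with one hypothetical dataset
fake_file = {'dist.2.1000000': [0.1, 0.2]}
try:
    load_dist(fake_file, '1', 1000000)
except KeyError as err:
    message = err.args[0]
    print(message)
```

With a real h5py file the caller would still add `[...]` to read the dataset into memory, as the original line does. This only makes the failure easier to read; it does not replace the fix in the pull request.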

oursu commented 7 years ago

Just as an update, hifive has gone through some changes and re-factoring, and these have been incorporated into 3DChromatin_ReplicateQC. Please re-clone this repository (3DChromatin_ReplicateQC) and re-run the install script, to get the latest version.

oursu commented 6 years ago

Please make sure to download the latest version of the code. Re-clone this repository (3DChromatin_ReplicateQC) and re-run the install script.

For now, I will close this issue, as it seems that the errors are fixed. Please re-open if you are still having trouble running the code.