kundajelab / 3DChromatin_ReplicateQC

Software to compute reproducibility and quality scores for Hi-C data
MIT License

The format of my input files is the same as your example files, but it can't run. #16

Open HappyLife-together opened 3 years ago

HappyLife-together commented 3 years ago

0 validly-mapped read pairs loaded.
No valid data was loaded.
Traceback (most recent call last):
  File "anaconda3/envs/3DC_env/bin/hifive", line 849, in main()
  File "anaconda3/envs/3DC_env/bin/hifive", line 93, in main
    run(args)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/quasar.py", line 190, in find_transformation
    elif hic.data['cis_indices'][chr_indices[i + 1]] - hic.data['cis_indices'][chr_indices[i]] == 0:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"
0 validly-mapped read pairs loaded.
No valid data was loaded.
Traceback (most recent call last):
  File "anaconda3/envs/3DC_env/bin/hifive", line 849, in main()
  File "anaconda3/envs/3DC_env/bin/hifive", line 93, in main
    run(args)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/quasar.py", line 190, in find_transformation
    elif hic.data['cis_indices'][chr_indices[i + 1]] - hic.data['cis_indices'][chr_indices[i]] == 0:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"
Step: preprocess | Tue Dec 15 20:57:10 2020 | Splitting nodes chr18
Step: preprocess | Tue Dec 15 20:57:10 2020 | Splitting hr chr18
Step: preprocess | Tue Dec 15 20:58:22 2020 | Splitting lr chr18
Step: qc | Tue Dec 15 20:59:19 2020 | running QuASAR-QC | computing QC for hr
Quasar file appears incomplete. Rerun with HiC project argument.
Traceback (most recent call last):
  File "3DChromatin_ReplicateQC/wrappers/QuASAR/quasar_split_by_chromosomes_qc.py", line 29, in main()
  File "3DChromatin_ReplicateQC/wrappers/QuASAR/quasar_split_by_chromosomes_qc.py", line 10, in main
    scorefile = open( sys.argv[1], 'r')
IOError: [Errno 2] No such file or directory: 'data/output/results/qc/hr/QuASAR-QC/hr.QuASAR-QC.scores.txt'
Traceback (most recent call last):
  File "envs/3DC_env/bin/3DChromatin_ReplicateQC", line 11, in load_entry_point('3DChromatin-ReplicateQC', 'console_scripts', '3DChromatin_ReplicateQC')()
  File "3DChromatin_ReplicateQC/3DChromatin_ReplicateQC/main.py", line 17, in main
    command_methodscommand
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 848, in run_all
    get_qc(metadata_samples,methods,outdir,running_mode,concise_analysis,subset_chromosomes,timing)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 548, in get_qc
    quasar_qc_wrapper(outdir,parameters,samplename,running_mode,timing)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 335, in quasar_qc_wrapper
    run_script(script_comparison_file,running_mode,parameters)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 266, in run_script
    output=subp.check_output(['bash','-c',script_name])
  File "anaconda3/envs/3DC_env/lib/python2.7/subprocess.py", line 223, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['bash', '-c', 'data/output/scripts/QuASAR-QC/hr/hr.QuASAR-QC.sh']' returned non-zero exit status 1

oursu-broad commented 3 years ago

Hi @DerekBin, thanks for posting the issue. This appears to be a problem in QuASAR, specifically in the hifive package it uses (https://github.com/bxlab/hifive), so you may also want to post the issue there, since its authors are best placed to help. The 3DChromatin_ReplicateQC package is just a wrapper that runs these underlying reproducibility metrics. If you write to the QuASAR authors, you can find the exact command used for this method at data/output/scripts/QuASAR-QC/hr/hr.QuASAR-QC.sh (taken from the error trace above).

In the meantime, if you share some example files that recapitulate the problem, I can try to help troubleshoot. Thanks.

HappyLife-together commented 3 years ago

Thank you very much for your answer! The same problem occurs when I run GenomeDISCO alone. I tried to email you my input files, but the message failed to send. Below is an excerpt of each input file:

hr.res10000.gz:
18 39400000 18 32180000 0.0
18 39400000 18 32190000 0.0
18 39400000 18 32200000 3.094195
18 39400000 18 32210000 2.8515623
18 39400000 18 32220000 0.0

lr.res10000.gz:
18 39400000 18 32180000 0.0
18 39400000 18 32190000 0.0
18 39400000 18 32200000 1.0379004
18 39400000 18 32210000 0.0
18 39400000 18 32220000 0.0

Bins_new.w10000.bed.gz:
chr18 39390000 39400000 39390000
chr18 39400000 39410000 39400000
chr18 39410000 39420000 39410000
chr18 39420000 39430000 39420000
chr18 39430000 39440000 39430000
chr18 39440000 39450000 39440000
chr18 39450000 39460000 39450000

I hope you can help me solve this problem. Alternatively, could you provide a valid email address? Thank you!

oursu-broad commented 3 years ago

Can you try to have the same chromosome names in all the files? Either change all "18" to "chr18" or the other way around. If that doesn't fix things, let me know.
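A quick way to check the naming before rerunning is sketched below. This is only an illustration in Python 3 (the environment in the trace is Python 2.7), and the helper names `chromosomes_in_contacts`, `chromosomes_in_bins`, and `read_lines` are made up for this example; they are not part of 3DChromatin_ReplicateQC. It assumes the contact files carry chromosome names in columns 1 and 3 and the bins file in column 1, as in the excerpts posted above.

```python
import gzip


def chromosomes_in_contacts(lines):
    """Collect chromosome names from columns 1 and 3 of contact rows."""
    names = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:
            names.update((fields[0], fields[2]))
    return names


def chromosomes_in_bins(lines):
    """Collect chromosome names from column 1 of the bins (BED-like) rows."""
    names = set()
    for line in lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def read_lines(path):
    """Yield text lines from a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            yield line


# Rows copied from the excerpts posted earlier in this thread.
contacts = [
    "18 39400000 18 32180000 0.0",
    "18 39400000 18 32200000 3.094195",
]
bins = [
    "chr18 39390000 39400000 39390000",
    "chr18 39400000 39410000 39400000",
]

# Names used in the contact file that the bins file does not define.
missing = chromosomes_in_contacts(contacts) - chromosomes_in_bins(bins)
print(sorted(missing))
```

For real files, pass `read_lines("hr.res10000.gz")` and `read_lines("Bins_new.w10000.bed.gz")` instead of the inline lists. An empty result means every chromosome name in the contact file also appears in the bins file.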

HappyLife-together commented 3 years ago

I followed your suggestion, but it still fails with the same error as before. Below is an excerpt of each input file:

hr.res10000.gz:
chr18 39400000 chr18 32180000 0.0
chr18 39400000 chr18 32190000 0.0
chr18 39400000 chr18 32200000 3.094195
chr18 39400000 chr18 32210000 2.8515623
chr18 39400000 chr18 32220000 0.0

lr.res10000.gz:
chr18 39400000 chr18 32180000 0.0
chr18 39400000 chr18 32190000 0.0
chr18 39400000 chr18 32200000 1.0379004
chr18 39400000 chr18 32210000 0.0
chr18 39400000 chr18 32220000 0.0

Bins_new.w10000.bed.gz:
chr18 39390000 39400000 39390000
chr18 39400000 39410000 39400000
chr18 39410000 39420000 39410000
chr18 39420000 39430000 39420000
chr18 39430000 39440000 39430000
chr18 39440000 39450000 39440000
chr18 39450000 39460000 39450000

0 validly-mapped read pairs loaded.
No valid data was loaded.
Traceback (most recent call last):
  File "/anaconda3/envs/3DC_env/bin/hifive", line 849, in main()
  File "anaconda3/envs/3DC_env/bin/hifive", line 93, in main
    run(args)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/quasar.py", line 190, in find_transformation
    elif hic.data['cis_indices'][chr_indices[i + 1]] - hic.data['cis_indices'][chr_indices[i]] == 0:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"
0 validly-mapped read pairs loaded.
No valid data was loaded.
Traceback (most recent call last):
  File "anaconda3/envs/3DC_env/bin/hifive", line 849, in main()
  File "anaconda3/envs/3DC_env/bin/hifive", line 93, in main
    run(args)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/hifive/quasar.py", line 190, in find_transformation
    elif hic.data['cis_indices'][chr_indices[i + 1]] - hic.data['cis_indices'][chr_indices[i]] == 0:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "anaconda3/envs/3DC_env/lib/python2.7/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"
Step: preprocess | Thu Dec 17 14:00:35 2020 | Splitting nodes chr18
Step: preprocess | Thu Dec 17 14:00:35 2020 | Splitting hr chr18
Step: preprocess | Thu Dec 17 14:01:36 2020 | Splitting lr chr18
Step: qc | Thu Dec 17 14:02:42 2020 | running QuASAR-QC | computing QC for hr
Quasar file appears incomplete. Rerun with HiC project argument.
Traceback (most recent call last):
  File "3DChromatin_ReplicateQC/wrappers/QuASAR/quasar_split_by_chromosomes_qc.py", line 29, in main()
  File "3DChromatin_ReplicateQC/wrappers/QuASAR/quasar_split_by_chromosomes_qc.py", line 10, in main
    scorefile = open( sys.argv[1], 'r')
IOError: [Errno 2] No such file or directory: 'data/output/results/qc/hr/QuASAR-QC/hr.QuASAR-QC.scores.txt'
Traceback (most recent call last):
  File "anaconda3/envs/3DC_env/bin/3DChromatin_ReplicateQC", line 11, in load_entry_point('3DChromatin-ReplicateQC', 'console_scripts', '3DChromatin_ReplicateQC')()
  File "3DChromatin_ReplicateQC/3DChromatin_ReplicateQC/main.py", line 17, in main
    command_methodscommand
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 848, in run_all
    get_qc(metadata_samples,methods,outdir,running_mode,concise_analysis,subset_chromosomes,timing)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 548, in get_qc
    quasar_qc_wrapper(outdir,parameters,samplename,running_mode,timing)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 335, in quasar_qc_wrapper
    run_script(script_comparison_file,running_mode,parameters)
  File "3DChromatin_ReplicateQC/software/genomedisco/genomedisco/concordance_utils.py", line 266, in run_script
    output=subp.check_output(['bash','-c',script_name])
  File "anaconda3/envs/3DC_env/lib/python2.7/subprocess.py", line 223, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['bash', '-c', 'data/output/scripts/QuASAR-QC/hr/hr.QuASAR-QC.sh']' returned non-zero exit status 1

Devalock commented 1 year ago

This problem is caused by floating-point values in the Hi-C contact map. The solution can be found in https://github.com/kundajelab/3DChromatin_ReplicateQC/issues/13.
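For anyone who wants to test this diagnosis on their own data, here is a rough Python 3 sketch of one possible workaround: rounding the score column to whole numbers before rerunning. This is an assumption on my part and not necessarily what the linked issue recommends, so check that issue first. The function name `round_contact_counts` is hypothetical, and rounding normalized values changes the data, so whether it is acceptable depends on your analysis.

```python
def round_contact_counts(lines):
    """Yield 5-column contact rows with the last column rounded to an integer.

    Assumes rows look like: chr1 bin1 chr2 bin2 value (whitespace-separated).
    Rows with a different number of columns are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue
        count = int(round(float(fields[4])))
        yield "\t".join(fields[:4] + [str(count)])


# Rows copied from the hr.res10000.gz excerpt earlier in this thread.
rows = [
    "chr18 39400000 chr18 32180000 0.0",
    "chr18 39400000 chr18 32200000 3.094195",
    "chr18 39400000 chr18 32210000 2.8515623",
]

converted = list(round_contact_counts(rows))
for row in converted:
    print(row)
```

The converted rows can then be written to a new gzip file and listed in the samples metadata in place of the original.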