aryeelab / hichipper

A preprocessing and QC pipeline for HiChIP data
MIT License
33 stars 12 forks source link

Something went wrong in determining peaks for anchor inference; rerun with the flag to debug. #91

Open Gemma-Zhang-326 opened 3 years ago

Gemma-Zhang-326 commented 3 years ago

Hi, there. I am trying to using hichipper to call loop. In order to call peaks from HiChIP data directly, my yaml looks like under

peaks:
  - EACH,ALL
resfrags:
  - /gpfs/home/zhangyue/biosoft/HiCPro/HiC-Pro_3.0.0/annotation/AluI_Mmul_10.bed
hicpro_output:
  - hicpro

Then i got an error

Mon Aug 09 11:39:47 CST 2021: Starting hichipper pipeline v0.7.7
Mon Aug 09 11:39:47 CST 2021: Executed from: /gpfs/home/zhangyue/V1_hic
Mon Aug 09 11:39:47 CST 2021: Output folder: /gpfs/home/zhangyue/V1_hic/V1_loop3
Mon Aug 09 11:39:47 CST 2021: Parsed manifest as follows: 
{'peaks': ['EACH,ALL'], 'hicpro_output': ['hicpro'], 'resfrags': ['/gpfs/home/zhangyue/biosoft/HiCPro/HiC-Pro_3.0.0/annotation/AluI_Mmul_10.bed']}
Mon Aug 09 11:39:47 CST 2021: Determined that the following samples are good to go: 
['RM01-V1-2', 'RM01-V1-1']
Mon Aug 09 11:39:47 CST 2021: User defined peaks specification: EACH,ALL
Mon Aug 09 11:39:47 CST 2021: Calling peaks from all reads for each sample
Mon Aug 09 11:42:37 CST 2021: macs2 command: macs2 callpeak -t V1_loop3/RM01-V1-2.all.Pairs.bed.tmp  --keep-dup all -q 0.01 --extsize 147 --nomodel -g 2.8e+9 -B -f BED --verbose 0 -n V1_loop3/RM01-V1-2_temporary
Mon Aug 09 12:07:36 CST 2021: macs2 command: macs2 callpeak -t V1_loop3/RM01-V1-1.all.Pairs.bed.tmp  --keep-dup all -q 0.01 --extsize 147 --nomodel -g 2.8e+9 -B -f BED --verbose 0 -n V1_loop3/RM01-V1-1_temporary
Mon Aug 09 12:31:38 CST 2021: Performing restriction fragment-aware padding
Mon Aug  9 12:32:32 CST 2021: Processing RM01-V1-2
Mon Aug  9 12:32:32 CST 2021: Total_PETs=112147382
Mon Aug  9 12:32:38 CST 2021: Mapped_unique_quality_pairs=77830832
Mon Aug  9 12:32:41 CST 2021: Mapped_unique_quality_valid_pairs=35044129
Mon Aug  9 12:32:41 CST 2021: Intersecting PETs with anchors
Mon Aug  9 12:32:41 CST 2021: Finished the anchor merging.
Mon Aug  9 12:32:41 CST 2021: Something went wrong in determining peaks for anchor inference; rerun with the  flag to debug.

Then I check the log file for details

/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/hichipper/hichipperHelp.py:98: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  m = yaml.load(f)
Error in `[[<-`(`*tmp*`, name, value = c(279L, 77L, 13L, 105L, 0L, 51L,  : 
  68217449 elements in value to replace 68217452 elements
Calls: computeRatioEtc -> $<- -> $<- -> [[<- -> [[<-
Execution halted
INFO  @ Mon, 09 Aug 2021 12:04:29: Read and build treatment bedGraph... 
Traceback (most recent call last):
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 617, in <module>
    main()
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 73, in main
    run( args )
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/MACS2/bdgcmp_cmd.py", line 40, in run
    tbtrack = tbio.build_bdgtrack()
  File "MACS2/IO/BedGraphIO.pyx", line 97, in MACS2.IO.BedGraphIO.bedGraphIO.build_bdgtrack (MACS2/IO/BedGraphIO.c:1079)
IOError: [Errno 2] No such file or directory: 'V1_loop3/adjustedTreatment.bdg.tmp'
INFO  @ Mon, 09 Aug 2021 12:04:29: Read and build bedGraph... 
Traceback (most recent call last):
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 617, in <module>
    main()
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 65, in main
    run( args )
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/MACS2/bdgpeakcall_cmd.py", line 51, in run
    btrack = bio.build_bdgtrack(baseline_value=0)
  File "MACS2/IO/BedGraphIO.pyx", line 97, in MACS2.IO.BedGraphIO.bedGraphIO.build_bdgtrack (MACS2/IO/BedGraphIO.c:1079)
IOError: [Errno 2] No such file or directory: 'V1_loop3/hichipper_qvalue.bdg.tmp'
mv: cannot stat ‘V1_loop3/hichipper_peaks.bed’: No such file or directory
Error in `[[<-`(`*tmp*`, name, value = c(357L, 46L, 43L, 2L, 63L, 31L,  : 
  72244713 elements in value to replace 72244714 elements
Calls: computeRatioEtc -> $<- -> $<- -> [[<- -> [[<-
Execution halted
INFO  @ Mon, 09 Aug 2021 12:31:38: Read and build treatment bedGraph... 
Traceback (most recent call last):
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 617, in <module>
    main()
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 73, in main
    run( args )
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/MACS2/bdgcmp_cmd.py", line 40, in run
    tbtrack = tbio.build_bdgtrack()
  File "MACS2/IO/BedGraphIO.pyx", line 97, in MACS2.IO.BedGraphIO.bedGraphIO.build_bdgtrack (MACS2/IO/BedGraphIO.c:1079)
IOError: [Errno 2] No such file or directory: 'V1_loop3/adjustedTreatment.bdg.tmp'
INFO  @ Mon, 09 Aug 2021 12:31:38: Read and build bedGraph... 
Traceback (most recent call last):
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 617, in <module>
    main()
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/bin/macs2", line 65, in main
    run( args )
  File "/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/MACS2/bdgpeakcall_cmd.py", line 51, in run
    btrack = bio.build_bdgtrack(baseline_value=0)
  File "MACS2/IO/BedGraphIO.pyx", line 97, in MACS2.IO.BedGraphIO.bedGraphIO.build_bdgtrack (MACS2/IO/BedGraphIO.c:1079)
IOError: [Errno 2] No such file or directory: 'V1_loop3/hichipper_qvalue.bdg.tmp'
mv: cannot stat ‘V1_loop3/hichipper_peaks.bed’: No such file or directory
awk: fatal: cannot open file `V1_loop3/RM01-V1-2_temporary_hichipperPeaks.bed' for reading (No such file or directory)
awk: fatal: cannot open file `V1_loop3/RM01-V1-1_temporary_hichipperPeaks.bed' for reading (No such file or directory)
Warning message:
In fread(peaksfile, header = FALSE) :
  File 'V1_loop3/RM01-V1-1_temporary_hichipperPeaks.bed_pad.bed.tmp' has size 0. Returning a NULL data.table.
Error in .find_seqnames_col(df_colnames0, seqnames.field0, prefix) : 
  cannnot find seqnames column
Calls: makeGRangesFromDataFrame -> .find_GRanges_cols -> .find_seqnames_col
Execution halted
Warning message:
In fread(peaksfile, header = FALSE) :
  File 'V1_loop3/RM01-V1-2_temporary_hichipperPeaks.bed_pad.bed.tmp' has size 0. Returning a NULL data.table.
Error in .find_seqnames_col(df_colnames0, seqnames.field0, prefix) : 
  cannnot find seqnames column
Calls: makeGRangesFromDataFrame -> .find_GRanges_cols -> .find_seqnames_col
Execution halted
Error: The requested file (V1_loop3/RM01-V1-2_temporary_hichipperPeaks.bed_pad.bed.tmprf.tmp) could not be opened. Error message: (No such file or directory). Exiting!
/gpfs/home/zhangyue/anaconda3/envs/hichipper/lib/python2.7/site-packages/hichipper/interactionsCall.sh: line 43: --keep-temp-files: command not found
ERROR: something failed at the individual sample level; check the .log file for more info

By the way Here is my hichipper command

hichipper --out V1_loop3 --macs2-genome 2.8e+9  example_COMBINED_ALL.yaml

It is strange that when peaks mode COMBINED,SELF seems work fine, except Loop_PETs=1 ..... And the tests data works well. I got no idea what's going on. Anyone can help? Many thanks.

Gemma-Zhang-326 commented 3 years ago

And if I set the peaks mode - COMBINED,SELF The results seems too strange

Mon Aug 09 09:24:55 CST 2021: Starting hichipper pipeline v0.7.7
Mon Aug 09 09:24:55 CST 2021: Executed from: /gpfs/home/zhangyue/V1_hic
Mon Aug 09 09:24:55 CST 2021: Output folder: /gpfs/home/zhangyue/V1_hic/V1_loop
Mon Aug 09 09:24:55 CST 2021: Parsed manifest as follows: 
{'peaks': ['COMBINED,SELF'], 'hicpro_output': ['hicpro'], 'resfrags': ['/gpfs/home/zhangyue/biosoft/HiCPro/HiC-Pro_3.0.0/annotation/AluI_Mmul_10.bed']}
Mon Aug 09 09:24:55 CST 2021: Determined that the following samples are good to go: 
['RM01-V1-2', 'RM01-V1-1']
Mon Aug 09 09:24:55 CST 2021: User defined peaks specification: COMBINED,SELF
Mon Aug 09 09:24:55 CST 2021: Calling one set of peaks from HiChIP self-ligation reads across all samples.
Mon Aug 09 09:24:55 CST 2021: macs2 command: macs2 callpeak -t V1_loop/allDEandSCpairs.bed.tmp --keep-dup all -q 0.01 --extsize 147 --nomodel -g hs -B -f BED --verbose 0 -n V1_loop/allSamples_temporary
Mon Aug 09 09:26:16 CST 2021: Performing restriction fragment-aware padding
Mon Aug  9 09:26:43 CST 2021: Processing RM01-V1-2
Mon Aug  9 09:26:43 CST 2021: Total_PETs=112147382
Mon Aug  9 09:26:49 CST 2021: Mapped_unique_quality_pairs=77830832
Mon Aug  9 09:26:51 CST 2021: Mapped_unique_quality_valid_pairs=35044129
Mon Aug  9 09:26:51 CST 2021: Intersecting PETs with anchors
Mon Aug  9 09:26:52 CST 2021: Finished the anchor merging.
Mon Aug  9 09:26:52 CST 2021: Finished the anchor processing.
Mon Aug  9 09:42:12 CST 2021: Intrachromosomal_valid_small=2025473
Mon Aug  9 09:42:44 CST 2021: Intrachromosomal_valid_med=7076666
Mon Aug  9 09:43:07 CST 2021: Intrachromosomal_valid_large=6888506
Mon Aug  9 09:43:07 CST 2021: Total number of anchors used: 450
Mon Aug  9 09:43:07 CST 2021: Total number of reads in anchors: 63738
Mon Aug  9 09:44:16 CST 2021: Mapped_unique_intra_quality_anchor=147
Mon Aug  9 09:44:16 CST 2021: Mapped_unique_intra_quality_anchor_small=145
Mon Aug  9 09:44:16 CST 2021: Mapped_unique_intra_quality_anchor_med=1
Mon Aug  9 09:44:16 CST 2021: Mapped_unique_intra_quality_anchor_large=1
Mon Aug  9 09:44:16 CST 2021: Loop_PETs=1
Mon Aug  9 09:44:16 CST 2021: Processing RM01-V1-1
Mon Aug  9 09:44:16 CST 2021: Total_PETs=132494051
Mon Aug  9 09:44:23 CST 2021: Mapped_unique_quality_pairs=87635934
Mon Aug  9 09:44:25 CST 2021: Mapped_unique_quality_valid_pairs=38509955
Mon Aug  9 09:44:25 CST 2021: Intersecting PETs with anchors
Mon Aug  9 09:44:25 CST 2021: Finished the anchor merging.
Mon Aug  9 09:44:25 CST 2021: Finished the anchor processing.
Mon Aug  9 10:02:37 CST 2021: Intrachromosomal_valid_small=2523061
Mon Aug  9 10:03:14 CST 2021: Intrachromosomal_valid_med=8317071
Mon Aug  9 10:03:40 CST 2021: Intrachromosomal_valid_large=8037452
Mon Aug  9 10:03:40 CST 2021: Total number of anchors used: 450
Mon Aug  9 10:03:40 CST 2021: Total number of reads in anchors: 68291
Mon Aug  9 10:04:56 CST 2021: Mapped_unique_intra_quality_anchor=196
Mon Aug  9 10:04:56 CST 2021: Mapped_unique_intra_quality_anchor_small=194
Mon Aug  9 10:04:56 CST 2021: Mapped_unique_intra_quality_anchor_med=1
Mon Aug  9 10:04:56 CST 2021: Mapped_unique_intra_quality_anchor_large=1
Mon Aug  9 10:04:56 CST 2021: Loop_PETs=1
Mon Aug 09 10:04:56 CST 2021: Creating QC report
Mon Aug 09 10:05:03 CST 2021: Creating .rds and .mango files
Mon Aug 09 10:05:14 CST 2021: Deleting temporary files
Mon Aug 09 10:05:14 CST 2021: Done
Gemma-Zhang-326 commented 3 years ago

I paste the tmp file above. Looks like there is something wrong during converting macs2 narrowpeak file to hichipperpeaks

-rw-r--r-- 1   4223284628 Aug  9 23:14 RM01-V1-1.all.Pairs.bed.tmp
-rw-r--r-- 1  10544827638 Aug  9 22:25 RM01-V1-1.all.Pairs.tmp
-rw-r--r-- 1   2372840919 Aug 10 03:42 RM01-V1-1_temporary_control_lambda.bdg
-rw-r--r-- 1   0 Aug 10 05:39 RM01-V1-1_temporary_hichipperPeaks.bed_pad.bed.tmp
-rw-r--r-- 1    76800480 Aug 10 03:44 RM01-V1-1_temporary_peaks.narrowPeak
-rw-r--r-- 1    84741639 Aug 10 03:43 RM01-V1-1_temporary_peaks.xls
-rw-r--r-- 1     56950326 Aug 10 03:45 RM01-V1-1_temporary_summits.bed
-rw-r--r-- 1  4437495159 Aug 10 03:42 RM01-V1-1_temporary_treat_pileup.bdg
-rw-r--r-- 1   3737682486 Aug  9 16:13 RM01-V1-2.all.Pairs.bed.tmp
-rw-r--r-- 1   9326858622 Aug  9 15:30 RM01-V1-2.all.Pairs.tmp
-rw-r--r-- 1   2226846957 Aug  9 20:12 RM01-V1-2_temporary_control_lambda.bdg
-rw-r--r-- 1      0 Aug 10 05:39 RM01-V1-2_temporary_hichipperPeaks.bed_pad.bed.tmp
-rw-r--r-- 1          0 Aug 10 05:47 RM01-V1-2_temporary_peaks.merged.bed.tmp
-rw-r--r-- 1     66143251 Aug  9 20:15 RM01-V1-2_temporary_peaks.narrowPeak
-rw-r--r-- 1     72956604 Aug  9 20:14 RM01-V1-2_temporary_peaks.xls
-rw-r--r-- 1    49012109 Aug  9 20:15 RM01-V1-2_temporary_summits.bed
-rw-r--r-- 1   4037265781 Aug  9 20:12 RM01-V1-2_temporary_treat_pileup.bdg
-rw-r--r-- 1    1671 Aug 10 05:47 V1_loop4.hichipper.log
gavinjzg commented 2 years ago

I also encounter this problem now, can you tell me how to solve it?

Gemma-Zhang-326 commented 2 years ago

I also encounter this problem now, can you tell me how to solve it? Hi Try to skip the background correction procedure by adding the parameter --skip-background-correction Hope U succeed!