aryeelab / hichipper

A preprocessing and QC pipeline for HiChIP data
MIT License

no mango file #72

Open AnaAzambuja opened 5 years ago

AnaAzambuja commented 5 years ago

Hi, I've been using hichipper on my HiChIP data set, but although the log file indicates that the mango file was generated, I cannot find it in the output folder. Any ideas? P.S. For the same dataset, but with a narrowPeak file specified in the .yaml, I don't have this problem. Thanks, Ana

Tue Jun 11 11:20:18 EDT 2019: Starting hichipper pipeline v0.7.5
Tue Jun 11 11:20:18 EDT 2019: Executed from: /home/aa733/workdir/HiChip/NF
Tue Jun 11 11:20:18 EDT 2019: Output folder: /home/aa733/workdir/HiChip/NF/hichipper_NF_CRnomodel200_output
Tue Jun 11 11:20:18 EDT 2019: Parsed manifest as follows: {'peaks': ['H3K27Ac_12_NoIgG99NoModel200_peaks.narrowPeak'], 'hicpro_output': ['hicpro_output_NF'], 'resfrags': ['MboI_resfrag_gg5filt.bed']}
Tue Jun 11 11:20:18 EDT 2019: Determined that the following samples are good to go: ['NF_H3K27Ac']
Tue Jun 11 11:20:18 EDT 2019: User defined peaks specification: H3K27Ac_12_NoIgG99NoModel200_peaks.narrowPeak
Tue Jun 11 11:20:18 EDT 2019: Using user-defined peaks file H3K27Ac_12_NoIgG99NoModel200_peaks.narrowPeak for analysis.
Tue Jun 11 11:20:18 EDT 2019: Performing restriction fragment-aware padding
Tue Jun 11 11:20:25 EDT 2019: Processing NF_H3K27Ac
Tue Jun 11 11:20:25 EDT 2019: Total_PETs=142466834
Tue Jun 11 11:20:32 EDT 2019: Mapped_unique_quality_pairs=166214407
Tue Jun 11 11:20:35 EDT 2019: Mapped_unique_quality_valid_pairs=75193701
Tue Jun 11 11:20:35 EDT 2019: Intersecting PETs with anchors
Tue Jun 11 11:20:35 EDT 2019: Finished the anchor merging.
Tue Jun 11 11:31:06 EDT 2019: Intrachromosomal_valid_small=7602291
Tue Jun 11 11:31:44 EDT 2019: Intrachromosomal_valid_med=48564075
Tue Jun 11 11:32:08 EDT 2019: Intrachromosomal_valid_large=8962160
Tue Jun 11 11:32:08 EDT 2019: Total number of anchors used: 52595
Tue Jun 11 11:32:08 EDT 2019: Total number of reads in anchors: 61168410
Tue Jun 11 11:35:29 EDT 2019: Mapped_unique_intra_quality_anchor=2564744
Tue Jun 11 11:35:29 EDT 2019: Mapped_unique_intra_quality_anchor_small=517533
Tue Jun 11 11:35:29 EDT 2019: Mapped_unique_intra_quality_anchor_med=1760415
Tue Jun 11 11:35:29 EDT 2019: Mapped_unique_intra_quality_anchor_large=286807
Tue Jun 11 11:35:30 EDT 2019: Creating UCSC Compatible files; make sure tabix and bgzip are available in the environment or this will not work.
Tue Jun 11 11:35:35 EDT 2019: Loop_PETs=1760415
Tue Jun 11 11:35:35 EDT 2019: Creating QC report
Tue Jun 11 11:35:39 EDT 2019: Creating .rds and .mango files
Tue Jun 11 11:36:59 EDT 2019: Deleting temporary files
Tue Jun 11 11:36:59 EDT 2019: Done

caleblareau commented 5 years ago

I don’t see anything from the QC report that would cause this issue. Can you show me what’s in the output folder?


AnaAzambuja commented 5 years ago

NF_H3K27Ac.filt.intra.loop_counts.bedpe
hichipper_NF_output.hichipper.log
hichipper_NF_output.hichipper.qcreport.html
NF_H3K27Ac.stat
NF_H3K27Ac.interaction.txt.gz.tbi
NF_H3K27Ac.anchors.bed
NF_H3K27Ac.interaction.txt.gz
NF_H3K27Ac.intra.loop_counts.bedpe
NF_H3K27Ac.inter.loop_counts.bedpe
hichipper_NF_output.hichipper.qcreport_files

Thanks

AnaAzambuja commented 5 years ago

Hi Caleb, sorry to bother you, but do you have any idea why this is happening? I got the mango file before with different datasets...

Thanks, Ana

caleblareau commented 5 years ago

Yes, sorry; can you send the output of ls -lrth for the output folder? I just want to verify that all of the files are non-empty.
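
The same check can be scripted. Below is a minimal Python sketch, not part of hichipper: the function name audit_output_folder is hypothetical, and the suffixes .mango and .rds are taken only from the "Creating .rds and .mango files" line in the log. It lists files that are empty and expected suffixes for which no file exists.

```python
from pathlib import Path


def audit_output_folder(folder, expected_suffixes=(".mango", ".rds")):
    """Report empty files and expected suffixes that have no matching file.

    The default suffixes are an assumption based on the hichipper log
    message "Creating .rds and .mango files".
    """
    # Only regular files are inspected; subdirectories are skipped.
    files = [p for p in Path(folder).iterdir() if p.is_file()]

    # Files that exist but contain zero bytes.
    empty = sorted(p.name for p in files if p.stat().st_size == 0)

    # Suffixes for which no file in the folder was found at all.
    missing = [
        suffix
        for suffix in expected_suffixes
        if not any(p.name.endswith(suffix) for p in files)
    ]
    return {"empty": empty, "missing_suffixes": missing}
```

Calling audit_output_folder("hichipper_NF_output") on the folder from this thread would be expected to report both suffixes as missing, since neither appears in the file listing.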


AnaAzambuja commented 5 years ago

total 1.3G
-rw-rw-r-- 1 aa733 aa733 271M Jun 13 18:22 NF_H3K27Ac.inter.loop_counts.bedpe
-rw-rw-r-- 1 aa733 aa733 582M Jun 13 18:23 NF_H3K27Ac.intra.loop_counts.bedpe
-rw-rw-r-- 1 aa733 aa733 345M Jun 13 18:23 NF_H3K27Ac.filt.intra.loop_counts.bedpe
-rw-rw-r-- 1 aa733 aa733 129M Jun 13 18:23 NF_H3K27Ac.interaction.txt.gz
-rw-rw-r-- 1 aa733 aa733 205K Jun 13 18:23 NF_H3K27Ac.interaction.txt.gz.tbi
-rw-rw-r-- 1 aa733 aa733 2.7M Jun 13 18:23 NF_H3K27Ac.anchors.bed
-rw-rw-r-- 1 aa733 aa733 543 Jun 13 18:24 NF_H3K27Ac.stat
-rw-rw-r-- 1 aa733 aa733 369K Jun 13 18:24 hichipper_NF_output.hichipper.qcreport.html
drwxrwxr-x 2 aa733 aa733 4.0K Jun 13 18:24 hichipper_NF_output.hichipper.qcreport_files
-rw-rw-r-- 1 aa733 aa733 2.4K Jun 13 18:31 hichipper_NF_output.hichipper.log

Thanks, Ana

AnaAzambuja commented 5 years ago

Hi, I just saw that I initially sent you the wrong log file. Here is the correct one, using the "peaks specification" of "COMBINED,ALL" (I also tried EACH, with the same error). Again, it says that the mango file is created, but it never appears in my folder.

Thanks, Ana

Wed Jun 12 08:25:39 EDT 2019: Starting hichipper pipeline v0.7.5
Wed Jun 12 08:25:39 EDT 2019: Executed from: /home/aa733/workdir/HiChip/NF
Wed Jun 12 08:25:39 EDT 2019: Output folder: /home/aa733/workdir/HiChip/NF/hichipper_NF_output
Wed Jun 12 08:25:39 EDT 2019: Parsed manifest as follows: {'peaks': ['COMBINED,ALL'], 'hicpro_output': ['hicpro_output_NF'], 'resfrags': ['MboI_resfrag_gg5filt.bed']}
Wed Jun 12 08:25:39 EDT 2019: Determined that the following samples are good to go: ['NF_H3K27Ac']
Wed Jun 12 08:25:39 EDT 2019: User defined peaks specification: COMBINED,ALL
Wed Jun 12 08:25:39 EDT 2019: Calling one set of peaks from all HiChIP reads across all samples.
Wed Jun 12 08:29:00 EDT 2019: macs2 command: macs2 callpeak -t hichipper_NF_output/allpairs.bed.tmp --keep-dup all -q 0.01 --extsize 147 --nomodel -g hs -B -f BED --verbose 0 -n hichipper_NF_output/allSamples_temporary
Wed Jun 12 09:25:05 EDT 2019: Performing restriction fragment-aware padding
Wed Jun 12 09:25:14 EDT 2019: Processing NF_H3K27Ac
Wed Jun 12 09:25:14 EDT 2019: Total_PETs=142466834
Wed Jun 12 09:25:22 EDT 2019: Mapped_unique_quality_pairs=166214407
Wed Jun 12 09:25:26 EDT 2019: Mapped_unique_quality_valid_pairs=75193701
Wed Jun 12 09:25:26 EDT 2019: Intersecting PETs with anchors
Wed Jun 12 09:25:26 EDT 2019: Finished the anchor merging.
Wed Jun 12 09:37:43 EDT 2019: Intrachromosomal_valid_small=7602291
Wed Jun 12 09:38:21 EDT 2019: Intrachromosomal_valid_med=48564075
Wed Jun 12 09:38:45 EDT 2019: Intrachromosomal_valid_large=8962160
Wed Jun 12 09:38:45 EDT 2019: Total number of anchors used: 120953
Wed Jun 12 09:38:45 EDT 2019: Total number of reads in anchors: 248479915
Wed Jun 12 09:47:01 EDT 2019: Mapped_unique_intra_quality_anchor=36678808
Wed Jun 12 09:47:01 EDT 2019: Mapped_unique_intra_quality_anchor_small=4788051
Wed Jun 12 09:47:01 EDT 2019: Mapped_unique_intra_quality_anchor_med=26954678
Wed Jun 12 09:47:01 EDT 2019: Mapped_unique_intra_quality_anchor_large=4936186
Wed Jun 12 09:47:16 EDT 2019: Creating UCSC Compatible files; make sure tabix and bgzip are available in the environment or this will not work.
Wed Jun 12 09:48:06 EDT 2019: Loop_PETs=26954678
Wed Jun 12 09:48:06 EDT 2019: Creating QC report
Wed Jun 12 09:48:35 EDT 2019: Creating .rds and .mango files
Wed Jun 12 09:54:13 EDT 2019: Deleting temporary files
Wed Jun 12 09:54:13 EDT 2019: Done

caleblareau commented 5 years ago

Hi,

I have looked over this and I still don’t have a clear explanation for why this is failing. One option would be to read the bedpe into diffloop and run the mangoCorrection function manually. That would likely reproduce whatever error is occurring internally with hichipper.
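
Before loading the bedpe into diffloop, it may help to confirm that the file is well-formed. The following is a minimal Python sketch of such a pre-check and does not run mangoCorrection itself. It assumes the loop file is plain text with one whitespace-delimited record per line; the function name summarize_bedpe is hypothetical and not part of hichipper or diffloop.

```python
from collections import Counter


def summarize_bedpe(path):
    """Count records and tally how many whitespace-separated fields each has.

    Assumes one loop record per line; blank lines are ignored.
    Returns (number_of_records, {field_count: number_of_lines}).
    """
    n_records = 0
    field_counts = Counter()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            n_records += 1
            field_counts[len(fields)] += 1
    return n_records, dict(field_counts)
```

If the returned dictionary has more than one key, some lines have a different number of fields than the rest, which could be worth inspecting before the file is passed to diffloop.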
