Unable to generate count table with reference peaks

gbloeb commented 2 years ago

I ran a project with reference peaks, but I am unable to generate a count table for the reference peaks. When the project is run:

1) Peaks are still called for each sample individually.
2) Each sample/peak_calling_mm10 directory contains both GLIS3_Ctrl_1_S2_ref_peaks_coverage.bed (corresponding to the reference peaks) and GLIS3_Ctrl_1_S2_peaks_coverage.bed.gz (corresponding to the called peaks).
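For anyone trying to reproduce this, the following is a minimal sketch of how one might confirm which coverage files each sample directory actually holds. It assumes the layout described above (`<results>/<sample>/peak_calling_<genome>/<sample>_ref_peaks_coverage.bed` next to `<sample>_peaks_coverage.bed.gz`). The function name `inventory_coverage_files` is hypothetical and is not part of PEPATAC.

```python
from pathlib import Path


def inventory_coverage_files(results_dir, genome="mm10"):
    """Report, per sample, which coverage files sit in its peak_calling directory.

    Assumption: each sample has its own folder under results_dir, and the
    coverage files follow the naming seen in this issue.
    """
    report = {}
    for sample_dir in sorted(Path(results_dir).iterdir()):
        peak_dir = sample_dir / f"peak_calling_{genome}"
        if not peak_dir.is_dir():
            continue  # skip anything that is not a processed sample folder
        name = sample_dir.name
        report[name] = {
            # coverage over the user-supplied reference peak set
            "ref_peaks": sorted(
                p.name for p in peak_dir.glob(f"{name}_ref_peaks_coverage.bed*")
            ),
            # coverage over the peaks called for this sample alone
            "called_peaks": sorted(
                p.name for p in peak_dir.glob(f"{name}_peaks_coverage.bed*")
            ),
        }
    return report
```

Calling `inventory_coverage_files("pepatac_results/results_pipeline")` returns a dictionary keyed by sample name. If every sample lists both kinds of file, the setup matches the situation reported here.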

When I run the project processing pipeline, consensus peaks are still generated, and the count table is built from the consensus peaks rather than the reference peaks, with this warning:

Warning message: In PEPATACr::peakCounts(sample_table, summary_dir, argv$results, : Peak coverage files are not derived from a singular reference peak set.
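Until the project-level step handles this, one possible workaround is to assemble a count table directly from the per-sample `*_ref_peaks_coverage.bed` files. The sketch below is not PEPATAC's own method and has not been checked against its output format. It assumes the files are uncompressed and tab-delimited, that the first three columns are chromosome, start, and end, and that the count sits in the last column (`count_col=-1`). Inspect one file and adjust `count_col` before trusting the result. The function name `merge_ref_coverage` is hypothetical.

```python
import csv


def merge_ref_coverage(sample_files, count_col=-1):
    """Combine per-sample coverage files into one table keyed by peak.

    sample_files: dict mapping sample name -> path to its coverage file.
    Assumptions (not confirmed by the PEPATAC docs): tab-delimited rows,
    columns 1-3 are chrom/start/end, and count_col holds the count.
    Returns (peak_order, table) where table[peak][sample] is the count.
    """
    table = {}
    peak_order = None
    for sample, path in sample_files.items():
        with open(path, newline="") as handle:
            rows = [row for row in csv.reader(handle, delimiter="\t") if row]
        peaks = [tuple(row[:3]) for row in rows]
        if peak_order is None:
            peak_order = peaks
        elif peaks != peak_order:
            # Every sample must have been counted over the identical peak set.
            raise ValueError(f"{sample} was not counted over the same peak set")
        for peak, row in zip(peaks, rows):
            table.setdefault(peak, {})[sample] = float(row[count_col])
    return peak_order or [], table
```

The explicit peak-set comparison mirrors the condition the warning complains about: if any sample's file covers different intervals, the function raises an error rather than silently producing a mixed table.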

My config:

# This project config file describes your project. See looper docs for details.
name: GLIS3_ATAC_nolambda_qe-7_sh-30_peaks # The name that summary files will be prefaced with

pep_version: 2.0.0
sample_table: annotation_onlyGLIS3.csv  # sheet listing all samples in the project

looper:  # relative paths are relative to this config file
  output_dir: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
  pipeline_interfaces: ~/pepatac/project_pipeline_interface.yaml  # PATH to the directory where looper will find the pipeline repository.

sample_modifiers:
  append:
    pipeline_interfaces: ~/pepatac/sample_pipeline_interface.yaml
  derive:
    attributes: [read1, read2]
    sources:
      R1: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R1_001.fastq.gz"
      R2: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R2_001.fastq.gz"
  imply:
    - if:
        organism: ["mouse"]
      then:
        genome: mm10
        prealignment_names: ["mouse_chrM2x"]
        genome_size: "2.3e9"
        frip_ref_peaks: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/comb_peak_call/GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.narrowPeak

a sample log file: PEPATAC_log.md

project log file: `### Pipeline run code and environment:

Command: /wynton/protected/home/reiter/gloeb/pepatac/pipelines/pepatac_collator.py --config /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml -O /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results -P 1 -M 16G -n GLIS3_ATAC_nolambda_qe-7_sh-30_peaks -r /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline
Compute host: plog1.wynton.ucsf.edu
Working dir: /wynton/group/reiter/gabe/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
Outfolder: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/
Pipeline started at: (03-24 21:30:49) elapsed: 0.0 TIME

Version log:

Python version: 3.9.7
Pypiper dir: /wynton/protected/home/reiter/gloeb/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper
Pypiper version: 0.12.3
Pipeline dir: /wynton/protected/home/reiter/gloeb/pepatac/pipelines
Pipeline version: 0.0.4

Arguments passed to pipeline:

config_file: /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml
cores: 1
cutoff: 2
dirty: False
force_follow: False
logdev: False
mem: 16G
min_olap: 1
min_score: 5
name: GLIS3_ATAC_nolambda_qe-7_sh-30_peaks
new_start: False
normalized: False
output_parent: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
poverlap: False
recover: False
results: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline
silent: False
skip_consensus: False
skip_table: False
testmode: False
verbosity: None

Target to produce: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_libComplexity.pdf,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_*_consensusPeaks.narrowPeak,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_peaks_coverage.tsv

Rscript /wynton/protected/home/reiter/gloeb/pepatac/tools/PEPATAC_summarizer.R /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline 2 5 1 (9429)
Loading config file: /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml
Creating stats summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_stats_summary.tsv
Creating assets summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_assets_summary.tsv
Creating summary plots...
4 of 4 library complexity files available.
INFO: Found real counts for GLIS3_Ctrl_1_S2 - Total (M): 126.745044 Unique (M): 108.93747
INFO: Found real counts for GLIS3_Ctrl_2_S7 - Total (M): 111.712142 Unique (M): 96.955292
INFO: Found real counts for GLIS3_Dox_1_S3 - Total (M): 221.935342 Unique (M): 183.353884
INFO: Found real counts for GLIS3_Dox_2_S8 - Total (M): 116.520356 Unique (M): 105.904826

WARNING: y-max value changed from default 139.24586665 to the max real data 201.6892724
Successfully produced project summary plots.

Calculating mm10 consensus peak set from 4 samples...
Consensus peak set: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_consensusPeaks.narrowPeak

Calculating mm10 peak counts for 4 samples...
Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Warning message: In PEPATACr::peakCounts(sample_table, summary_dir, argv$results, : Peak coverage files are not derived from a singular reference peak set.
Command completed. Elapsed time: 0:00:57. Running peak memory: 0.91GB. PID: 9429; Command: Rscript; Return code: 0; Memory used: 0.91GB

Pipeline completed. Epilogue

Elapsed time (this run): 0:00:57
Total elapsed time (all runs): 0:00:57
Peak memory (this run): 0.9104 GB
Pipeline completed time: 2022-03-24 21:31:46

`

Kange2014 commented 2 years ago

Does anyone have an update on this issue? I am encountering the same problem. Thanks.

ljmills commented 1 year ago

I am also having this issue

zhongzheng1999 commented 9 months ago

The issue seems to persist; I am also running into it. @donaldcampbelljr Could you help resolve it? Thanks!

zhongzheng1999 commented 9 months ago

@ljmills Did you ever find a solution? Thanks!

databio / pepatac