ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline
MIT License

Fixed peak size #294

Open kbattenb opened 1 year ago

kbattenb commented 1 year ago

Describe the bug

Peak size appears to be fixed, with a few exceptions. Of the 15598 idr.optimal peaks, all but 13 are exactly 740 bp wide. I would like to know whether this is expected or whether I have made a mistake in my settings.
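For anyone who wants to reproduce the count, here is a minimal sketch of how the width distribution could be tabulated. It assumes the peak file is a tab-separated BED-like file (for example narrowPeak) with start and end in columns 2 and 3; the function name `peak_width_counts` and any file path passed to it are placeholders, not something produced by the pipeline.

```python
import gzip
from collections import Counter


def peak_width_counts(path):
    """Count peaks by width (end - start) in a BED-like peak file.

    Assumes tab-separated columns with chrom, start, end first,
    as in narrowPeak. Plain-text and .gz files are both accepted.
    """
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and optional header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            start, end = int(fields[1]), int(fields[2])
            counts[end - start] += 1
    return counts
```

Calling `peak_width_counts(...)` on the idr.optimal peak file and then `.most_common(5)` on the result lists the most frequent widths together with how many peaks have each width.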

OS/Platform

Caper configuration file

backend=local

# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime

# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper stores all localized data files
# in this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=

cromwell=/home/plantsymbiosis/.caper/cromwell_jar/cromwell-52.jar
womtool=/home/plantsymbiosis/.caper/womtool_jar/womtool-52.jar

Input JSON file

{
    "chip.title" : "NIN",
    "chip.description" : "Run on station.",

    "chip.pipeline_type" : "tf",
    "chip.aligner" : "bowtie2",
    "chip.align_only" : false,
    "chip.true_rep_only" : false,

    "chip.genome_tsv" : "/home/plantsymbiosis/Desktop/TOOLS/GenomeReferences/LotusJaponicusGifuv12_NucMitoChloro/atac_gifu12/LotusJaponicusGifuv12.tsv",

    "chip.paired_end" : false,
    "chip.ctl_paired_end" : false,

    "chip.fastqs_rep1_R1" : ["03_No_Mito_Chloro/chip.fastqs_sample.rep1_R1_NoMtCh.fq.gz"],
    "chip.fastqs_rep1_R2" : [],
    "chip.fastqs_rep2_R1" : [],
    "chip.fastqs_rep2_R2" : [],
    "chip.fastqs_rep3_R1" : [],
    "chip.fastqs_rep3_R2" : [],
    "chip.fastqs_rep4_R1" : [],
    "chip.fastqs_rep4_R2" : [],
    "chip.fastqs_rep5_R1" : [],
    "chip.fastqs_rep5_R2" : [],
    "chip.fastqs_rep6_R1" : [],
    "chip.fastqs_rep6_R2" : [],
    "chip.fastqs_rep7_R1" : [],
    "chip.fastqs_rep7_R2" : [],
    "chip.fastqs_rep8_R1" : [],
    "chip.fastqs_rep8_R2" : [],
    "chip.fastqs_rep9_R1" : [],
    "chip.fastqs_rep9_R2" : [],

    "chip.ctl_fastqs_rep1_R1" : ["03_No_Mito_Chloro/chip.fastqs_background.rep1_R1_NoMtCh.fq.gz"],
    "chip.ctl_fastqs_rep1_R2" : [],
    "chip.ctl_fastqs_rep2_R1" : [],
    "chip.ctl_fastqs_rep2_R2" : [],
    "chip.ctl_fastqs_rep3_R1" : [],
    "chip.ctl_fastqs_rep3_R2" : [],
    "chip.ctl_fastqs_rep4_R1" : [],
    "chip.ctl_fastqs_rep4_R2" : [],
    "chip.ctl_fastqs_rep5_R1" : [],
    "chip.ctl_fastqs_rep5_R2" : [],
    "chip.ctl_fastqs_rep6_R1" : [],
    "chip.ctl_fastqs_rep6_R2" : [],
    "chip.ctl_fastqs_rep7_R1" : [],
    "chip.ctl_fastqs_rep7_R2" : [],
    "chip.ctl_fastqs_rep8_R1" : [],
    "chip.ctl_fastqs_rep8_R2" : [],
    "chip.ctl_fastqs_rep9_R1" : [],
    "chip.ctl_fastqs_rep9_R2" : [],

    "chip.always_use_pooled_ctl" : true
}

Troubleshooting result

The pipeline runs without errors.

kbattenb commented 1 year ago

Let me follow up on my last comment with an example. You will see 5 peaks (in overlapping sets of 3 and 2), and each peak is exactly 740 bp wide. Is it expected for the peaks to (1) overlap and (2) be of exactly the same length?

[Image: igv_snapshot]
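The overlap seen in the snapshot could also be checked across the whole peak set. The sketch below is only an illustration: it assumes peaks are given as `(chrom, start, end)` tuples with half-open BED coordinates, and the helper name `overlapping_neighbors` is made up for this example. It reports only pairs that are adjacent after sorting by start position.

```python
from collections import defaultdict


def overlapping_neighbors(peaks):
    """Return neighboring peak pairs that overlap on the same chromosome.

    `peaks` is an iterable of (chrom, start, end) tuples using half-open
    BED coordinates. Only pairs adjacent after sorting are compared.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    pairs = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        for (s1, e1), (s2, e2) in zip(intervals, intervals[1:]):
            # With half-open intervals, touching ends (s2 == e1) do not overlap.
            if s2 < e1:
                pairs.append((chrom, (s1, e1), (s2, e2)))
    return pairs
```

An empty result would mean no neighboring peaks overlap; a non-empty one gives the coordinates to inspect in a genome browser.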