I am developing a Snakemake pipeline to run ATACseq with Genrich (either single or merged replicates). Still working on it, so not everything is available yet.
My problem is that, when I call peaks with single replicates I get less replicates (in total: 17360+12992) than when calling peaks using several replicates at the same time (37517 peaks). Wouldn't have it to be the contrary?
Here's my snakemake rule for Genrich with different replicates:
rule genrich_merge:
input:
t = lambda w: expand("results/02_bam/{condition}.sorted.bam", condition = SAMPLES.loc[SAMPLES["CONDITION"] == w.condition].NAME),
output:
"results/03_genrich/merged/{condition}/{condition}_peaks.narrowPeak",
params:
# path to genrich
genrich = config["params"]["genrich"]["path"],
# pe/se parameters
pe_se = lambda w: config["params"]["genrich"]["se"] if config["params"]["genrich"]["pe_or_se"] == "se" else "",
# pvalue/qvalue threshold
p_or_q = config["params"]["genrich"]["p_or_q"],
pqval = config["params"]["genrich"]["pqval"],
# readme file
readme = "results/03_genrich/merged/{condition}/readme.txt",
threads: 5
log:
"results/00_log/genrich_merge/{condition}_peakcalling.log",
shell:
"""
{params.genrich} -j \
-t '{input.t}' \
-o {output} \
-p 4e-5 \
-y -d 150 2>> {log}
echo 'This peaks have been called using more than 1 replicates ({input}) for each condition, the parameters {params.pe_se} and a {params.p_or_q}value threshold of {params.pqval}.' > {params.readme}
"""
Other questions:
Would you recommend to downsample bamfiles from different replicates (or even different conditions) before calling peaks?
How many peaks would you expect on average for a normal sample, and which p-value/q-value threshold you would use? Is there some kind of "equivalence" between Genrich and Macs2 p-values?
Hello,
I am developing a Snakemake pipeline to run ATACseq with Genrich (either single or merged replicates). Still working on it, so not everything is available yet.
My problem is that, when I call peaks with single replicates I get less replicates (in total: 17360+12992) than when calling peaks using several replicates at the same time (37517 peaks). Wouldn't have it to be the contrary?
Here's my snakemake rule for Genrich with different replicates:
Other questions:
Thank you.