ppnin closed this issue 5 years ago
Pavle,
Thanks for the question.
Assume that we have 2 replicates as a treatment input (without control) and Genrich detects a peak on similar intervals in both BAM files.
Let me stop you there. With multiple input files, Genrich does not call peaks from the files separately. Instead, it computes p-values for each and then combines them, as described here.
A log file (-f) should clarify why peak_0 and the other peaks were called when analyzing the replicates together.
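As an illustration only (this is not Genrich's code), here is a minimal Python sketch of how a combined p-value can cross a significance threshold where neither replicate does on its own. It assumes the combining step is Fisher's method; the function name `fisher_combine`, the example p-values, and the 0.01 threshold are made up for the example.

```python
import math

def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method (hypothetical helper)."""
    k = len(pvalues)
    # Test statistic: -2 * sum(ln p), chi-square distributed with 2k degrees of freedom
    x = -2.0 * sum(math.log(p) for p in pvalues)
    # For even degrees of freedom the chi-square survival function has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Assumed per-replicate p-values at one genomic position (not real data)
p_rep1, p_rep2 = 0.03, 0.03
threshold = 0.01  # illustrative significance cutoff

combined = fisher_combine([p_rep1, p_rep2])
print(p_rep1 < threshold, p_rep2 < threshold, combined < threshold)
```

In this toy case neither replicate passes the cutoff alone, but the combined p-value (about 0.0072) does, which is one way a peak such as peak_0 could appear only in the joint analysis.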
John Gaspar
John,
Yes, thanks for pointing that out - checking the documentation again has cleared up my confusion.
Pavle
Hi,
Thank you very much for developing Genrich, I use it with ATAC-seq data and it's been very helpful!
I have a question regarding peak calling with multiple input BAM files (biological replicates), when Genrich calculates combined p-values. Assume that we have 2 replicates as a treatment input (without control) and Genrich detects a peak on similar intervals in both BAM files. If the combined p-value indicates statistical significance - which peak interval is reported in the resulting narrowPeak file?
For example, this is the narrowPeak file that I get when I use Genrich only with the first BAM, and this is the output when only the second BAM is used:
When I use them together as a treatment input the result is
Looking at the peak coordinates, I see that peak_4 in the third narrowPeak should correspond to peak_1 in the first and second narrowPeak files, while peak_0 is detected on coordinates where we don't have a peak when using separate BAM files.
In my understanding (section “Peak-calling method” in the documentation) Genrich first calculates AUC for statistically significant regions, and then defines peaks as regions whose AUC is above a threshold. How is AUC calculated when using combined p-values?
Thanks!
Pavle