jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License

How are summit p values calculated? #38

Closed by AlexBlais74 4 years ago

AlexBlais74 commented 4 years ago

Hello, I would like a clarification about the p-values reported by Genrich. I understand that the initial peak identification involves base pair-wise p-value calculations (followed by AUC calculation). Then, in the final .narrowpeak file, "summit p-values" are also given. How are these peak-wise p-values calculated? Do they represent the p-value (initially calculated during peak identification) at the base pair that corresponds to the summit of the peak (tallest pileup) in the definitive peak call? Thanks, Alex

jsh58 commented 4 years ago

Alex,

Thanks for the question. The documentation is not very clear, but the answer can be found in the description of the -o <file> output:

Summit position (0-based offset from chromStart): the midpoint of the peak interval with the highest significance (the longest interval in case of ties)

That is, the summit is the midpoint of the interval within the peak that has the highest significance (based on either p- or q-value), and the summit p-value is the value for that interval. This is not necessarily the same as the interval with the greatest pileup.
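As a rough illustration of the rule quoted above, here is a minimal Python sketch. It is not Genrich's actual code: the function name `pick_summit`, the `(start, end, significance)` tuple layout, and the integer rounding of the midpoint are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical layout: (start, end, significance), 0-based half-open coordinates.
Interval = Tuple[int, int, float]

def pick_summit(chrom_start: int, intervals: List[Interval]) -> Tuple[int, float]:
    """Return (summit offset from chrom_start, significance of the summit interval)."""
    if not intervals:
        raise ValueError("peak has no intervals")
    # Highest significance wins; among ties, the longest interval wins.
    best = max(intervals, key=lambda iv: (iv[2], iv[1] - iv[0]))
    start, end, signif = best
    # Integer midpoint of the chosen interval (the rounding rule is an assumption).
    midpoint = (start + end) // 2
    return midpoint - chrom_start, signif

# Made-up peak spanning 100..200, split into intervals of constant significance.
peak = [(100, 120, 2.5), (120, 150, 4.0), (150, 155, 4.0), (155, 200, 3.1)]
offset, signif = pick_summit(100, peak)
print(offset, signif)
```

In this made-up peak, two intervals tie at a significance of 4.0, so the longer one (120 to 150) is chosen and the summit falls at its midpoint, regardless of which interval has the tallest pileup.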

It may help to produce the -f <file> output and see what values are used.

John Gaspar