jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License

interpretation of q value #111

Open pengxin2019 opened 4 months ago

pengxin2019 commented 4 months ago

Hi John, hello again. Which q-value threshold do you recommend on the command line? I ran my command three ways: without setting a q-value, with the q-value set to 1, and with the q-value set to 0.05. Here is a summary of the peaks detected in the narrowPeak output:

Screenshot 2024-02-18 at 12 20 00 PM

I have two questions about the table above.

1. I am surprised to see more detected peaks with a q-value cutoff of either 1 or 0.05 than with no q-value cutoff at all. Can you help me understand why that happened?

2. What is the right way to interpret the q-value column in narrowPeak? Here is the narrowPeak file produced with a q-value cutoff of 0.05. All the values in the 9th column are greater than 0.05, which confuses me. What is the correct way to relate the values in the 9th column to the q-value cutoff given in the command?

Screenshot 2024-02-18 at 12 20 43 PM

Thanks for your patience,
Penny

malcook commented 4 months ago

I recommend q=0.5

column 9 is not q.

Read the manual!

You will see it is:

9. qValue | Summit -log10(q-value), or -1 if not available (e.g. without -q)

pengxin2019 commented 4 months ago

Hi Malcook again, thanks for pointing out that I missed the log transformation needed to get the actual q-value. Column 9 makes sense to me now. For example, a value of 15 in column 9 means -log10(q-value) = 15, so q = 10^-15 = 1e-15.
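The conversion from column 9 back to a q-value can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `qvalue_from_column9` and the sample narrowPeak line are made up for the example and are not Genrich output.

```python
def qvalue_from_column9(score):
    """Convert a narrowPeak column-9 value, -log10(q), back to q.

    Returns None for negative values, since the manual says -1 means
    "not available" (e.g. when -q was not given).
    """
    if score < 0:
        return None
    return 10.0 ** (-score)


# Hypothetical tab-separated narrowPeak line (values invented for illustration).
line = "chr1\t100\t200\tpeak_0\t1000\t.\t500.0\t20.0\t15.0\t50"
fields = line.split("\t")

# Column 9 is at index 8 because Python lists are zero-based.
q = qvalue_from_column9(float(fields[8]))
print(q)
```

So a larger number in column 9 means a smaller, more significant q-value.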

My ATAC-seq mapped reads are close to, but slightly below, the 25M recommended by ENCODE (https://www.encodeproject.org/atac-seq/): I have 24.2M. So I want to show that my replicates are consistent with each other even though the read count is slightly below 25M. Do you think a q cutoff of 0.5 is stringent enough to convince an audience in this case (I am not sure what your reason for picking 0.5 is)? That cutoff corresponds to -log10(q-value) = -log10(0.5) = 0.30 in the 9th column.

Best,
Penny
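Going the other way, a q-value cutoff can be expressed on the column-9 scale, which also shows why every reported value exceeds the cutoff itself. A minimal sketch, assuming the -log10 encoding quoted from the manual; `column9_threshold` and `passes_cutoff` are hypothetical helper names, not part of Genrich.

```python
import math


def column9_threshold(q_cutoff):
    """Smallest column-9 value that a peak with q <= q_cutoff can have."""
    return -math.log10(q_cutoff)


def passes_cutoff(score, q_cutoff):
    """True if a column-9 value corresponds to q <= q_cutoff.

    Negative values (-1 means "not available") never pass.
    """
    if score < 0:
        return False
    return score >= column9_threshold(q_cutoff)


print(round(column9_threshold(0.5), 2))   # cutoff 0.5 on the column-9 scale
print(round(column9_threshold(0.05), 2))  # cutoff 0.05 on the column-9 scale
```

With a cutoff of 0.05, every peak that passes has a column-9 value of about 1.30 or higher, so seeing only values above 0.05 in that column is expected.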

malcook commented 4 months ago

Oops, I meant to type 0.05 as recommended q-value.

Choice of q-value is intended to control FDR.
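To make the link between q-values and FDR concrete, here is a generic sketch of the Benjamini-Hochberg adjustment, a common way to turn p-values into q-values. It is an assumption that this matches what any particular peak caller does; the code is not taken from Genrich's source, and the p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


# Invented p-values for illustration only.
qs = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(qs)
```

Keeping only the tests with q at or below 0.05 aims for an expected proportion of false positives of at most 5% among the calls that are kept.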

What it will take to convince your audience of anything is beyond my ability ;)

pengxin2019 commented 4 months ago

Thanks for the clarification! I appreciate it!