jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License
182 stars 27 forks source link

All qvalue are -1 in narrowPeak #110

Closed pengxin2019 closed 4 months ago

pengxin2019 commented 4 months ago

Hi, Thanks for developing this tool. I am using Genrich to process my ATAC data. I have 9 replicates for treatment group and 8 for control group. Here is my command to run Genrich:

Genrich \
-t $treat_dir/$t1,$treat_dir/$t2,$treat_dir/$t3,$treat_dir/$t4,$treat_dir/$t5,$treat_dir/$treat6,$treat_dir/$treat7,$treat_dir/$treat8,$treat_dir/$treat9 \
-o bowtie_Sscrofa11.sorted.by.name.no.MT.t.treat.replicates.c.notreat.replicates.narrowPeak \
-c $notreat_dir/$notreat1,$notreat_dir/$notreat2,$notreat_dir/$notreat3,$notreat_dir/$notreat4,$notreat_dir/$notreat5,$notreat_dir/$notreat6,$notreat_dir/$notreat7,$notreat_dir/$notreat8 \
-b bed.file -R PCR.duplicates -r -e MT -j -v

I was expecting to avoid IDR calculation by inspecting the q value. however, it looks like all q value are -1 in narrowPeak output file. Can you give me any suggestions?

Screenshot 2024-02-16 at 11 36 38 AM

Thanks Best Penny

malcook commented 4 months ago

Hi Penny,

I'm not the genrich author, but I thought I'd chime in since I use it to analyze atac data.

First, note that the manual reads: "If a -q threshold is not specified, q-values are not calculated (reported as -1). "

In any case, in my opinion you should not use -c, which should be reserved for ChIP-seq input controls.

Instead, You should:

Thoughts?

pengxin2019 commented 4 months ago

Hi malcook, Thanks for your kind prompt comments and pointing out i should not use -c flag which I was not aware of.

It definitely sounds promising to me. I will run Genrich for treatment group and control group using -tflag, respectively. Hopefully I will be able to see q value for each of output. Then I can merge two narrowPeak results.

By the way, do you think its convincing enough to say my replicates are consistent with each other by having q value < 0.05 in treatment/control group? I meant without calculating IDR.

I will let you know if it does not work!

Thanks again Penny

pengxin2019 commented 4 months ago

Hi, I did avoid using -c flag for my ATAC data and called peaks for treatment and control groups, respectively. Here is the code for my treatment group:

Genrich \
-t $treat_dir/$treat1,$treat_dir/$treat2,$treat_dir/$treat3,$treat_dir/$treat4,$treat_dir/$treat5,$treat_dir/$treat6,$treat_dir/$treat7,$treat_dir/$treat8,$treat_dir/$treat9 \
-o bowtie_Sscrofa11.sorted.by.name.t.treat.9.replicates.narrowPeak \
-b treat.9.replicates.bed.file -R treat.9.replicates.PCR.duplicates -r -e MT -j -v

But the qvalues are still all -1. Any comments?

Screenshot 2024-02-16 at 2 39 02 PM

Thanks Penny

jsh58 commented 4 months ago

As @malcook correctly pointed out, from the manual:

If a -q threshold is not specified, q-values are not calculated (reported as -1).

malcook commented 4 months ago

Hi again,

You should expect to get -1 q-values given that the manual reads: "If a -q threshold is not specified, q-values are not calculated (reported as -1). "

pengxin2019 commented 4 months ago

Thanks for your patience. For some reason I misread the description about q value thus I did not get the result as what I wanted. Thanks for pointing it out.

Which threshold of q values do you recommend? I tried to not set q value, set q value as 1, set q value as 0.05, respectively in my command. Here is a summary of detected peaks in narrowPeak:

Screenshot 2024-02-17 at 8 12 39 PM

I have 2 questions in terms of table above.

  1. I am surprised to see a higher number of detected peaks after setting q value cutoff as either 1 or 0.05 than not setting up q value cutoff. Can you help me to understand why it happened?
  2. whats the way to interpret the q values/ columns in narrowPeak. Here is the narrowPeak file having 0.05 as q value cutoff. But all q values (9th column) in this file are greater than 0.05 which is confusing to me. Whats the correct way to relate the q values in 9th column to the q value cutoff in the command?
Screenshot 2024-02-17 at 8 16 36 PM

Thanks for your patience

jsh58 commented 4 months ago

The original issue of q-values being -1 has been addressed, multiple times. If you have a different question, even after reviewing the README and previous issues, then please open a new Issue.

Any question, concern, or bug report about the program should be posted as an Issue on GitHub. Before posting, please check previous issues (both Open and Closed) to see if your issue has been addressed already. Also, please follow these good GitHub practices.