jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License

Background pileup value and AUC cutoff #83

Closed: TeiturAK closed this issue 2 years ago

TeiturAK commented 2 years ago

Hi,

I am analyzing ATAC-seq data, and we are comparing the results from MACS2 with those from Genrich. With Genrich we are seeing far fewer peaks than expected, and from other issues (#4, #33) it appears that a larger number of reads can affect the number of peaks called.

I can certainly decrease the minimum AUC, as has been suggested in other issues, but I do not understand how to use the background pileup value to decide on this cutoff.

The output:

  BAM records analyzed: 120831496
    Unmapped: 29650966
    Supp./dups/lowQual: 1581748
    Paired alignments: 85048542
    Unpaired alignments: 4550240
  Fragments analyzed: 42524271
    Full fragments: 42524271
    ATAC-seq cut sites: 85048542
      (expanded to length 100bp)
- control file #0 not provided -
  Background pileup value: 17.196791
Peak-calling parameters:
  Genome length: 408834716bp
  Significance threshold: -log(p) > 2.000
  Min. AUC: 100.000
  Max. gap between sites: 100bp
Peaks identified: 8064 (3165912bp)

Advice on how to use the background pileup value would be much appreciated.

jsh58 commented 2 years ago

Thanks for the question. There is no formula to calculate an optimal minimum AUC from a background pileup value. My recommendation is that you explore the effects of altering the peak-calling parameters (including -a) via the -X and -P arguments.
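For readers who want to try the exploration suggested above, here is a minimal Python sketch that only builds the command lines for such a sweep: one run with -X that processes the alignments and writes the -f log without calling peaks, followed by one -P run per candidate -a value that calls peaks directly from that log. The flags are the ones listed in Genrich's usage message as I understand it; the file names, the output naming scheme, and the helper function name are hypothetical, and whether -o must also be given alongside -X should be checked against the usage message of your installed version.

```python
import shlex


def genrich_sweep_commands(bam, log_file, auc_values, genrich="Genrich"):
    """Build Genrich command lines for a minimum-AUC sweep (hypothetical helper).

    Returns a list of argument lists: the first skips peak-calling and writes
    the log file, the rest call peaks from that log with different -a values.
    """
    # Step 1: interpret the alignments once, skip peak-calling (-X), and write
    # the p/q-value log (-f). -j selects ATAC-seq mode, as in the run above.
    commands = [[genrich, "-t", bam, "-j", "-X", "-f", log_file, "-v"]]

    # Step 2: call peaks directly from the log (-P), once per candidate AUC.
    for auc in auc_values:
        out = f"peaks_a{auc:g}.narrowPeak"  # assumed naming scheme
        commands.append(
            [genrich, "-P", "-f", log_file, "-o", out, "-a", f"{auc:g}", "-v"]
        )
    return commands


if __name__ == "__main__":
    # Print the commands for inspection; nothing is executed here.
    for cmd in genrich_sweep_commands("sample.bam", "sample.log", [100, 50, 20]):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; each list could be passed to subprocess.run once it has been checked. Comparing the "Peaks identified" line across the -P runs shows how sensitive the peak count is to -a for this dataset.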