jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License
187 stars 27 forks source link

ATAC-seq 0 peaks found #114

Open chklee30 opened 1 week ago

chklee30 commented 1 week ago

Very strange behavior. I have four ATAC-seq samples across two experimental conditions (two each). For three of the samples, Genrich identifies ~10-30k peaks, but for the other one, it identifies 0. Here is the output message:

  BAM records analyzed:   182632586
    Paired alignments:    182632586
      "orphan" alns:            726 ** Warning! **
    Unpaired alignments:          0
  Fragments analyzed:      91315930
    Full fragments:        91315930
    ATAC-seq cut sites:   182631860
      (expanded to length 100bp)
- control file #0 not provided -
  Background pileup value: 3.202677
Peak-calling parameters:
  Genome length: 3099905972bp
  Significance threshold: -log(q) > 1.301
  Min. AUC: 200.000
  Max. gap between sites: 100bp
Peaks identified: 0 (0bp)

I've tried increasing the q-value threshold from 0.05 to 0.1, and also lowering the AUC threshold from 200 to 100, but in both cases, Genrich still fails to identify any peaks for that particular sample only.

I have also generated genome browser tracks from the bam files and looked at various genomic regions of interest, and qualitatively the read pileups look pretty much identical to the other sample in the same experimental condition (as expected).

All the samples were processed the exact same way.

Any ideas how I can further diagnose this problem? I'd be willing to send over data files, but not sure how to do that here given the .bam, .bed, and .bdg files are all on the order of gigabytes and the bigwig track for the genome browser is on the order of hundreds of MBs.

Would appreciate any thoughts.

P.S - I have tried also running other peak callers (macs2 and hmmratac) - both output comparable number of peaks for the samples in each experimental condition. It is only Genrich out of the three that exhibits this strange behavior.

jsh58 commented 1 week ago

I recommend you check the -f <file> output to see why Genrich is not calling a peak in any area where you think it should. You can check previous issues for additional discussion (#4, #11, #80 for starters).