jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License
183 stars, 27 forks

About ATAC-seq #10

Closed. AlexWanghaoming closed this issue 5 years ago

AlexWanghaoming commented 5 years ago

Dear developer, I have a FASTQ file (84,192,124 reads in total) from an ATAC-seq experiment. When I call peaks on it, I get about 12,000 peaks with the command:

Genrich -t file_sort.bam -o file.peaks -r -j

Now, if I randomly split the FASTQ/BAM file into two replicates, I get only about 2,000 peaks for each replicate. I do not know why. Could you help me?

Thanks, Alex

jsh58 commented 5 years ago

Alex,

Genrich follows a specific procedure when analyzing multiple replicates, as described here. This procedure is based on the assumption that the replicates are real, not artificially created. So it is not surprising that there are different peaks called after artificially creating two replicates.

John Gaspar

AlexWanghaoming commented 5 years ago

Hi John, thanks for your reply. I meant that I split the FASTQ/BAM reads randomly into two files and gave only one of them to Genrich, so Genrich does not know that it is one of two replicates. However, I get only about 2,000 peaks from that single file. Why is that so much smaller than 12,000?
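For readers who want to reproduce this kind of split, here is a minimal sketch of dividing reads at random into two halves. It is an illustration only: `split_read_names` and the example read names are hypothetical, and it works on read names rather than on a real FASTQ/BAM file. Splitting by name keeps both mates of a paired-end fragment in the same half.

```python
import random


def split_read_names(read_names, seed=0):
    """Randomly assign each unique read name to one of two pseudo-replicates.

    Splitting by name (not by record) keeps paired-end mates together,
    because both mates share the same read name.
    """
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    names = sorted(set(read_names))    # deduplicate mates, fix the starting order
    rng.shuffle(names)
    half = len(names) // 2
    return set(names[:half]), set(names[half:])


# Hypothetical read names; each appears twice to mimic paired-end mates.
names = [f"read{i}" for i in range(10) for _ in range(2)]
rep1, rep2 = split_read_names(names, seed=42)
print(len(rep1), len(rep2))
```

Each half then carries roughly half of the original sequencing depth, which is the point the next reply turns on.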

jsh58 commented 5 years ago

With fewer reads, there is less power to detect peaks, so that's not surprising. Of course, generating more reads does not in general guarantee that more peaks will be called, since there will also be more artifacts (PCR duplicates).
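The loss of power can be illustrated with a toy calculation. The sketch below assumes a simple Poisson background, which is not necessarily the null model Genrich uses. The window counts (20 background and 50 observed reads at full depth, halved to 10 and 25) are made-up numbers, and `poisson_sf` is a helper written for this example.

```python
import math


def poisson_sf(k, lam):
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam).

    The tail is summed directly from k upward, so very small p-values
    are not lost to rounding as they would be with 1 - CDF.
    """
    if k <= 0:
        return 1.0
    # P(X = k), computed in log space to avoid overflow in lam**k / k!
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    # Keep adding terms until they no longer change the sum.
    while i <= lam or term > total * 1e-16:
        total += term
        i += 1
        term *= lam / i
    return total


# Same 2.5-fold enrichment at a candidate site, at full and at half depth.
p_full = poisson_sf(50, 20.0)   # all reads: 50 observed vs. 20 expected
p_half = poisson_sf(25, 10.0)   # half the reads: 25 observed vs. 10 expected
print(f"full depth p = {p_full:.3g}, half depth p = {p_half:.3g}")
```

Under these assumptions the enrichment is identical in both cases, but the half-depth p-value is orders of magnitude larger. Sites that clear a significance threshold at full depth can therefore fall below it after the split.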