jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License

Different pileups same(-ish) BAM #39

Closed Maarten-vd-Sande closed 4 years ago

Maarten-vd-Sande commented 4 years ago

I wanted to check whether I understood the Genrich pileup correctly, so I compared the pileup of a paired-end BAM from my ATAC-seq data with the pileup of the same BAM file after deleting the paired-end information. There is a slight difference between the two that I do not understand:

image

As you can see, the pileups are practically identical, except for the two bumps. I did a 'manual' pileup, and it seems that the two bumps are correct.

image

However, I did notice that the bumps correspond to the areas where the pileups of the forward and reverse reads overlap. Does that mean that the pileup of a pair is basically binary, 1 or 0? Or am I missing something else?

jsh58 commented 4 years ago

Thanks for the question, and for the pileup visualization - it's pretty cool.

The answer can be found in the description of ATAC-seq mode:

For full fragments, when the two cut site intervals overlap, they are merged into a single interval.

It doesn't make sense to count such an overlap twice, since the two cut site intervals are derived from a single fragment. If you "fool" Genrich into thinking the cut site intervals are not from a single fragment, of course it will count the overlap twice.
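
To make the difference concrete, here is a minimal Python sketch of the merging rule. It is not Genrich's actual code. The function names, the `expand` parameter (standing in for the cut site expansion length), and the toy coordinates are all assumptions made for illustration. Under these assumptions, a single paired fragment contributes at most 1 at any position, while the same two cut site intervals treated as unrelated reads contribute 2 where they overlap.

```python
def cut_site_interval(cut_site, expand):
    """Half-open interval of length `expand` centered on a cut site (hypothetical helper)."""
    half = expand // 2
    return (cut_site - half, cut_site - half + expand)


def paired_intervals(frag_start, frag_end, expand):
    """Intervals for one full fragment: the two cut site intervals are merged if they overlap."""
    left = cut_site_interval(frag_start, expand)
    right = cut_site_interval(frag_end, expand)
    if left[1] > right[0]:
        return [(left[0], right[1])]
    return [left, right]


def unpaired_intervals(frag_start, frag_end, expand):
    """Same cut sites, but treated as two unrelated reads: no merging."""
    return [cut_site_interval(frag_start, expand),
            cut_site_interval(frag_end, expand)]


def pileup(intervals, length):
    """Per-base depth over positions 0..length-1."""
    depth = [0] * length
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth


# Toy short fragment: cut sites at 20 and 26, each expanded to 10 bp.
paired = pileup(paired_intervals(20, 26, 10), 40)
unpaired = pileup(unpaired_intervals(20, 26, 10), 40)

# Positions where the two pileups disagree (the "bumps").
bumps = [pos for pos in range(40) if paired[pos] != unpaired[pos]]
print(max(paired), max(unpaired), bumps)
```

For a longer fragment whose cut site intervals do not overlap (for example, cut sites at 20 and 60 with the same `expand`), both functions return the same two intervals and the pileups are identical, which matches the observation that the tracks differ only at the bumps.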

Maarten-vd-Sande commented 4 years ago

Thanks for the fast reply! I missed that part from the docs, sorry!