jmschrei / bpnet-lite

This repository hosts a minimal version of a Python API for BPNet.
MIT License

ATAC-seq pre-processing: shifting tags #11

Closed by gregorydonahue 2 weeks ago

gregorydonahue commented 3 weeks ago

Hi Jacob,

I'm using ChromBPNet to examine some ATAC-seq data, and I have a general question about preprocessing.

In the data processing section of the original ATAC-seq paper (Buenrostro et al., Nature Methods 2013), the authors recommend shifting aligned tags by +4 bp or -5 bp (for tags aligned to the reference + or - strand, respectively) to accommodate the 9 bp transposon insertion. They do this to get more accurate footprinting. Do you think this is useful as a preprocessing step for BPNet? My guess is no, but some in my lab are curious.

Thanks, Greg

jmschrei commented 3 weeks ago

We shift ATAC-seq data +4/-4 internally. I don't know where the discrepancy comes from but, apparently, +4/-5 shifts things too much. Regardless, for the purpose of training ChromBPNet models, I personally don't think it matters much how you shift as long as it's done consistently. There is also a shift for DNase data, but I don't remember what it is off the top of my head -- the ChromBPNet repo likely has more details on this. None of this is relevant for models that are not trained on ATAC-seq or DNase-seq.
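
For anyone following along, here is a minimal sketch of what the two shift conventions mean in practice. It is purely illustrative and is not code from bpnet-lite or ChromBPNet: the function name `shift_five_prime` and its parameters are hypothetical, and it assumes each read has already been reduced to the coordinate of its 5' end plus its strand.

```python
# Illustrative sketch only; not taken from bpnet-lite or ChromBPNet.
# Assumption: each read is represented by the coordinate of its 5' end
# and the strand it aligned to ("+" or "-").

def shift_five_prime(pos, strand, plus_shift=4, minus_shift=-4):
    """Return the 5' end coordinate after applying a strand-specific shift.

    The defaults follow the +4/-4 convention mentioned in this thread.
    Pass minus_shift=-5 for the +4/-5 convention from Buenrostro et al.
    """
    if strand == "+":
        return pos + plus_shift
    if strand == "-":
        return pos + minus_shift
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# Hypothetical reads: (5' end coordinate, strand)
reads = [(100, "+"), (200, "-")]

# +4/-4 convention
shifted_44 = [shift_five_prime(p, s) for p, s in reads]

# +4/-5 convention
shifted_45 = [shift_five_prime(p, s, minus_shift=-5) for p, s in reads]

print(shifted_44)  # [104, 196]
print(shifted_45)  # [104, 195]
```

The two conventions differ by a single base on minus-strand reads only, so the point about consistency amounts to using the same `minus_shift` for every sample that goes into training and evaluation.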

gregorydonahue commented 2 weeks ago

Thanks Jacob! Very helpful. -G