loosolab / TOBIAS

Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal
MIT License

Nfr fragments #111

Closed: ikumar2000 closed this issue 1 year ago

ikumar2000 commented 2 years ago

Hi there,

Thank you for developing and maintaining this tool. I was wondering whether using only NFR (nucleosome-free region) fragments would improve performance. Also, is there a recommended minimum number of reads per library? And is it okay to combine the replicates before performing the analysis?

Thanks, IK

msbentsen commented 2 years ago

Hi IK,

For the NFR fragments, I have not tested this yet, but my impression is that most of the fragments within open chromatin regions are going to be NFR by design. Since footprints are searched for within peak regions, the influence of non-NFR fragments should be low. What percentage of reads (within peaks) would you potentially remove? That might help to indicate whether filtering will have a big influence or not, but as I said, I haven't tested this.
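As a rough illustration of how that percentage could be estimated, the sketch below computes the share of in-peak fragments that a length-based NFR filter would discard. The function name `non_nfr_fraction`, the 150 bp cutoff, and the example lengths are all assumptions made for illustration; they are not part of TOBIAS, and the appropriate cutoff depends on the fragment-size distribution of your own library.

```python
def non_nfr_fraction(fragment_lengths, nfr_max_length=150):
    """Return the fraction of fragments longer than nfr_max_length.

    nfr_max_length is an assumed cutoff (in bp) separating NFR fragments
    from longer, nucleosome-spanning fragments; adjust it to your data.
    """
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("no fragment lengths given")
    n_long = sum(1 for length in lengths if length > nfr_max_length)
    return n_long / len(lengths)


# Hypothetical lengths (bp) of paired-end fragments overlapping peak regions.
example_lengths = [45, 80, 120, 95, 210, 60, 180, 130, 75, 400]

fraction_removed = non_nfr_fraction(example_lengths)
print(f"{fraction_removed:.0%} of in-peak fragments would be removed")
```

In practice the lengths would come from the template lengths of properly paired reads that overlap the peak set. A small fraction would suggest that NFR filtering changes the footprinting signal very little.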

For combining reps, yes, I would recommend this! With regard to quality, I go by the ENCODE data standards for ATAC-seq (https://www.encodeproject.org/atac-seq/), which state a requirement of 25 million non-duplicate, non-mitochondrial aligned fragments, an alignment rate >80% and a reads-in-peaks fraction >0.3. These requirements are per replicate, so combining several replicates (if they are of good quality) will increase depth and should make footprinting better.
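The per-replicate check described above can be sketched as follows. This is a minimal example that uses only the three thresholds quoted in this comment; the `ReplicateQC` class, the function names, and all the numbers in the example are hypothetical, and the metrics themselves would have to come from your own alignment and peak-calling reports.

```python
from dataclasses import dataclass


@dataclass
class ReplicateQC:
    name: str
    usable_fragments: int   # non-duplicate, non-mitochondrial aligned fragments
    alignment_rate: float   # fraction of reads aligned, 0..1
    frip: float             # fraction of reads in peaks, 0..1


# Thresholds as quoted in the comment above (applied per replicate).
MIN_FRAGMENTS = 25_000_000
MIN_ALIGNMENT_RATE = 0.80
MIN_FRIP = 0.3


def passes_thresholds(rep):
    """Check one replicate against the three quoted thresholds."""
    return (
        rep.usable_fragments >= MIN_FRAGMENTS
        and rep.alignment_rate > MIN_ALIGNMENT_RATE
        and rep.frip > MIN_FRIP
    )


def combined_depth(replicates):
    """Sum usable fragments over the replicates that pass the thresholds."""
    return sum(r.usable_fragments for r in replicates if passes_thresholds(r))


# Hypothetical replicates; rep3 fails on reads-in-peaks fraction.
reps = [
    ReplicateQC("rep1", 30_000_000, 0.95, 0.35),
    ReplicateQC("rep2", 27_000_000, 0.91, 0.32),
    ReplicateQC("rep3", 40_000_000, 0.88, 0.15),
]

good = [r.name for r in reps if passes_thresholds(r)]
print("Replicates to combine:", good)
print("Combined usable fragments:", combined_depth(reps))
```

Only replicates that pass on their own are pooled here, which mirrors the caveat that combining helps when the individual replicates are of good quality.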

Best, Mette