Xinglab / CLAM

CLIP-seq Analysis of Multi-mapped reads
GNU General Public License v3.0

Identifying enrichment at repeats #11

Closed (juanb001 closed this issue 4 years ago)

juanb001 commented 4 years ago

Hi,

One of the key features of CLAM is its ability to detect read enrichment at repetitive regions. I'm going through the example you provided here on GitHub, but I don't see how I can use the tool to identify enrichment at repetitive elements like Alu or L1 elements. Can you clarify?

CLAM's "peakcaller" and "permutation callpeak" rely on a Gencode GTF file, which (correct me if I'm wrong) masks repetitive regions. Does CLAM still call peaks without knowing whether some regions correspond to L1, Alu, etc.? Does the "data downloader" function, in conjunction with "peak annotator", download the UCSC RepeatMasker track in the background and identify repeats with that?

Thanks!

wkdeng commented 4 years ago

Hi, when CLAM calls peaks with multi-mapped reads, what it actually does is assign a unique location to each multi-mapped read with an EM algorithm, then call peaks using the uniquely mapped reads together with the re-assigned multi-mapped reads. So we do not actually care whether a multi-mapped read falls in an Alu or an L1. So, yes to "Does CLAM still call peaks without knowing whether some regions correspond to L1, Alu, etc.?". The data downloader will download our pre-processed data derived from UCSC RepeatMasker. Contact me if you have any further questions. Thanks!
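
For readers who want a feel for the EM re-assignment idea described above, here is a toy sketch in Python. It is not CLAM's actual implementation: the function name `em_assign`, the read and locus names, and the rule "weight each candidate locus by its current total read support" are all assumptions made for illustration. It only shows the general pattern of alternating between estimating locus support and re-weighting multi-mapped reads, and it never needs to know whether a locus is an Alu or an L1.

```python
from collections import defaultdict


def em_assign(reads, n_iter=50):
    """Toy EM re-assignment (illustrative only, not CLAM's code).

    reads: dict mapping a read id to a list of distinct candidate loci.
    Returns (hard_assignment, weights).
    """
    # Start by splitting each read evenly across its candidate loci.
    weights = {
        r: {locus: 1.0 / len(loci) for locus in loci}
        for r, loci in reads.items()
    }
    for _ in range(n_iter):
        # M-step: support of a locus is the sum of fractional read weights.
        support = defaultdict(float)
        for w in weights.values():
            for locus, p in w.items():
                support[locus] += p
        # E-step: re-weight each read in proportion to locus support.
        for r, loci in reads.items():
            total = sum(support[locus] for locus in loci)
            weights[r] = {locus: support[locus] / total for locus in loci}
    # Hard-assign every read to its most probable locus.
    assignment = {r: max(w, key=w.get) for r, w in weights.items()}
    return assignment, weights


# Hypothetical input: three uniquely mapped reads and two multi-mapped reads.
reads = {
    "u1": ["A"],
    "u2": ["A"],
    "u3": ["B"],
    "m1": ["A", "B"],
    "m2": ["A", "B"],
}
assignment, weights = em_assign(reads)
print(assignment["m1"], round(weights["m1"]["A"], 3))
```

In this made-up example, locus A has more uniquely mapped support than locus B, so the multi-mapped reads are pulled towards A. Peak calling would then run on the combined unique and re-assigned reads, and repeat annotation can be applied to the resulting peaks afterwards.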

juanb001 commented 4 years ago

Great, thank you!