Xinglab / CLAM

CLIP-seq Analysis of Multi-mapped reads
GNU General Public License v3.0

Some questions about CLAM #19

Closed by 1409605824 3 years ago

1409605824 commented 4 years ago

Hello, I have some questions about CLAM. First, as you explained, "--read-tagger-method" tags a CLIP/RIP read to a particular locus: 'median' tags the read center and is recommended for RIP-seq; 'start' tags the read start site and is recommended for CLIP-seq. I understand that we should choose the start site when analyzing CLIP data, but could you briefly explain how the parameter works? For example, when looking for motifs in CLIP data, we might pay more attention to the first few nucleotides at the 5' end of a read and less to nucleotides far from it, such as those at the 3' end. So I would like to know how CLAM behaves with "--read-tagger-method start" versus "--read-tagger-method median", and what the difference between them is.

My other question is about the "--winsize" parameter of "CLAM realigner", which defaults to 50. If I set it to 100 or 200, will the results differ much from those at 50? How should I choose a proper size, and is it related to the read length? I have read the CLAM paper in NAR, and the method is pretty powerful. I tried to understand the source code, but it is difficult for me. I would appreciate it if you could reply soon. Thank you.

zj-zhang commented 3 years ago

Sorry for the late response. The difference between start and median is, just as you mentioned, the particular location in a read that we tell CLAM to look at: start asks CLAM to focus on the 5' end of the read, while median uses the read center. The window size depends on the particular data you are analyzing, e.g. the fragment size. For CLIP, I recommend using a smaller window size (50 bp), while for RIP-seq, which typically has broader peaks, a wider window size is better.
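To make the start/median distinction concrete, here is a minimal Python sketch of how a read could be reduced to a single tagged position. This is not CLAM's actual code: the function name `tag_read`, the 0-based half-open coordinates, and the strand handling are all assumptions made for illustration.

```python
def tag_read(start, end, strand, method="start"):
    """Reduce an aligned read to one representative coordinate.

    Hypothetical helper, not part of CLAM. Assumes the read covers the
    0-based, half-open interval [start, end) on the given strand.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if method == "start":
        # 5' end of the read: leftmost base on '+', rightmost base on '-'.
        return start if strand == "+" else end - 1
    if method == "median":
        # Center of the aligned interval; the same for either strand.
        return (start + end - 1) // 2
    raise ValueError(f"unknown method: {method!r}")


# A 50-nt read aligned to [130, 180):
plus_start = tag_read(130, 180, "+", "start")    # 5' end on the '+' strand
minus_start = tag_read(130, 180, "-", "start")   # 5' end on the '-' strand
center = tag_read(130, 180, "+", "median")       # read center
print(plus_start, minus_start, center)
```

Under these assumptions, 'start' places the tag at the 5' end of the read, which for a minus-strand read is the rightmost aligned base, while 'median' places it about half a read length away from that end.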