nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

Can I trim the 3' end of reads before alignment? #452

Closed JFF1594032292 closed 3 years ago

JFF1594032292 commented 3 years ago

Hi Nicholas, I have used HiC-Pro for my analysis and it's really helpful. I have a question about the pipeline. I noticed that the alignment in HiC-Pro consists of two parts: first, the reads are mapped directly, and the reads spanning the ligation junction fail to align. Then these unaligned reads are trimmed and re-aligned to the reference genome. I wonder whether it would be quicker to first trim all the reads spanning the ligation junction by detecting the enzyme cutting sites, so that no reads have to be aligned twice. In fact, HiCUP seems to trim the reads first and then align them only once. I think it may only speed up the process a little (maybe by 1/4 or 1/5?), because normally no more than ~50% of reads need to be trimmed. But I'm really curious whether there is a special reason for not doing it this way. XD
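To make the "trim first, align once" idea concrete, here is a minimal Python sketch of trimming a read at the ligation junction. It is only an illustration, not the actual code of HiC-Pro or HiCUP. The function name `trim_at_ligation_site` and the example read are hypothetical, and the motifs assume a fill-in, blunt-end ligation protocol with HindIII or DpnII. The sketch keeps only the 5' part upstream of the junction; a real tool may keep part of the restriction site and discard reads that become too short.

```python
from typing import Tuple

# Ligation-junction motifs, assuming fill-in and blunt-end ligation.
# Only two example enzymes are listed; other enzymes need their own motif.
LIGATION_SITES = {
    "HindIII": "AAGCTAGCTT",
    "DpnII": "GATCGATC",
}


def trim_at_ligation_site(seq: str, qual: str, site: str) -> Tuple[str, str, bool]:
    """Cut a read at the first occurrence of the ligation motif.

    Returns the (possibly shortened) sequence and quality strings, plus a
    flag telling whether the read was trimmed. Reads without the motif are
    returned unchanged.
    """
    pos = seq.find(site)
    if pos == -1:
        return seq, qual, False
    # Keep only the 5' portion located upstream of the junction.
    return seq[:pos], qual[:pos], True


# Hypothetical 22 bp read that contains a HindIII junction.
read = "ACGTACGT" + LIGATION_SITES["HindIII"] + "GGCC"
trimmed_seq, trimmed_qual, was_trimmed = trim_at_ligation_site(
    read, "I" * len(read), LIGATION_SITES["HindIII"]
)
```

With this approach every read is scanned once, and only the trimmed output goes to the aligner.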

Thanks! Jiang

nservant commented 3 years ago

Hi Jiang,

Yes, I agree with you. Trimming the reads makes sense (as HiCUP does), but as you mentioned, I would not expect a huge difference in time... which one is faster: trimming the reads, or realigning a few percent of reads in a second step? The reason HiC-Pro does this two-step mapping is mainly historical. What I like about the two-step procedure is i/ that I don't have to run the trimming :), and ii/ that I can check the number of reads aligned at each step, which gives a first indication of the ligation efficiency (but I guess it should be the same as looking at the number of trimmed reads).
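As a rough illustration of point ii/, the per-step counts can be turned into fractions with a few lines of Python. The function name `mapping_step_summary` and the read counts are hypothetical; this sketch is not part of HiC-Pro and only shows the arithmetic.

```python
from typing import Dict


def mapping_step_summary(total_reads: int, mapped_step1: int, mapped_step2: int) -> Dict[str, float]:
    """Summarize a two-step mapping as fractions of the total read count.

    mapped_step1: reads aligned directly (full length).
    mapped_step2: reads aligned only after trimming at the ligation site.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if mapped_step1 + mapped_step2 > total_reads:
        raise ValueError("mapped reads exceed total reads")
    unmapped = total_reads - mapped_step1 - mapped_step2
    return {
        "step1_fraction": mapped_step1 / total_reads,
        "step2_fraction": mapped_step2 / total_reads,
        "unmapped_fraction": unmapped / total_reads,
    }


# Hypothetical counts for one read mate.
summary = mapping_step_summary(total_reads=1000, mapped_step1=800, mapped_step2=150)
```

A larger `step2_fraction` means more reads spanned a ligation junction, which is the first hint about ligation efficiency mentioned above.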

Note that the BWA mapper can now be used to directly map the 5' end of reads. Many people are using this option to align Hi-C reads, and I guess that is the easiest way to go. This is something I would like to add to the nf-core-hic pipeline when I have some time! Best

JFF1594032292 commented 3 years ago

Hi, thanks for your detailed explanation! It answers my question, and maybe I previously underestimated the time spent on trimming.

Best Jiang