TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline

Add clipping parameter #200

Closed (lmurba closed 3 years ago)

lmurba commented 4 years ago

Problem

rMATS has resolved a conflict with soft clipping. Currently, we have turned soft clipping off in STAR (#196) to get around this issue. However, since soft clipping improves mapping accuracy, it would be worth testing this new parameter (or testing a way to hard-clip the soft-clipped bases).

Dr. Alex Dobin (STAR creator)- "I would recommend hard-clipping the bases, as I believe the overall alignment quality is better with clipped bases. There should be tool out there, but it's not hard to do it yourself - basically, you need to extract the number of S-clipped bases from CIGAR (5' and 3'), and then trim the read sequence and quality strings for these number of bases."
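The hard-clipping approach described in the quote can be sketched in Python. This is a minimal illustration, not something the pipeline currently does: `hard_clip` is a hypothetical helper that works on the CIGAR, SEQ and QUAL fields of a single SAM record, and it assumes soft clips appear only at the ends of the alignment, as the SAM format requires.

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM spec.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def hard_clip(cigar, seq, qual):
    """Turn soft clips (S) into hard clips (H) and trim SEQ/QUAL to match.

    Hypothetical helper for illustration. All arguments are the string
    fields of one SAM record; returns (new_cigar, new_seq, new_qual).
    """
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Positions of operations that are not clips (the aligned core).
    core = [i for i, (_, op) in enumerate(ops) if op not in "SH"]
    if not core:  # e.g. "*" for an unmapped read: nothing to do
        return cigar, seq, qual
    first, last = core[0], core[-1]

    # Soft-clipped bases are present in SEQ/QUAL, so they must be trimmed.
    lead_s = sum(n for n, op in ops[:first] if op == "S")
    trail_s = sum(n for n, op in ops[last + 1:] if op == "S")
    # Existing hard clips are merged with the converted soft clips.
    lead_h = sum(n for n, _ in ops[:first])
    trail_h = sum(n for n, _ in ops[last + 1:])

    new_ops = []
    if lead_h:
        new_ops.append((lead_h, "H"))
    new_ops.extend(ops[first:last + 1])
    if trail_h:
        new_ops.append((trail_h, "H"))
    new_cigar = "".join(f"{n}{op}" for n, op in new_ops)

    end = len(seq) - trail_s
    return new_cigar, seq[lead_s:end], qual[lead_s:end]
```

POS does not need to change, because in SAM it already refers to the first aligned (non-clipped) base. For example, `hard_clip("3S5M2S", "AAACCCCCGG", "IIIJJJJJKK")` gives `("3H5M2H", "CCCCC", "JJJJJ")`.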

In testing, removing soft clipping did not dramatically impact our mapping: the number of uniquely mapped reads decreased by about 1-3%, depending on the dataset. However, removing soft clipping is also known to result in misalignments to pseudogenes.

Solution

Once the new rMATS parameter is implemented, test it and add a parameter to our pipeline to allow clipping.

Implementation

angarb commented 3 years ago

@sk-sahu

For mapping, we want to add soft clipping back, controlled by a variable, with soft clipping on by default. In STAR, the parameter is 'alignEndsType', and we want it to be 'Local' when soft clipping is on.

If soft clipping is on, we will pass --allow-clipping to rMATS.
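A rough sketch of how one switch could drive both tools is below. The pipeline itself is written in Nextflow, so this Python version only illustrates the logic. The names `clipping_args` and `soft_clipping` are hypothetical, and using `EndToEnd` for the "off" case is an assumption about how soft clipping was disabled in #196.

```python
def clipping_args(soft_clipping=True):
    """Return extra command-line arguments for STAR and rMATS.

    Hypothetical helper: `soft_clipping` stands in for the pipeline
    variable discussed above and defaults to on.
    """
    if soft_clipping:
        return {
            # 'Local' lets STAR soft-clip read ends.
            "star": ["--alignEndsType", "Local"],
            # rMATS must then be told to accept clipped alignments.
            "rmats": ["--allow-clipping"],
        }
    return {
        # Assumption: soft clipping is disabled via end-to-end alignment.
        "star": ["--alignEndsType", "EndToEnd"],
        "rmats": [],
    }


if __name__ == "__main__":
    for enabled in (True, False):
        args = clipping_args(enabled)
        print(enabled, " ".join(args["star"]), " ".join(args["rmats"]))
```

Keeping both settings behind a single switch avoids the inconsistent case where STAR produces soft-clipped alignments but rMATS is not told to accept them.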