Open YichaoOU opened 3 years ago
polyT filtering is usually not necessary. The percentage of reads containing polyT varies from case to case, though. I haven't quantified it, and I should look into it.
When sequencing with a NextSeq 150-cycle kit, we only get 30 bp on Read 2, which consists of a 10 bp UMI, 15 bp of polyT, and only 5 bp of mRNA sequence. 5 bp is too short to be aligned properly, so I only used R1 for alignment. With NovaSeq, we can get a longer R2, and that can help alignment, as you saw. To keep it simple, I only aligned R1 in the script.
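To make the Read 2 layout described above concrete, here is a minimal Python sketch that splits a read into its UMI, polyT, and mRNA parts by fixed offsets. The function name `split_read2`, the example sequence, and the assumption that the three segments appear in exactly that order from the start of the read are illustrative only and are not taken from the SHARE-seq scripts.

```python
def split_read2(seq, umi_len=10, polyt_len=15):
    """Split a Read 2 sequence into (umi, polyt, cdna) using fixed offsets.

    Assumes the layout UMI -> polyT -> mRNA from the 5' end of the read.
    """
    umi = seq[:umi_len]
    polyt = seq[umi_len:umi_len + polyt_len]
    cdna = seq[umi_len + polyt_len:]
    return umi, polyt, cdna


# Hypothetical 30 bp read: 10 bp UMI + 15 bp polyT + 5 bp mRNA
read2 = "ACGTACGTAC" + "T" * 15 + "GATTC"
umi, polyt, cdna = split_read2(read2)
print(len(umi), len(polyt), len(cdna))
```

With a 30 bp read, only the last 5 bp are left for the mRNA portion, which is why R2 contributes little to alignment at that read length.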
Hello @masai1116,
Thank you again for sharing the code! I have some questions regarding RNA-seq reads.
In our test SHARE-seq data, if we filter R2 by polyT (TTTTTT + 1 mismatch), about 50% of reads are discarded. Is that normal? What happens if we don't filter reads by polyT? For bulk RNA-seq and 10x scRNA-seq, we don't filter by polyT, right?
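For reference, the kind of filter meant here can be sketched in Python as follows. This is only an illustration: the helper names `has_polyt` and `polyt_fraction`, the example reads, and the assumption that the polyT stretch starts right after a 10 bp UMI are hypothetical and may not match the actual pipeline.

```python
def has_polyt(seq, start=10, motif="TTTTTT", max_mismatch=1):
    """Return True if seq[start:] begins with motif, allowing max_mismatch substitutions."""
    window = seq[start:start + len(motif)]
    if len(window) < len(motif):
        return False  # read too short to contain the motif at this offset
    mismatches = sum(1 for a, b in zip(window, motif) if a != b)
    return mismatches <= max_mismatch


def polyt_fraction(reads, **kwargs):
    """Fraction of reads that pass the polyT check (0.0 for an empty input)."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(has_polyt(r, **kwargs) for r in reads) / len(reads)


# Hypothetical R2 sequences: 10 bp UMI followed by the start of the polyT region
example_reads = [
    "ACGTACGTAC" + "TTTTTT",  # exact match
    "ACGTACGTAC" + "TTATTT",  # one mismatch, still passes
    "ACGTACGTAC" + "TTAATT",  # two mismatches, fails
    "ACGTACGTAC" + "GATCGA",  # no polyT, fails
]
print(polyt_fraction(example_reads))
```

Running a count like this on a sample of R2 reads is one way to quantify how many reads carry polyT before deciding whether to filter.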
One last question: in your STAR alignment code, you only used R1. My mapping rate using only R1 is just 50%, but with paired-end mapping I get 70%. Is that normal?
Thanks, Yichao