alexdobin / STAR

RNA-seq aligner
MIT License
1.83k stars 504 forks

Thoughts of the STAR development team on spliced/unspliced/ambiguous read classification. #2127

Open mortunco opened 5 months ago

mortunco commented 5 months ago

Dear @alexdobin

This is not an issue but rather a request for your thoughts on recent manuscripts about spliced/unspliced/ambiguous read classification. I am sure you are following this literature, but I haven't seen a discussion of it in the repository. If there has been one before, I apologize.

The chronology of my references might be wrong, but let's use them as sources for the general concepts (some preprints have been updated).

1) (Soneson et al, 2021) This publication demonstrated that Kallisto/Alevin/STARsolo generate different count tables and hence different velocity estimates. The problem is that short read lengths make ambiguous reads very hard to classify.

2) (He et al, 2023) and (Hjörleifsson and Sullivan et al, 2024) used flanking k-mers to rescue a read from ambiguity and assign it to spliced or unspliced.
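To make the three categories concrete, here is a minimal sketch of how I picture the classification for a single read against a single transcript. This is a toy illustration under my own simplifying assumptions (one transcript, half-open coordinates, no strand or multi-gene handling). The function name `classify_read` and its inputs are hypothetical, and it is not the actual rule set used by STARsolo, velocyto, or any of the tools cited above.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) genomic coordinates


def classify_read(blocks: List[Interval], exons: List[Interval]) -> str:
    """Toy spliced/unspliced/ambiguous call for one read vs. one transcript.

    blocks: aligned segments of the read (more than one if the alignment
            contains a splice gap).
    exons:  sorted, non-overlapping exons of a single transcript.
    """
    # Introns are the gaps between consecutive exons.
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]

    def within_exon(block: Interval) -> bool:
        return any(s <= block[0] and block[1] <= e for s, e in exons)

    def overlaps_intron(block: Interval) -> bool:
        return any(block[0] < e and s < block[1] for s, e in introns)

    # Any intronic base is taken as evidence of an unprocessed transcript.
    if any(overlaps_intron(b) for b in blocks):
        return "unspliced"
    # A gapped alignment whose segments all sit in exons crossed a junction.
    if len(blocks) > 1 and all(within_exon(b) for b in blocks):
        return "spliced"
    # Fully inside one exon: compatible with both mature and nascent RNA.
    if len(blocks) == 1 and within_exon(blocks[0]):
        return "ambiguous"
    return "unassigned"


exons = [(100, 200), (300, 400)]
print(classify_read([(150, 200), (300, 350)], exons))  # crosses the junction
print(classify_read([(180, 230)], exons))              # runs into the intron
print(classify_read([(120, 170)], exons))              # inside a single exon
```

In this picture, the shorter the read, the less likely it is to touch an exon boundary at all, so more reads end up in the last case.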

The reason I am asking is that both publications include STAR in their comparisons, so I would like to hear your opinion about full transcriptome quantification. What are your thoughts on this issue? Are you planning an update to the algorithm to address it? The Velocyto option in STARsolo is my go-to for spliced/unspliced counts, but do you have suggestions for improving the accuracy of the quantification with STAR?

Thank you for maintaining STAR. It's very easy to use, which makes it extremely useful. Thank you very much for your time,

Best, T.

Yenaled commented 5 months ago

Hello! I am a primary author on one of the listed preprints (and am an avid STARsolo user+fan too despite much of my work being done using pseudoalignment) -- thanks for reading my preprint! :) And thanks to Dr. Dobin for his great work on STAR of course. I am posting here to bookmark this discussion and to potentially engage in it further.