eudoraleer / scasa

SCASA: Single cell transcript quantification tool
GNU General Public License v3.0

Row names with multiple transcripts #16

Open Stephen1202-Wang opened 3 months ago

Stephen1202-Wang commented 3 months ago

Hi,

Thanks for developing this tool. I'm using Scasa for isoform quantification of 10x data, with the Homo_sapiens_GENCODE_42 annotation file you provided. However, in the quantification result, some row names contain multiple isoforms, like this: [image]

I checked that the isoforms on the same row belong to the same gene. I'm wondering whether I did something wrong, and how I should interpret these results.

nghiavtr commented 3 months ago

Hi @Stephen1202-Wang,

Thank you for using Scasa, and sorry for the late reply. Rows with multiple isoforms are paralogs: groups of isoforms that cannot be statistically separated from each other based on their supporting reads. For details about paralogs, see the original Scasa study (https://academic.oup.com/bioinformatics/article/38/5/1287/6448218).

Best wishes,
Nghia
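
For anyone who wants to list which rows are paralog groups, here is a minimal Python sketch. It is not part of Scasa. The helper names (`split_paralog_row`, `group_members`) are hypothetical, the transcript IDs are made up, and the separator between IDs inside a row name is an assumption (whitespace here), so please check your own row names and adjust `sep_pattern` if needed.

```python
import re

def split_paralog_row(row_name, sep_pattern=r"\s+"):
    """Split one row name into its member transcript IDs.

    sep_pattern is an ASSUMPTION (whitespace); inspect your own
    output and change it if the IDs are joined differently.
    """
    return [t for t in re.split(sep_pattern, row_name.strip()) if t]

def group_members(row_names, sep_pattern=r"\s+"):
    """Map each row name to the list of transcript IDs it contains."""
    return {name: split_paralog_row(name, sep_pattern) for name in row_names}

# Made-up row names for illustration only (not real GENCODE IDs).
rows = ["ENST_A.1", "ENST_B.1 ENST_C.1"]

groups = group_members(rows)

# Rows with more than one member are the paralog groups.
multi = {name: members for name, members in groups.items() if len(members) > 1}

for name, members in multi.items():
    print(f"{name!r} groups {len(members)} isoforms: {members}")
```

Since the isoforms in such a row cannot be separated statistically, it seems safest to treat the row's count as belonging to the whole group rather than dividing it among the members.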

Stephen1202-Wang commented 3 months ago

Hi @nghiavtr,

Thanks for your reply!