Open ThieryDSB opened 10 months ago
Referring to your concern about small PSI values, I would recommend performing differential splicing analysis and checking the volcano plot. Were any differentially spliced junctions detected? PSI values are usually small due to drop-outs, especially for SJs located away from the 3' (or 5') end (see the "Aggregated figures" section of the tutorial: https://wenweixiong.github.io/MARVEL_Droplet.html). In theory, including introns in the count matrix lowers PSI values because the total gene count (the denominator) now includes more non-SJ reads, but this shouldn't affect the ability to detect differentially spliced junctions, i.e., the relative difference in PSI values between 2 groups of cells.
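To make the denominator argument concrete, here is a minimal sketch. It assumes PSI for a junction is the SJ counts pooled over a group of cells, divided by the pooled gene counts, times 100. The function name `psi` and all counts are hypothetical and are not taken from MARVEL's code.

```python
def psi(sj_counts, gene_counts):
    """Percent spliced-in for one junction, pooled over a group of cells.

    Assumed definition: 100 * sum(SJ counts) / sum(gene counts).
    """
    total_gene = sum(gene_counts)
    if total_gene == 0:
        return float("nan")  # gene not detected in this group
    return 100.0 * sum(sj_counts) / total_gene


# Hypothetical raw counts for one junction in two groups of four cells.
sj_a, gene_a = [2, 0, 1, 3], [10, 8, 12, 10]
sj_b, gene_b = [0, 0, 1, 0], [9, 11, 10, 10]

psi_a = psi(sj_a, gene_a)
psi_b = psi(sj_b, gene_b)

# Same junction counts, but gene counts doubled to mimic extra intronic
# (non-SJ) reads entering the denominator.
psi_a_intron = psi(sj_a, [2 * g for g in gene_a])
psi_b_intron = psi(sj_b, [2 * g for g in gene_b])

print(f"exonic only:  PSI A = {psi_a:.2f}, PSI B = {psi_b:.2f}")
print(f"with introns: PSI A = {psi_a_intron:.2f}, PSI B = {psi_b_intron:.2f}")
print(f"ratio A/B: {psi_a / psi_b:.2f} vs {psi_a_intron / psi_b_intron:.2f}")
```

In this toy case both PSI values halve when the denominator doubles, while the ratio between the two groups stays the same. Real data will not inflate the denominator uniformly, so this only illustrates the direction of the effect.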
Thank you for your quick answer. I observed differentially spliced junctions between my various conditions, aligning with my initial hypothesis. However, in conditions where the PSI is lower for a specific differentially spliced junction, the PSI often hovers around 0. Therefore, I am considering adjustments to my methods to enhance spliced junction detection and increase PSI values. This could potentially reveal larger delta PSI values between my different conditions. I appreciate your insights, which may shed light on the observed low PSI values in my experiments.
Regarding my follow-up question on intron inclusion, I'm curious whether the normalized count matrix has an impact on the calculated PSI values, or if it is solely used for the analysis of differential gene expression.
The normalised gene expression matrix is solely used for differential gene expression analysis while the raw gene count matrix is used to compute PSI.
Thanks for your answer,
Since only the raw gene and SJ count matrices generated by STARSolo are used for the calculation of PSI, and since I ran STARSolo with the option --soloFeatures Gene SJ, it is unlikely that counting pre-mRNAs has contributed to the low PSI values in my analysis.
Since PSI values are usually small due to dropouts, I may try re-sequencing my samples to increase sequencing depth.
Sincerely
Thiéry
Hi,
Thank you for developing this useful tool. I successfully ran the main pipeline from the DropSeq vignette with my data and mostly observed the expected results.
However, I noticed that the calculated PSI values are small for most SJs, usually below 20%, which I doubt is biologically plausible or representative. I was wondering if you could provide any insights for investigating this issue further, or suggest which steps in the methods or the data I should look into.
On a related note, when I ran Cellranger, I included introns in the count matrix used for data normalization. Is this correct? I reasoned that since the PSI is calculated from the raw gene and SJ count matrix generated by STARSolo, I would not introduce bias by including pre-mRNA counts in the calculation of PSI.
Sincerely,
Thiéry