nuno-agostinho / psichomics

Interactive R package to quantify, analyse and visualise alternative splicing
http://nuno-agostinho.github.io/psichomics/

Samples Column in Splicing Results #461

Open GuoLabUCSD opened 1 year ago

GuoLabUCSD commented 1 year ago

Hi,

I had a few quick questions about how I should interpret the "Samples ()" columns in my results file. Do these columns just refer to how many samples contained the splicing event in the SJ.out.tab files? Based on the example below, would this mean that the event "SE_3_+_13661331_13663275_13663415_13667945_FBLN2" was found in the SJ.out.tab file for all 112 normals, but only 178/181 of the tumor samples? If so, does this mean that the following statistical tests for differential analysis of this event are comparing the 112 normals to the 178 tumors, and the 3 tumors that do not list the event are not included at all?

image

If this is the case, is there a way to see which specific tumor samples are being included in the analysis?

Thanks in advance! -Joseph

nuno-agostinho commented 1 year ago

Hi @GuoLabUCSD,

Yes, your assumptions are correct. The samples column refers to the number of samples containing enough junction read counts to quantify the event (by default, psichomics only uses samples with at least 10 read counts supporting the event).

You can check which samples are not being used for AS quantification by going to the PSI table and looking at the samples with missing values (NA) for that event. For instance:

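# Get PSI values for this event (drop = FALSE keeps the sample names as column names)
eventPSI <- psi["SE_3_+_13661331_13663275_13663415_13667945_FBLN2", , drop = FALSE]

# Get names of samples with missing values (not used to quantify this event)
colnames(eventPSI)[is.na(eventPSI)]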
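
To see which samples of each group are actually included for this event, the same idea can be turned around. The following is only a minimal sketch on toy data in base R: it assumes the PSI table has events as rows and samples as columns, and the PSI values, the sample names and the `groups` list are all hypothetical stand-ins for your own data.

```r
# Toy PSI table (hypothetical values): one event as row, samples as columns
event <- "SE_3_+_13661331_13663275_13663415_13667945_FBLN2"
psi <- data.frame(
    Normal_1 = 0.82, Normal_2 = 0.79,
    Tumor_1  = 0.41, Tumor_2  = NA_real_, Tumor_3 = 0.37,
    row.names = event)

# Hypothetical sample groups, named after the columns of the PSI table
groups <- list(Normal = c("Normal_1", "Normal_2"),
               Tumor  = c("Tumor_1", "Tumor_2", "Tumor_3"))

# Keep the event as a one-row table so that sample names are preserved
eventPSI <- psi[event, , drop = FALSE]

# Samples with a PSI value for this event
quantified <- colnames(eventPSI)[!is.na(eventPSI)]

# Per group: samples with a PSI value and samples with a missing value
included <- lapply(groups, intersect, quantified)
excluded <- lapply(groups, setdiff, quantified)

# Number of samples per group with a PSI value for this event
lengths(included)
```

With real data, `lengths(included)` should be comparable to the counts shown in the samples columns of the results, assuming `groups` holds the same groups used in the differential analysis.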

Hope this helps, but feel free to ask me more questions if needed.

Kind regards, Nuno

GuoLabUCSD commented 1 year ago

Great, this helps! Thanks.

-Regards, Joseph