Open SamBryce-Smith opened 2 years ago
Hi @SamBryce-Smith and @faricazjj, CSI-UTR's differential analysis output reports the PSI values for both samples on a 0-1 scale, so they do not need to be divided by 100. Should I split the differential analysis output into two quantification BED files, one for each sample?
Here is the example differential analysis output file from CSI-UTR/TestCases.md

| CSI | ENSGENE | GENE_SYM | PSI1 (LOAD) | PSI2 (Control) | deltaPSI (LOAD-Control) | P-value | FDR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ENSG00000189241:116278517_116277027-116276921 | ENSG00000189241 | TSPYL1 | 0.068122 | 0.089263 | -0.021141 | 5e-05 | 0.00679553001277139 |
| ENSG00000189241:116278517_116277126-116277027 | ENSG00000189241 | TSPYL1 | 0.086873 | 0.109091 | -0.022218 | 0.000114 | 0.012597769470405 |
| ENSG00000189241:116278517_116278517-116277246 | ENSG00000189241 | TSPYL1 | 0.505286 | 0.456247 | 0.049039 | 0 | 0 |
| ENSG00000100796:91458759_91458759-91458147 | ENSG00000100796 | PPP4R3A | 0.580057 | 0.443966 | 0.136091 | 0.000605 | 0.0425812764550264 |
| ENSG00000174684:66346049_66345714-66345577 | ENSG00000174684 | B4GAT1 | 0.314256 | 0.342451 | -0.028195 | 0.000374 | 0.0302434133738602 |
| ENSG00000174684:66346049_66346049-66345844 | ENSG00000174684 | B4GAT1 | 0.215083 | 0.177637 | 0.037446 | 0 | 0 |
| ENSG00000119314:112223851_112219214-112219045 | ENSG00000119314 | PTBP3 | 0.134671 | 0.074944 | 0.059727 | 3e-06 | 0.000676385593220339 |
| ENSG00000196652:99532247_99532615-99532684 | ENSG00000196652 | ZKSCAN5 | 0.036234 | 0.11213 | -0.075896 | 6.2e-05 | 0.00795888540410133 |
| ENSG00000126785:63291182_63291740-63291852 | ENSG00000126785 | RHOJ | 0.047283 | 0.178141 | -0.130858 | 0.000583 | 0.0415550529135968 |
| ENSG00000115310:54973156_54972313-54972195 | ENSG00000115310 | RTN4 | 0.08397 | 0.101558 | -0.017588 | 0 | 0 |
| ENSG00000115310:54973156_54972352-54972313 | ENSG00000115310 | RTN4 | 0.144597 | 0.174865 | -0.030268 | 0 | 0 |
| ENSG00000115310:54973156_54972948-54972890 | ENSG00000115310 | RTN4 | 0.089743 | 0.076191 | 0.013552 | 6.6e-05 | 0.00830211347517731 |

Thanks!
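
For illustration, here is a rough sketch of how the split could look in Python. It assumes the differential output is tab-delimited (the header names contain spaces, so whitespace splitting would not work) and that the PSI columns are always named like `PSI1 (LOAD)`. The function name `split_psi_by_condition` is hypothetical and not part of CSI-UTR. It only separates the PSI values per condition; turning each CSI ID into BED coordinates would still need the chromosome and strand, which are not in this table.

```python
import csv
import io
import re


def split_psi_by_condition(text):
    """Return {condition: [(csi_id, psi), ...]} from a tab-delimited table.

    Assumption: PSI columns are named "PSI<n> (<condition>)", as in the
    example output above. This is a hypothetical helper, not CSI-UTR code.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)

    # Map each condition name to the index of its PSI column.
    # "deltaPSI (...)" is skipped because the whole name must match.
    psi_cols = {}
    for idx, name in enumerate(header):
        match = re.fullmatch(r"PSI\d+ \((.+)\)", name.strip())
        if match:
            psi_cols[match.group(1)] = idx

    per_condition = {cond: [] for cond in psi_cols}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        for cond, idx in psi_cols.items():
            per_condition[cond].append((row[0], float(row[idx])))
    return per_condition


# Two rows copied from the example output, joined with tabs.
example = (
    "CSI\tENSGENE\tGENE_SYM\tPSI1 (LOAD)\tPSI2 (Control)\t"
    "deltaPSI (LOAD-Control)\tP-value\tFDR\n"
    "ENSG00000189241:116278517_116277027-116276921\tENSG00000189241\t"
    "TSPYL1\t0.068122\t0.089263\t-0.021141\t5e-05\t0.00679553001277139\n"
    "ENSG00000100796:91458759_91458759-91458147\tENSG00000100796\t"
    "PPP4R3A\t0.580057\t0.443966\t0.136091\t0.000605\t0.0425812764550264\n"
)

result = split_psi_by_condition(example)
for condition, records in result.items():
    print(condition, records)
```

Each list in `result` could then be written out as its own quantification file once coordinates are attached.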
@yuukiiwa Thanks for looking into this! :D From what I understand of the output, we could split the differential analysis output into two quantification BED files, one for each condition. But I'm going to tag @mrgazzara here for extra input :p I have 2 questions!
I will have to look into this a little bit further. The usual way to get individual sample quantification with tools like this that require multiple conditions (because they're more focused on differential analysis) is to run the tool with the same sample against itself. The requirement to also have a replicate might be a dealbreaker. I need to read the paper to see.
@mrgazzara When I implemented it, I think I tried running it with the same sample against itself but with different condition names. The replicates were also the same sample, and I named the "replicates" differently too, but it errored out. The only way I could run it was if the replicates were distinct.
Parent issue - https://github.com/iRNA-COSI/APAeval/issues/382
Per the updated execution workflow output specifications, we need CSI-UTR to report the per-PAS fractional relative usage in a format 04 BED file.
CSI-UTR calculates a number of relative usage metrics, but the one that fits the format 04 convention is the 'PSI' metric, which is equivalent to the percentage of usage of a polyA site relative to the total expression of PAS in that gene/terminal exon. If reported on the % scale (0-100), this needs to be converted to a fraction by dividing by 100.
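
As a minimal sketch of that conversion in Python: the helper names below (`psi_to_fraction`, `bed_line`) are hypothetical, and the six-column layout with the fraction in the score column is an assumption about format 04 that should be checked against the output specification. The chromosome, coordinates, name, and strand in the usage line are placeholders, not real CSI-UTR output.

```python
def psi_to_fraction(psi, percent_scale):
    """Convert a PSI value to a 0-1 fraction.

    percent_scale must be stated explicitly by the caller: a value such as
    0.5 is valid on both scales, so it cannot be detected reliably.
    """
    value = psi / 100.0 if percent_scale else psi
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"PSI {psi} is outside the expected range")
    return value


def bed_line(chrom, start, end, name, fraction, strand):
    """Format one tab-separated BED line (assumed layout: score = fraction)."""
    fields = [chrom, str(start), str(end), name, f"{fraction:.6f}", strand]
    return "\t".join(fields)


# Placeholder site: coordinates and strand are made up for illustration.
fraction = psi_to_fraction(0.505286, percent_scale=False)
line = bed_line("chr1", 100, 101, "site1", fraction, "-")
print(line)
```

Passing the scale explicitly avoids silently dividing values that are already fractions, which is the case for the differential analysis output discussed in this thread.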