comprna / SUPPA

SUPPA: Fast quantification of splicing and differential splicing
MIT License

ERROR:psiCalculator:PSI not calculated for event #201

Open maronem opened 1 week ago

maronem commented 1 week ago

When running psiPerEvent, I get NA values for all events. I used the GENCODE GRCh38.p14 GTF both for my event annotation and for my transcript abundance quantification with StringTie. I have seen that most issues arise from the expression file, so I have tried with transcript IDs matching what is in the .ioe file and what is in the annotation .gtf file, and both give me the same "PSI not calculated" error. If I search for a transcript from my expression file, I do get matches in the .ioe file, so I am not sure why a PSI value is not being calculated for transcripts that are present in both files. Any suggestions?

.ioe file:

(SUPPA2.3) Michaels-MacBook-Pro-3:SUPPA-2.3 michael$ head GENCODE_genEvent_SE_strict.ioe
seqname gene_id event_id    alternative_transcripts total_transcripts
chr1    ENSG00000286448.2   ENSG00000286448.2;SE:chr1:267056-270855:270984-273153:+ ENST00000784886.1   ENST00000784886.1,ENST00000784885.1
chr1    ENSG00000286448.2   ENSG00000286448.2;SE:chr1:267056-267946:268020-268122:+ ENST00000784890.1   ENST00000784890.1,ENST00000669836.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:826923-829003:829104-847654:+    ENST00000744865.1   ENST00000744870.1,ENST00000744868.1,ENST00000744865.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:829104-847654:847806-851927:+    ENST00000744865.1,ENST00000659124.3,ENST00000449005.8,ENST00000658846.2,ENST00000744874.1,ENST00000691316.1,ENST00000692194.1,ENST00000702098.2,ENST00000691234.3,ENST00000702847.1,ENST00000744891.1,ENST00000744878.1,ENST00000657175.1,ENST00000744888.1 ENST00000744890.1,ENST00000744881.1,ENST00000658846.2,ENST00000702847.1,ENST00000744888.1,ENST00000449005.8,ENST00000623070.6,ENST00000744878.1,ENST00000744879.1,ENST00000657837.1,ENST00000744891.1,ENST00000692194.1,ENST00000623808.4,ENST00000622921.1,ENST00000744882.1,ENST00000702557.2,ENST00000659124.3,ENST00000416570.7,ENST00000744873.1,ENST00000702273.1,ENST00000702098.2,ENST00000691316.1,ENST00000445118.7,ENST00000744874.1,ENST00000691234.3,ENST00000657175.1,ENST00000688309.1,ENST00000744880.1,ENST00000685764.3,ENST00000744865.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:847806-851927:852110-852671:+    ENST00000744865.1,ENST00000744868.1,ENST00000744869.1,ENST00000659124.3,ENST00000701598.2,ENST00000670780.1,ENST00000661237.3,ENST00000691316.1,ENST00000691234.3,ENST00000702847.1,ENST00000744878.1,ENST00000744888.1,ENST00000744894.1,ENST00000744896.1,ENST00000688420.2   ENST00000691316.1,ENST00000744868.1,ENST00000702847.1,ENST00000744888.1,ENST00000744896.1,ENST00000691234.3,ENST00000670780.1,ENST00000701598.2,ENST00000661237.3,ENST00000744878.1,ENST00000744894.1,ENST00000659124.3,ENST00000744876.1,ENST00000744869.1,ENST00000688420.2,ENST00000744865.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:852110-852671:852766-853391:+    ENST00000744865.1,ENST00000685334.2,ENST00000744868.1,ENST00000744879.1,ENST00000744869.1,ENST00000744870.1,ENST00000659124.3,ENST00000666741.3,ENST00000685764.3,ENST00000701598.2,ENST00000445118.7,ENST00000670780.1,ENST00000688008.3,ENST00000688309.1,ENST00000685566.2,ENST00000608189.6,ENST00000448975.7,ENST00000661237.3,ENST00000686238.2,ENST00000609139.7,ENST00000609009.6,ENST00000691316.1,ENST00000691234.3,ENST00000702557.2,ENST00000702847.1,ENST00000744886.1,ENST00000744880.1,ENST00000744887.1,ENST00000744878.1,ENST00000744881.1,ENST00000744883.1,ENST00000744884.1,ENST00000744888.1,ENST00000744892.1,ENST00000744894.1,ENST00000744895.1,ENST00000744896.1,ENST00000688420.2,ENST00000744899.1,ENST00000744898.1,ENST00000744904.1   ENST00000744881.1,ENST00000744883.1,ENST00000744904.1,ENST00000744899.1,ENST00000701598.2,ENST00000744894.1,ENST00000702847.1,ENST00000744887.1,ENST00000744888.1,ENST00000623070.6,ENST00000449005.8,ENST00000744870.1,ENST00000688008.3,ENST00000744892.1,ENST00000744895.1,ENST00000448975.7,ENST00000744886.1,ENST00000744879.1,ENST00000744878.1,ENST00000685334.2,ENST00000744891.1,ENST00000666741.3,ENST00000744868.1,ENST00000744896.1,ENST00000609139.7,ENST00000686238.2,ENST00000685566.2,ENST00000702557.2,ENST00000701768.1,ENST00000661237.3,ENST00000744869.1,ENST00000608189.6,ENST00000744898.1,ENST00000691316.1,ENST00000445118.7,ENST00000691234.3,ENST00000670780.1,ENST00000744877.1,ENST00000609009.6,ENST00000659124.3,ENST00000688309.1,ENST00000744880.1,ENST00000685764.3,ENST00000688420.2,ENST00000744865.1,ENST00000744884.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:826923-851927:852110-852671:+    ENST00000685334.2   ENST00000685334.2,ENST00000744866.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:826923-847654:847806-851927:+    ENST00000744868.1   ENST00000685334.2,ENST00000744868.1
chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:826923-852671:852766-853391:+    ENST00000744866.1   ENST00000744867.1,ENST00000744866.1

expression file showing no extra tabs in header:

head TPMs.txt | cat -t
VEH_A
ENST00000831248.1^I1.793814
ENST00000831153.1^I2.304462
ENST00000831668.1^I1.410982
ENST00000416931.1^I89.168289
ENST00000457540.1^I58.015953
ENST00000414273.1^I19.674627
ENST00000427426.1^I51.452415
ENST00000467115.1^I66.111641
ENST00000416718.2^I4.859468

Here is my command when running psiPerEvent:

$ python3.11 suppa.py psiPerEvent -e TPMs.txt -i GENCODE_genEvent_SE_strict.ioe -o VEH_SE

EduEyras commented 4 days ago

Hi, it's hard to see from the data what could be wrong. Perhaps you don't have values for all transcripts, and that is what's failing? E.

EduEyras commented 4 days ago

If a transcript does not have a value, it is not assumed to be zero; the event will produce an NA.
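
The rule above can be illustrated with a minimal sketch. This is not SUPPA's actual code: it assumes PSI is the summed abundance of the event's alternative transcripts divided by the summed abundance of all of the event's transcripts, the function name `psi_for_event` is hypothetical, and returning NaN for a zero denominator is also an assumption of the sketch.

```python
import math

def psi_for_event(alternative, total, tpm):
    """Return PSI for one event, or NaN if it cannot be computed.

    alternative, total: lists of transcript IDs (the alternative_transcripts
    and total_transcripts columns of one .ioe line).
    tpm: dict mapping transcript ID -> abundance for one sample.
    """
    # A transcript absent from the expression table is NOT treated as 0:
    # a single missing ID makes the whole event NA.
    if any(t not in tpm for t in total) or any(t not in tpm for t in alternative):
        return math.nan
    denominator = sum(tpm[t] for t in total)
    if denominator == 0:
        return math.nan  # no expression at all, so the ratio is undefined
    return sum(tpm[t] for t in alternative) / denominator
```

With made-up values `{"A": 1.0, "B": 3.0}`, the event `(["A"], ["A", "B"])` gives 0.25, while `(["A"], ["A", "C"])` gives NaN because `C` has no row in the table.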

maronem commented 1 day ago

To clarify, for each event in the respective .ioe file, such as:

chr1    ENSG00000228794.13  ENSG00000228794.13;SE:chr1:829104-847654:847806-851927:+    ENST00000744865.1,ENST00000659124.3,ENST00000449005.8,ENST00000658846.2,ENST00000744874.1,ENST00000691316.1,ENST00000692194.1,ENST00000702098.2,ENST00000691234.3,ENST00000702847.1,ENST00000744891.1,ENST00000744878.1,ENST00000657175.1,ENST00000744888.1 

So every transcript in the final column shown above also needs to be present in the TPM file; otherwise, the event will produce an NA?
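
One way to check this directly is to list, for each event, the transcripts that have no row in the expression file. The sketch below is only a diagnostic under stated assumptions: the .ioe file is tab-separated with the header shown earlier in the thread, the expression file has one header line of sample names followed by `ID<TAB>value` rows, and the helper names are hypothetical and not part of SUPPA. It checks the `total_transcripts` column, which in the lines shown above also contains every alternative transcript.

```python
import csv

def load_expression_ids(path):
    """Collect transcript IDs from the expression file
    (first line: sample names only; following lines: ID<TAB>values)."""
    with open(path, newline="") as fh:
        next(fh, None)  # skip the header line of sample names
        return {line.split("\t", 1)[0].strip() for line in fh if line.strip()}

def events_with_missing_transcripts(ioe_path, expressed_ids):
    """Yield (event_id, missing_ids) for every event whose
    total_transcripts column lists IDs absent from the expression file."""
    with open(ioe_path, newline="") as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            total = row["total_transcripts"].split(",")
            missing = [t for t in total if t not in expressed_ids]
            if missing:
                yield row["event_id"], missing
```

Calling `load_expression_ids("TPMs.txt")` and passing the result to `events_with_missing_transcripts("GENCODE_genEvent_SE_strict.ioe", ids)` would show whether every event is missing at least one transcript. The comparison is an exact string match, so a differing version suffix (for example `.1` versus `.2`) counts as missing.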