comprna / SUPPA

SUPPA: Fast quantification of splicing and differential splicing
MIT License

transcription of ioe or gtf #198

Closed XXuxi closed 1 month ago

XXuxi commented 2 months ago

Sorry to bother you, but I would like to understand in more detail how the alternative transcripts and total transcripts listed in an .ioe file are determined for each splicing event.

As we know, .ioe files are generated from a GTF file. Below, I describe my problem using an SE event.

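Fig1. from .ioe

Screenshot 2024-09-18 20.36.27 (https://github.com/user-attachments/assets/b8bb8433-d988-41e4-b71c-95533d7ae7ad)

Fig2. from GTF

Screenshot 2024-09-18 20.43.37 (https://github.com/user-attachments/assets/a682f077-77c6-4be7-8bbe-4c2d5635c12d)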

As shown in the figures, 45475888:45476073 is the position of the skipped exon. In the GTF file, I found that transcript ENST00000700129.1 also includes this exon, so I expected it to be listed among the alternative transcripts. However, it appears in neither the alternative nor the total transcripts (there are many other cases like this). So I would like to know exactly how the alternative transcripts and the total transcripts of each event are determined. Isn't it a simple match between an event and a transcript?

This question has bothered me for a long time. I look forward to your reply.

EduEyras commented 1 month ago

Hi,

The SE event is not defined by the alternative exon alone, but also by the flanking splice sites (see Figure 3 in https://github.com/comprna/SUPPA).

So if the transcript ENST00000700129.1 does not share those flanking splicing sites, it won't be included.
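To make the strict matching concrete, here is a minimal Python sketch of the idea, not SUPPA's actual code. It assumes an SE event is described by four splice sites: the end of the upstream exon (`e1`), the start and end of the skipped exon (`s2`, `e2`), and the start of the downstream exon (`s3`). A transcript counts as "alternative" (inclusion) only if it contains both junctions `e1-s2` and `e2-s3`, and the "total" set adds the transcripts that contain the skipping junction `e1-s3`. Only the skipped-exon coordinates come from this issue; the flanking coordinates, the transcript names, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Exon = Tuple[int, int]      # (start, end) in genomic coordinates, as in a GTF
Junction = Tuple[int, int]  # (end of one exon, start of the next exon)


def junctions(exons: List[Exon]) -> Set[Junction]:
    """Return the exon-exon junctions of one transcript, in genomic order."""
    ordered = sorted(exons)
    return {(ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)}


def classify_se(
    transcripts: Dict[str, List[Exon]], e1: int, s2: int, e2: int, s3: int
) -> Tuple[Set[str], Set[str]]:
    """Split transcripts into (alternative, total) sets for one SE event.

    Illustrative rule (an assumption, not taken from the SUPPA source):
    inclusion needs both junctions e1-s2 and e2-s3; skipping needs e1-s3.
    """
    alternative: Set[str] = set()
    total: Set[str] = set()
    for tid, exons in transcripts.items():
        j = junctions(exons)
        if (e1, s2) in j and (e2, s3) in j:
            alternative.add(tid)
            total.add(tid)
        elif (e1, s3) in j:
            total.add(tid)
    return alternative, total


# Skipped exon from the issue; flanking sites are made up for illustration.
E1, S2, E2, S3 = 45475000, 45475888, 45476073, 45477000

transcripts = {
    # Shares both flanking splice sites and includes the exon.
    "tx_incl": [(45474900, 45475000), (45475888, 45476073), (45477000, 45477100)],
    # Shares both flanking splice sites and skips the exon.
    "tx_skip": [(45474900, 45475000), (45477000, 45477100)],
    # Includes the exon, but its upstream exon ends at a different position.
    "tx_other": [(45475500, 45475600), (45475888, 45476073), (45477000, 45477100)],
}

alternative, total = classify_se(transcripts, E1, S2, E2, S3)
print(sorted(alternative))  # ['tx_incl']
print(sorted(total))        # ['tx_incl', 'tx_skip']
```

In this toy example `tx_other` contains the exon 45475888:45476073 but lands in neither list, which is the same kind of situation as the one described for ENST00000700129.1.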

This condition can be relaxed using "variable boundaries", i.e. -b V, as explained in Fig. 4.

This simply relaxes the constraint on the flanking splice sites up to some distance, i.e. it allows other transcripts to contribute to the inclusion of that exon even if the flanking exons do not match exactly.
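As a rough illustration of what relaxing the flanking splice sites "up to some distance" could look like, the sketch below keeps the skipped exon's own boundaries exact and lets the flanking sites differ by at most `tol` nucleotides. This is an assumption made for illustration only: the exact rule SUPPA applies with -b V is the one described in Fig. 4 of the README and may differ. All coordinates except the skipped exon, and all names, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Exon = Tuple[int, int]      # (start, end) in genomic coordinates
Junction = Tuple[int, int]  # (end of one exon, start of the next exon)


def junctions(exons: List[Exon]) -> Set[Junction]:
    """Return the exon-exon junctions of one transcript, in genomic order."""
    ordered = sorted(exons)
    return {(ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)}


def has_junction(
    juncs: Set[Junction], left: int, right: int, left_tol: int = 0, right_tol: int = 0
) -> bool:
    """True if some junction lies within the given tolerance on each side."""
    return any(
        abs(a - left) <= left_tol and abs(b - right) <= right_tol for a, b in juncs
    )


def classify_se_relaxed(
    transcripts: Dict[str, List[Exon]], e1: int, s2: int, e2: int, s3: int, tol: int
) -> Tuple[Set[str], Set[str]]:
    """(alternative, total) sets with flanking sites e1 and s3 relaxed by tol nt.

    The skipped exon boundaries s2 and e2 must still match exactly.
    """
    alternative: Set[str] = set()
    total: Set[str] = set()
    for tid, exons in transcripts.items():
        j = junctions(exons)
        included = has_junction(j, e1, s2, left_tol=tol) and has_junction(
            j, e2, s3, right_tol=tol
        )
        skipped = has_junction(j, e1, s3, left_tol=tol, right_tol=tol)
        if included:
            alternative.add(tid)
        if included or skipped:
            total.add(tid)
    return alternative, total


# Skipped exon from the issue; flanking sites are made up for illustration.
E1, S2, E2, S3 = 45475000, 45475888, 45476073, 45477000

transcripts = {
    "tx_incl": [(45474900, 45475000), (45475888, 45476073), (45477000, 45477100)],
    # Upstream exon ends 6 nt away from E1.
    "tx_near": [(45474900, 45475006), (45475888, 45476073), (45477000, 45477100)],
    # Upstream exon ends 600 nt away from E1.
    "tx_far": [(45475500, 45475600), (45475888, 45476073), (45477000, 45477100)],
}

strict_alt, strict_total = classify_se_relaxed(transcripts, E1, S2, E2, S3, tol=0)
relaxed_alt, relaxed_total = classify_se_relaxed(transcripts, E1, S2, E2, S3, tol=10)
print(sorted(strict_alt))   # ['tx_incl']
print(sorted(relaxed_alt))  # ['tx_incl', 'tx_near']
```

With a tolerance of 10 nt, `tx_near` starts contributing to inclusion, while `tx_far` is still left out because its upstream splice site is too far from the event's.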

I hope this helps

best

Eduardo
