wososa / PSI-Sigma


Some questions about “Event Type” #67

Closed happypiggyzjx closed 3 months ago

happypiggyzjx commented 3 months ago

Hi Woody, me again! I've recently started working with the PSI-Sigma output and have the following questions (based on the file PSIsigma_r10_ir3.sorted.txt):

  1. In the “Event Type” column of the file, what event types do MXS and MES represent respectively? Also, I observe that some of the results are preceded by a prefix like “TSS|”, what does this mean? The way I'm thinking of handling this is to treat both MXS and MES events as Mutually Exclusive Exons; remove all the “TSS|”-like prefixes, and use the rest as event types. Is this the right way to handle it? (In the meantime, I'd like to ask you to look at the other acronyms to see if I'm understanding them correctly: SES: Exon Skipping; IR: Intron Retention; A3SS/A5SS: Alternative 3' / 5' Splice Sites.)

  2. The only specific coordinates I found in the file are in the “Event Region” and “Target Exon” columns, four loci in total. I'm not sure what each of these coordinates represents; could you please explain? (The following is my own guess. For convenience, I've drawn a simple sketch in which the black dots mark the positions I think the coordinates in the file refer to.) [image: psisigma_event]

Best, happypiggy

wososa commented 3 months ago

Hi @happypiggyzjx ,

Glad to see your questions.

  1. MXS = mutually exclusive events. When PSI-Sigma finds two single-exon-splicing (SES) events that share the same Event Region, it changes their definition from SES to MXS. MES = multiple-exon-splicing events. Every MES event has multiple alternative exons in the middle, and each alternative exon is reported separately as an individual MES record with the same DB_id (database ID) prefix. For example, 12_109637927_109645261_W_ENSMUST00000181527_1 is the first exon of an MES event, whereas 12_109637927_109645261_W_ENSMUST00000181527_2 is the second exon of the same MES event. The TSS| prefix means transcriptional start/stop site. It was a request from a user who wanted to distinguish whether the event involves the first or the last exon. It is simply metadata, so feel free to remove it.
  2. Your drawing perfectly explains the coordinates of SES, IR, A3SS, and A5SS events. Very impressive. For MES and MXS events, the Event Region indicates the "longest intron" (first base of the first intron to the last base of the last intron). Target Exon indicates the coordinates of one of the alternative exons. The idea is that every exon has its own record, so that it's easier to capture exon sequences for downstream analysis.
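
For anyone scripting this, here is a minimal Python sketch of the handling described above: split off an optional `TSS|`-style prefix, map the remaining acronym to a readable label, and group MES records that share a DB_id prefix. The function names, the label wording, and the `(event_type, db_id, target_exon)` tuple layout are assumptions made for illustration, not part of PSI-Sigma. The sketch also assumes the trailing `_N` on an MES DB_id is the exon index, as in the example above, and the exon values are placeholders rather than real coordinates.

```python
from collections import defaultdict

# Readable labels for the acronyms discussed in this thread (wording is illustrative).
EVENT_LABELS = {
    "SES": "single-exon skipping",
    "MES": "multiple-exon skipping",
    "MXS": "mutually exclusive exons",
    "IR": "intron retention",
    "A3SS": "alternative 3' splice site",
    "A5SS": "alternative 5' splice site",
}


def split_event_type(raw):
    """Split 'TSS|SES' into ('TSS', 'SES'); a bare 'IR' gives (None, 'IR')."""
    prefix, sep, core = raw.strip().rpartition("|")
    return (prefix if sep else None), core


def label_event(raw):
    """Return a readable label, falling back to the acronym itself if unknown."""
    _, core = split_event_type(raw)
    return EVENT_LABELS.get(core, core)


def split_db_id(db_id):
    """Split an MES DB_id into (shared prefix, exon index).

    Assumes the last '_'-separated field is the exon index.
    """
    base, _, index = db_id.rpartition("_")
    return base, int(index)


def group_mes_exons(records):
    """Collect the target exons of each MES event, ordered by exon index.

    `records` is an iterable of (event_type, db_id, target_exon) tuples,
    a hypothetical layout chosen for this sketch.
    """
    groups = defaultdict(list)
    for event_type, db_id, target_exon in records:
        _, core = split_event_type(event_type)
        if core != "MES":
            continue  # only MES records carry one exon per record of a shared event
        base, index = split_db_id(db_id)
        groups[base].append((index, target_exon))
    return {base: [exon for _, exon in sorted(items)] for base, items in groups.items()}


# Placeholder exon values; real records would carry Target Exon coordinates.
records = [
    ("MES", "12_109637927_109645261_W_ENSMUST00000181527_2", "exon_B"),
    ("TSS|MES", "12_109637927_109645261_W_ENSMUST00000181527_1", "exon_A"),
    ("IR", "some_other_id", "exon_C"),
]
mes_groups = group_mes_exons(records)
```

Keeping the prefix as a separate value, rather than deleting it outright, lets you still filter on first/last-exon events later if that turns out to matter. Note that MES is mapped to multiple-exon skipping here and only MXS to mutually exclusive exons.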

Let me know if anything is unclear.

Best, Woody

happypiggyzjx commented 3 months ago

Hi Woody,

I have to admit that your explanations are really clear, and I must thank you again for your detailed, patient and quick replies, and I will continue to do my research on the subject. Cheers together! Have a wonderful day!

Best, happypiggy

wososa commented 3 months ago

@happypiggyzjx ,

I am happy to see your reply. You are very welcome. Have a wonderful weekend!

Cheers, Woody