jpromeror / EventPointer

R package for the identification and statistical analysis of alternative splicing events using junction arrays or RNASeq data

EventsFound_RNASeq.txt format #12

Closed OlgaVT closed 3 years ago

OlgaVT commented 4 years ago

Hi, I tried EventPointer, and in the Gene column of the EventsFound_RNASeq.txt file there are sometimes gene names and sometimes just numbers. An example is below:

EventID  Gene             Event Number  Event Type             Genomic Position   Path 1           Path 2   Path Reference
35_1     ENSG00000225905  1             Alternative Last Exon  1:1420564-1422657  1-2,2-2,2-3,3-3  1-4,4-4  1-1
82_1     82               1             Alternative Last Exon  1:6282888-6318492  3-4,4-4          3-6      1-1,1-2,2-2,2-3,3-3

Do you know how to interpret the numbers? Or could there be a problem with my input files (for example, I am using a custom GTF annotation file)?

Thank you!

jpromeror commented 4 years ago

Hi @OlgaVT,

Some of the genes show the Ensembl Gene ID instead of Gene Symbol because there isn't an "official" symbol yet. You can check that example here:

https://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000225905

As you can see, it is a novel transcript.

For that reason, we decided to provide Ensembl Gene IDs instead of symbols in those cases.

Let me know if there is anything else before closing the issue.

Best regards,

Juan Pablo

OlgaVT commented 4 years ago

Hi, thank you for the quick response. Is it right that "82" in the following row also means that the transcript is new and not annotated?

82_1 82 1 Alternative Last Exon 1:6282888-6318492 3-4,4-4 3-6 1-1,1-2,2-2,2-3,3-3

Thanks!

OlgaVT commented 4 years ago

Hi, one more question about the output. If I process several BAM files, will the "EventsFound_RNASeq.txt" file contain all events from all samples together? I probably overlooked this in the manual. Thank you!

jpromeror commented 4 years ago

Hi @OlgaVT

Yes. The EventsFound file contains all the events from all samples. Regarding the names, if the name appears as:

82_1 82 1 Alternative Last Exon 1:6282888-6318492 3-4,4-4 3-6 1-1,1-2,2-2,2-3,3-3

It does not necessarily mean that it is a novel transcript; it could be a pseudogene or something similar. It is best to check the genomic location and see what is annotated there.
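To make that location check easier, here is a minimal Python sketch that lists the events whose Gene column is not a gene symbol, together with their coordinates. It assumes the file is tab-delimited with the header row shown earlier in this thread, and that Ensembl gene IDs follow the human `ENSG` pattern; with a custom GTF the IDs may look different. The helper names (`classify_gene`, `parse_position`, `regions_to_inspect`) are hypothetical and not part of EventPointer.

```python
import csv
import io
import re

# Two rows copied from this thread. The tab-delimited layout is an assumption.
SAMPLE = (
    "EventID\tGene\tEvent Number\tEvent Type\tGenomic Position\t"
    "Path 1\tPath 2\tPath Reference\n"
    "35_1\tENSG00000225905\t1\tAlternative Last Exon\t1:1420564-1422657\t"
    "1-2,2-2,2-3,3-3\t1-4,4-4\t1-1\n"
    "82_1\t82\t1\tAlternative Last Exon\t1:6282888-6318492\t"
    "3-4,4-4\t3-6\t1-1,1-2,2-2,2-3,3-3\n"
)

# Matches positions written as chromosome:start-end, e.g. 1:6282888-6318492.
POSITION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def classify_gene(value):
    """Label a Gene column value by its shape only (hypothetical helper)."""
    if value.isdigit():
        return "numeric"
    if re.fullmatch(r"ENSG\d+(\.\d+)?", value):
        return "ensembl_id"
    return "symbol_or_other"


def parse_position(text):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    match = POSITION.match(text)
    if match is None:
        raise ValueError(f"unexpected Genomic Position: {text!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


def regions_to_inspect(handle):
    """Return (EventID, kind, chrom, start, end) for rows without a symbol."""
    regions = []
    for row in csv.DictReader(handle, delimiter="\t"):
        kind = classify_gene(row["Gene"])
        if kind != "symbol_or_other":
            chrom, start, end = parse_position(row["Genomic Position"])
            regions.append((row["EventID"], kind, chrom, start, end))
    return regions


regions = regions_to_inspect(io.StringIO(SAMPLE))
for event_id, kind, chrom, start, end in regions:
    print(f"{event_id}\t{kind}\t{chrom}:{start}-{end}")
```

For a real run, replace `io.StringIO(SAMPLE)` with an open handle on your own EventsFound_RNASeq.txt. The printed coordinates can then be pasted into a genome browser to see what is annotated at each region.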

OlgaVT commented 4 years ago

Hi, thank you for your reply!