Xinglab / espresso


Where to find the splice junctions of annotated isoforms from the annotation? #60

Open LilyLuyang opened 1 week ago

LilyLuyang commented 1 week ago

Hi author,

Thanks for developing this brilliant isoform analytical tool!

May I ask where I can find the splice junctions of annotated isoforms, and the types of alternative splicing events, from the annotation? I'm confused by this description in the Methods section: “Classifying alternative splicing events underlying tissue-specific transcript isoforms. Using the Ensembl BioMart database (Release 106, April 2022), we obtained canonical transcripts for genes with at least one tissue-specific transcript isoform discovered at FDR < 1% from ONT 1D cDNA sequencing data for 30 human tissues.”

I checked the transcript annotation in the Ensembl BioMart database, and it is similar to the GENCODE annotation.

Next, how are the types of alternative splicing events classified? The paper says: "We next compared the structure of each tissue-specific transcript isoform with the structure of the canonical transcript isoform for the corresponding gene, and we classified local differences in transcript structure into basic types of alternative splicing events, including exon skipping, alternative 5′ splice site usage, alternative 3′ splice site usage, mutually exclusive exon, intron retention, alternative first exon, and alternative last exon."

I would appreciate any advice that helps me understand this.

Thanks a lot.

Best, Lily

EricKutschera commented 1 week ago

This is the code for determining the tissue-specific transcripts: https://github.com/Xinglab/espresso/tree/main/tissue_specific_analysis

You can find canonical transcripts in the GENCODE annotation at https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_46/gencode.v46.primary_assembly.annotation.gtf.gz. After decompressing the file, run: grep Ensembl_canonical gencode.v46.primary_assembly.annotation.gtf
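To go from the canonical transcripts to their splice junctions, one option is to read the exon records of each tagged transcript and take the gaps between consecutive exons. The following is only a minimal sketch, not code from the espresso repository: the function name `canonical_junctions` is hypothetical, and it assumes GENCODE-style GTF attributes (`transcript_id "..."; tag "Ensembl_canonical";`) with 1-based inclusive coordinates. Junctions are reported here as intron start and end positions, which is a convention chosen for this example.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def canonical_junctions(gtf_lines, tag="Ensembl_canonical"):
    """Map each canonical transcript ID to its list of introns.

    Hypothetical helper for illustration. Each intron is returned as
    (chrom, strand, intron_start, intron_end), 1-based inclusive.
    """
    canonical = set()
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        chrom, _, feature, start, end, _, strand, _, attrs = fields[:9]
        pairs = ATTR_RE.findall(attrs)
        tx_id = next((v for k, v in pairs if k == "transcript_id"), None)
        if tx_id is None:
            continue  # gene-level records carry no transcript_id
        if feature == "transcript" and ("tag", tag) in pairs:
            canonical.add(tx_id)
        elif feature == "exon":
            exons[tx_id].append((chrom, strand, int(start), int(end)))

    junctions = {}
    for tx_id in canonical:
        # Sort exons by genomic start so neighbours are adjacent exons
        ordered = sorted(exons[tx_id], key=lambda e: e[2])
        junctions[tx_id] = [
            (left[0], left[1], left[3] + 1, right[2] - 1)
            for left, right in zip(ordered, ordered[1:])
        ]
    return junctions
```

The function accepts any iterable of lines, so the compressed download can be passed directly with `gzip.open(path, "rt")` from the standard library instead of decompressing first.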

Here is code for determining alternative splicing events similar to what was done in the espresso paper: https://github.com/Xinglab/rMATS-long/tree/592cb3268d16aa6bea3a6b79aedceac128563e3b?tab=readme-ov-file#classify-isoform-differences
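As a rough illustration of what "comparing the structure" of two isoforms can mean, here is a minimal sketch for one event type only, exon skipping. It is not the rMATS-long implementation: the function name `skipped_exons` is hypothetical, introns are assumed to be given as (start, end) tuples on the same chromosome and strand, and the other event types from the paper are not handled.

```python
def skipped_exons(canonical_introns, isoform_introns):
    """Return canonical exons (start, end) that the isoform skips.

    Hypothetical helper for illustration. An exon is reported when two
    consecutive canonical introns (a, x) and (y, b) are replaced in the
    isoform by a single intron (a, b) spanning the exon between them.
    """
    isoform_set = set(isoform_introns)
    ordered = sorted(canonical_introns)
    skipped = []
    for (a, x), (y, b) in zip(ordered, ordered[1:]):
        if (a, b) in isoform_set:
            # The exon lies between the end of the first intron
            # and the start of the second one.
            skipped.append((x + 1, y - 1))
    return skipped


# Canonical transcript has exons 100-200, 300-350, 400-500;
# the isoform joins 200 directly to 400.
print(skipped_exons([(201, 299), (351, 399)], [(201, 399)]))
# prints [(300, 350)]
```

Swapping the two arguments reports the opposite case, where the isoform includes an exon that the canonical transcript skips.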

LilyLuyang commented 1 week ago

Many thanks for your quick reply!

That's very helpful for me.

Best, Lily