Closed DiogoVeiga closed 6 years ago
By design, the GTF accepted by maser should be a GRanges object containing all the exons in the transcriptome. The GRanges has to contain the following fields:
"seqnames", "start", "end", "strand", "exon_id", "transcript_name"
Exons will be grouped into transcripts using "transcript_name". AnnotationHub objects automatically have this information.
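As a rough illustration of the expected shape, here is a toy GRanges with the required fields, assuming GenomicRanges is installed. The exon IDs and transcript name are made up, and the `split()` call only illustrates grouping by "transcript_name"; it is not necessarily how maser does the grouping internally.

```r
library(GenomicRanges)

# Toy exons; the IDs and the transcript name are invented for illustration.
exons <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(100, 300), end = c(200, 400)),
  strand = "+",
  exon_id = c("toy_exon_1", "toy_exon_2"),
  transcript_name = c("TOY-201", "TOY-201")
)

# seqnames, start, end and strand are part of every GRanges;
# exon_id and transcript_name must be present as metadata columns.
required <- c("exon_id", "transcript_name")
missing_cols <- setdiff(required, colnames(mcols(exons)))
stopifnot(length(missing_cols) == 0)

# Group exons by transcript name (returns a GRangesList).
by_transcript <- split(exons, exons$transcript_name)
```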
The package accepts rMATS results generated with any GTF, so the analysis can start from any annotation. However, mapping of events to transcripts uses GTFs from the Ensembl or Gencode hg38 assembly. These can be downloaded and read with readGFF, or retrieved from AnnotationHub.
Therefore, we recommend using the Ensembl or Gencode hg38 assembly throughout to keep things consistent.
R code for the following:
1. Download the appropriate GTF from AnnotationHub (GRanges).
2. Filter exons and transcripts. rMATS uses only these features of the GTF to detect alternative splicing.
3. Filter protein-coding transcripts.
4. Export the GTF to a file using rtracklayer.
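A minimal sketch of these steps, not the original code from this thread. It assumes an Ensembl-style GTF whose biotype column is called `transcript_biotype` (Gencode files typically use `transcript_type` instead). The helper `filter_for_rmats`, the query terms, the choice of the first matching record, and the output file name are all placeholders; inspect the query result and pick the record that matches your rMATS run.

```r
library(AnnotationHub)
library(GenomicRanges)
library(rtracklayer)

# Hypothetical helper: keep only exon and transcript records that belong
# to protein-coding transcripts. `biotype_col` is an assumption about the
# GTF flavour ("transcript_biotype" for Ensembl-style files).
filter_for_rmats <- function(gtf, biotype_col = "transcript_biotype") {
  stopifnot(biotype_col %in% colnames(mcols(gtf)))
  biotype <- mcols(gtf)[[biotype_col]]
  # %in% treats NA as FALSE, so records without a biotype are dropped.
  keep <- gtf$type %in% c("exon", "transcript") &
    biotype %in% "protein_coding"
  gtf[keep]
}

# The download step needs network access, so it only runs interactively.
if (interactive()) {
  ah <- AnnotationHub()
  # Query terms are illustrative; refine them for the release you need.
  hits <- query(ah, c("Homo sapiens", "GRCh38", "gtf"))
  hits <- hits[hits$rdataclass == "GRanges"]
  print(hits)

  # Placeholder choice: the first hit. Replace with the record ID you want.
  gtf <- hits[[names(hits)[1]]]

  filtered <- filter_for_rmats(gtf)
  export(filtered, "protein_coding_exons_transcripts.gtf", format = "gtf")
}
```

The exported file can then be passed to rMATS, and the same filtered GRanges can be reused in maser so both steps see the same annotation.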