Open pcantalupo opened 3 years ago
Hello, I'm trying to import the following very simple one-line GTF file (taken from the Gencode v38 GTF here) that has multiple tag attributes:
$ cat multipletag.gtf
chr6 HAVANA transcript 10723070 10731127 . + . gene_id "ENSG00000111843.14"; transcript_id "ENST00000229563.6"; gene_type "protein_coding"; gene_name "TMEM14C"; transcript_type "protein_coding"; transcript_name "TMEM14C-201"; level 2; protein_id "ENSP00000229563.5"; transcript_support_level "1"; hgnc_id "HGNC:20952"; tag "basic"; tag "Ensembl_canonical"; tag "MANE_Select"; tag "appris_principal_1"; tag "CCDS"; ccdsid "CCDS4514.1"; havana_gene "OTTHUMG00000014242.2"; havana_transcript "OTTHUMT00000039829.2";
When I import it into R, only the last tag attribute, CCDS, is parsed:
> library(rtracklayer)
> gtf = import("~/tmp/multipletag.gtf")
> gtf
GRanges object with 1 range and 18 metadata columns:
      seqnames            ranges strand |   source       type     score     phase
         <Rle>         <IRanges>  <Rle> | <factor>   <factor> <numeric> <integer>
  [1]     chr6 10723070-10731127      + |   HAVANA transcript        NA      <NA>
                 gene_id     transcript_id      gene_type   gene_name transcript_type
             <character>       <character>    <character> <character>     <character>
  [1] ENSG00000111843.14 ENST00000229563.6 protein_coding     TMEM14C  protein_coding
      transcript_name       level        protein_id transcript_support_level
          <character> <character>       <character>              <character>
  [1]     TMEM14C-201           2 ENSP00000229563.5                        1
          hgnc_id         tag      ccdsid          havana_gene    havana_transcript
      <character> <character> <character>          <character>          <character>
  [1]  HGNC:20952        CCDS  CCDS4514.1 OTTHUMG00000014242.2 OTTHUMT00000039829.2
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
How do I get rtracklayer to preserve all tag attributes for each GTF line?
Hi, I'm seeing the same issue; see my related issue. I used read_gtf() from valr, which in turn depends on rtracklayer.
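In case it helps anyone landing here, below is a minimal workaround sketch in base R that pulls every value of a repeated key straight out of the raw attribute string (column 9 of the GTF), rather than relying on the parsed metadata columns. The helper name collect_gtf_attribute is hypothetical and not part of rtracklayer or valr, and the example string is a shortened copy of the attribute field from the line above. It assumes values are double-quoted, as the tag values are here.

```r
# Hypothetical helper (NOT part of rtracklayer or valr): collect every value of
# a repeated GTF attribute key from raw attribute strings (GTF column 9).
collect_gtf_attribute <- function(attributes, key = "tag") {
  # Match `key "value"` only at the start of the string or right after a ";",
  # so that asking for "id" would not pick up "gene_id" or "hgnc_id".
  pattern <- sprintf('(?:^|;\\s*)%s\\s+"([^"]*)"', key)
  hits <- regmatches(attributes, gregexpr(pattern, attributes, perl = TRUE))
  # Keep only the quoted value (capture group 1) of each match.
  lapply(hits, function(x) sub(pattern, "\\1", x, perl = TRUE))
}

# Shortened attribute field from the example line in this issue.
attrs <- paste0(
  'gene_id "ENSG00000111843.14"; hgnc_id "HGNC:20952"; ',
  'tag "basic"; tag "Ensembl_canonical"; tag "MANE_Select"; ',
  'tag "appris_principal_1"; tag "CCDS"; ccdsid "CCDS4514.1";'
)

tags <- collect_gtf_attribute(attrs)
print(tags[[1]])
```

The function returns a list with one character vector per input string, so it can be applied to the whole ninth column after reading the file with something like read.delim(file, header = FALSE, comment.char = "#", quote = ""). Matching the result back to the rows of an imported object is left out here, since that depends on how the importer orders and filters records.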