BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Missing gene annotations for exons #349

Open mschubert opened 5 years ago

mschubert commented 5 years ago

Hi,

When I download legacy exon quantification data with TCGAbiolinks, the @rowRanges slot of the resulting SummarizedExperiment contains only the start, end, and chromosome of each exon.

It would be great if it also contained annotations for the gene each exon belongs to (e.g. Ensembl gene ID, HGNC name).

tiagochst commented 5 years ago

Hello,

Could you please send the query you used? Thanks!

mschubert commented 5 years ago

This is the query:

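library(TCGAbiolinks)

# Query legacy exon quantification for one project
query <- GDCquery(project = "TCGA-ACC", # or any other
                  data.category = "Gene expression",
                  data.type = "Exon quantification",
                  legacy = TRUE)
GDCdownload(query)
# <myfile> is a placeholder for the output file name
GDCprepare(query, save = TRUE, save.filename = <myfile>)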

Loading this data gives:

> data
class: RangedSummarizedExperiment
dim: 239322 79
metadata(1): data_release
assays(3): raw_counts median_length_normalized RPKM
rownames(239322): chr10:100003848-100004653:+ ...
rowData names(0):
...

> data@rowRanges
GRanges object with 239322 ranges and 0 metadata columns:
                              seqnames              ranges strand
                                 <Rle>           <IRanges>  <Rle>
  chr10:100003848-100004653:+    chr10 100003848-100004653      +
  chr10:100007443-100008748:-    chr10 100007443-100008748      -
  chr10:100010822-100010933:-    chr10 100010822-100010933      -
  ...
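
To illustrate the kind of annotation being asked for, here is a minimal base R sketch (not part of TCGAbiolinks). It parses exon IDs of the form shown in the rownames above and assigns each exon to every gene it overlaps on the same chromosome and strand. The `genes` table, its IDs (`GENE_A`, `GENE_B`) and coordinates, and the helper `annotate_exons` are all hypothetical placeholders; in practice the gene table would have to come from an annotation source that matches the genome build of the legacy data, and for the full 239322 exons an interval-based method such as `GenomicRanges::findOverlaps` would likely be a better fit than this row-by-row loop.

```r
# Exon IDs in the form "chr:start-end:strand", as in the rownames above
exon_ids <- c("chr10:100003848-100004653:+",
              "chr10:100007443-100008748:-",
              "chr10:100010822-100010933:-")

# Split each ID into chromosome, coordinate range, and strand
parts  <- do.call(rbind, strsplit(exon_ids, ":", fixed = TRUE))
coords <- do.call(rbind, strsplit(parts[, 2], "-", fixed = TRUE))
exons <- data.frame(
  exon_id = exon_ids,
  chr     = parts[, 1],
  start   = as.integer(coords[, 1]),
  end     = as.integer(coords[, 2]),
  strand  = parts[, 3],
  stringsAsFactors = FALSE
)

# HYPOTHETICAL gene table: IDs and coordinates are made up for illustration
genes <- data.frame(
  gene_id = c("GENE_A", "GENE_B"),
  chr     = c("chr10", "chr10"),
  start   = c(100000000L, 100007000L),
  end     = c(100005000L, 100011000L),
  strand  = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each exon, collect the IDs of all genes that
# overlap it on the same chromosome and strand (joined by ";")
annotate_exons <- function(exons, genes) {
  exons$gene_id <- vapply(seq_len(nrow(exons)), function(i) {
    hit <- genes$chr == exons$chr[i] &
      genes$strand == exons$strand[i] &
      genes$start <= exons$end[i] &
      genes$end >= exons$start[i]
    paste(genes$gene_id[hit], collapse = ";")
  }, character(1))
  exons
}

annotated <- annotate_exons(exons, genes)
print(annotated[, c("exon_id", "gene_id")])
```

An exon that overlaps no gene gets an empty string, and one that overlaps several genes gets all of their IDs, so ambiguous assignments stay visible rather than being silently resolved.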