Open j-andrews7 opened 3 years ago
Hello @j-andrews7, sorry, I received the dataset already normalized and annotated with the subtypes: https://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/subtypes.html
Please check the file SuppTable1-34-TCGAID.txt in the data folder of this repository to see if it helps you retrieve the subtypes.
Thanks, and sorry for the delay in answering.
Similar question: it seems like none of the supplementary tables are available on the article web page. Did I miss something?
Hi,
I'm having trouble creating the TCGA.RNA.Rda object. I downloaded all the data from the site and tried to combine them into one Rda object, but noticed that the symbol column is missing.
The SummarizedExperiment object downloaded from here does not actually contain the column "gene symbol".
## Gene expression aligned against hg38
query <- GDCquery(
  project = "TCGA-GBM",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "HTSeq - FPKM-UQ",
  barcode = c("TCGA-14-0736-02A-01R-2005-01", "TCGA-06-0211-02A-02R-2005-01")
)
GDCdownload(query)
data <- GDCprepare(query)
class: RangedSummarizedExperiment
dim: 56602 2
metadata(1): data_release
assays(1): HTSeq - FPKM-UQ
rownames(56602): ENSG00000000003 ENSG00000000005 ... ENSG00000281912
ENSG00000281920
rowData names(3): ensembl_gene_id external_gene_name
original_ensembl_gene_id
colnames(2): TCGA-14-0736-02A-01R-2005-01 TCGA-06-0211-02A-02R-2005-01
colData names(105): barcode patient ...
paper_Telomere.length.estimate.in.blood.normal..Kb.
paper_Telomere.length.estimate.in.tumor..Kb.
The site does say the following, so I think the symbol information is no longer available:
Unfortunately, some of the updates changes/remove gene symbols, change coordinates, etc. Which might introduce some loss of data. For example, if the gene was removed we cannot map it anymore and that information will be lost in the SummarizedExperiment.
Would you be able to give us directions on how to create the TCGA.RNA.Rda object?
Thanks, Rashindrie
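On the missing symbol column: the rowData names printed above include external_gene_name, which may already hold the gene symbols. That is an assumption based on the Ensembl/biomaRt attribute of the same name, not something confirmed in this thread. A minimal sketch of pulling that column out, using a toy SummarizedExperiment (made-up values, hypothetical sample names) as a stand-in for the GDCprepare() output:

```r
library(SummarizedExperiment)

# Toy stand-in for the object returned by GDCprepare(); expression values are made up.
expr <- matrix(
  c(10, 0, 5, 12, 1, 7),
  nrow = 3,
  dimnames = list(
    c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
    c("sample_A", "sample_B")
  )
)
genes <- DataFrame(
  ensembl_gene_id = rownames(expr),
  external_gene_name = c("TSPAN6", "TNMD", "DPM1")
)
se <- SummarizedExperiment(assays = list(fpkm_uq = expr), rowData = genes)

# Read the symbols from the row annotation and put them next to the expression values.
symbols <- rowData(se)$external_gene_name
expr_df <- data.frame(symbol = symbols, assay(se), check.names = FALSE)
print(expr_df)
```

With the real object, the same two lines (rowData(data)$external_gene_name and assay(data)) would apply if the assumption holds; genes without a mapped name would need separate handling.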
Hello Rashindrie, thanks for bringing it to my attention.
You can download the R object containing the RNA-seq expression and sample annotation using the following link https://cedars.app.box.com/v/RNA-TCGA-Pancancer
Let me know if it works!
It works, Thanks @mabraao!
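For anyone else downloading the file: a minimal sketch of inspecting an .Rda file before using it. The name of the downloaded file and the objects stored in it are not stated in this thread, so the path and the toy object below are placeholders.

```r
# Stand-in file so the sketch runs on its own; replace rda_path with the downloaded file.
rda_path <- file.path(tempdir(), "TCGA.RNA.Rda")
toy <- data.frame(symbol = c("TSPAN6", "TNMD"), sample_A = c(10, 0))
save(toy, file = rda_path)

# Load into a fresh environment so nothing in the current workspace is overwritten.
env <- new.env()
loaded <- load(rda_path, envir = env)

# load() returns the names of the objects stored in the file.
print(loaded)
str(env[[loaded[1]]], max.level = 1)
```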
What are the chances the TCGA data used in the README or the code used to collect it from TCGAbiolinks could be made available?