Open luisgls opened 2 years ago
Hi,
Yes, GDC removed the Gene Level Copy Number Scores data from the website. We only have the following data types now.
You can find more info in the GDC documentation:
https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/CNV_Pipeline/
https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/#whole-genome-sequencing-variant-calling
Thanks. On the same issue, when I run
query4 <- GDCquery(project = "TCGA-UCEC", data.category = "Copy Number Variation")
getResults(query4)
I still see "Gene Level Copy Number" among the data types, but GDCquery does not work with that data type either.
query5 <- GDCquery(project = "TCGA-STAD", data.category = "Copy Number Variation", data.type = "Gene Level Copy Number")
Has this data type also been removed? The GDC documentation describes gene-level ASCAT values, but again, I can't access them through GDCquery.
Error in checkDataTypeInput(legacy = legacy, data.type = data.type) :
Please set a data.type argument from the column harmonized.data.type above
Can you check that you have the latest version from GitHub?
You can install with the following commands:
BiocManager::install("BioinformaticsFMRP/TCGAbiolinksGUI.data")
BiocManager::install("BioinformaticsFMRP/TCGAbiolinks")
Restart R and run:
STAD <- GDCquery(
    project = "TCGA-STAD",
    data.category = "Copy Number Variation",
    data.type = "Gene Level Copy Number"
)
GDCdownload(STAD, files.per.chunk = 50)
gene.level.copy.number <- GDCprepare(STAD)
Hi, I am also facing the same issue:
query <- GDCquery(
    project = "TCGA-GBM",
    data.category = "Copy Number Variation",
    data.type = "Gene Level Copy Number",
    access = "open",
    legacy = FALSE
)
GDCdownload(query)
cnv_data <- GDCprepare(query)
> cnv_data
class: RangedSummarizedExperiment
dim: 60623 542
metadata(1): data_release
assays(3): copy_number min_copy_number max_copy_number
rownames(60623): ENSG00000223972.5 ENSG00000227232.5 ...
ENSG00000182484.15_PAR_Y ENSG00000227159.8_PAR_Y
rowData names(2): gene_id gene_name
colnames(542): TCGA-12-0615-01A-01D-0310-01,TCGA-12-0615-10A-01D-0310-01
TCGA-14-1456-01B-01D-0784-01,TCGA-14-1456-10A-01D-0784-01 ...
TCGA-06-0133-01A-02D-0214-01,TCGA-06-0133-10A-01D-0214-01
TCGA-06-0140-01A-01D-0214-01,TCGA-06-0140-10A-01D-0214-01
colData names(108): barcode patient ...
paper_Telomere.length.estimate.in.blood.normal..Kb.
paper_Telomere.length.estimate.in.tumor..Kb.
How do I extract the copy_number information from this? I also cannot see the cytoband information, which was available before and is useful.
You should be able to access the information with the code below:
library(SummarizedExperiment)
info <- rowRanges(cnv_data)
copy_number <- SummarizedExperiment::assay(cnv_data,"copy_number")
min_copy_number <- SummarizedExperiment::assay(cnv_data,"min_copy_number")
max_copy_number <- SummarizedExperiment::assay(cnv_data,"max_copy_number")
You can also set summarizedExperiment = FALSE in GDCprepare to get the data as a table instead of a SummarizedExperiment object.
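If you only need one gene, the assay matrix can be subset using the gene_name column that the printed object lists under rowData. The following is a minimal sketch, not the maintainers' recommended method: it builds a tiny toy object (cnv_toy, with made-up gene names and values) so that it runs without downloading anything. With real data you would use cnv_data in place of cnv_toy.

```r
library(SummarizedExperiment)

# Toy stand-in for `cnv_data`; gene names and copy-number values are made up.
toy_cn <- matrix(
    c(2, 3, 4, 1),
    nrow = 2,
    dimnames = list(c("ENSG_A.1", "ENSG_B.1"), c("sample1", "sample2"))
)
cnv_toy <- SummarizedExperiment(
    assays = list(copy_number = toy_cn),
    rowData = S4Vectors::DataFrame(
        gene_id = rownames(toy_cn),
        gene_name = c("GENE_A", "GENE_B")
    )
)

# Logical index of the rows whose gene symbol matches the gene of interest.
keep <- rowData(cnv_toy)$gene_name == "GENE_B"

# drop = FALSE keeps the result a matrix even when only one gene matches.
gene_cn <- assay(cnv_toy, "copy_number")[keep, , drop = FALSE]
gene_cn
```

The result has one row per matching gene and one column per sample, so it can be transposed or merged with colData as needed.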
Thank you @tiagochst - any idea how to access cytoband info or should I get that from a separate resource like biomart?
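One possible approach, assuming you have a cytoband table from a separate resource (for example the UCSC cytoBand track) loaded as a GRanges object, is to overlap it with the gene coordinates. This is only a sketch: the genes and bands objects below are toy data with made-up coordinates and band names. With real data, genes would be rowRanges(cnv_data), and the band table must use the same genome build and chromosome naming.

```r
library(GenomicRanges)

# Toy gene coordinates; with real data use rowRanges(cnv_data) instead.
genes <- GRanges(
    seqnames = c("chr1", "chr1"),
    ranges = IRanges(start = c(100, 5000), end = c(200, 5100)),
    gene_name = c("GENE_A", "GENE_B")
)

# Toy cytoband table (hypothetical band names and coordinates).
bands <- GRanges(
    seqnames = c("chr1", "chr1"),
    ranges = IRanges(start = c(1, 1001), end = c(1000, 9000)),
    name = c("band1", "band2")
)

# Find which band each gene overlaps.
hits <- findOverlaps(genes, bands)

# Genes with no overlapping band stay NA; if a gene spans several bands,
# the last overlapping band is the one kept by this simple assignment.
genes$cytoband <- NA_character_
genes$cytoband[queryHits(hits)] <- bands$name[subjectHits(hits)]
genes
```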
Hi!
3 weeks ago I downloaded some copy number data using the following code:
Now I'm trying to repeat the analysis, and suddenly it complains about the data category. Any thoughts?