Closed KKen357252 closed 1 year ago
Same problem here since last Friday! It was working fine last Thursday.
Try opening the failing URL directly; this message appears in the browser:
"The GDC Legacy Archive is no longer available. For assistance, contact the GDC Help Desk at support@nci-gdc.datacommons.io"
See also the retirement announcement: https://gdc.cancer.gov/news-and-announcements/gdc-legacy-archive-retires
Ah, I see. If that is so, are there any alternatives to the legacy GDC query that you would recommend for analyzing a TCGA dataset? I am new to the field, and this is in fact my first attempt at analyzing a TCGA dataset with specific samples. Any recommendation would be much appreciated. Thank you in advance.
The legacy database no longer exists. We have been updating the package to remove legacy support since last week.
@KKen357252 The code below will access the data aligned against hg38.
library(TCGAbiolinks)

listSamples <- c(
'TCGA-D6-6516', 'TCGA-T2-A6X2', 'TCGA-BA-4077', 'TCGA-BA-A6DJ', 'TCGA-CV-5442', 'TCGA-CQ-5327',
'TCGA-CX-7086', 'TCGA-CV-6441', 'TCGA-CV-5970', 'TCGA-CV-6952', 'TCGA-T3-A92N', 'TCGA-CR-7390',
'TCGA-CV-7102', 'TCGA-QK-A8Z7', 'TCGA-4P-AA8J', 'TCGA-CR-6491', 'TCGA-BA-6871', 'TCGA-CV-6951',
'TCGA-IQ-A61G', 'TCGA-CN-6020', 'TCGA-CV-6954', 'TCGA-CV-6953', 'TCGA-C9-A47Z', 'TCGA-CV-5439',
'TCGA-QK-AA3K', 'TCGA-CV-5436', 'TCGA-CV-6003', 'TCGA-CV-5977', 'TCGA-CV-6436', 'TCGA-CV-5976',
'TCGA-F7-A61W', 'TCGA-CV-5973', 'TCGA-C9-A480', 'TCGA-CV-6433', 'TCGA-CQ-7065', 'TCGA-CV-5979',
'TCGA-CQ-A4CH', 'TCGA-CV-5971'
)
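# Query the harmonized (hg38) STAR count files for the selected patients
query <- GDCquery(
  project = 'TCGA-HNSC',
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts",
  barcode = listSamples
)
# Download the matching files, then read them into a single R object
GDCdownload(query = query)
data <- GDCprepare(query = query)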
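To confirm that the query returned files for every requested patient, one option is to compare the first 12 characters of the returned barcodes against listSamples. This is only a sketch: the helper name missingPatients is hypothetical (it is not part of TCGAbiolinks), the aliquot suffixes in the toy example are made up, and the commented-out last line assumes that getResults(query) returns a `cases` column holding the full barcodes.

```r
# Hypothetical helper (not part of TCGAbiolinks): return the requested
# patient barcodes that have no match among the returned case barcodes.
missingPatients <- function(requested, returnedCases) {
  # A TCGA patient barcode is the first 12 characters, e.g. 'TCGA-D6-6516'
  foundPatients <- unique(substr(returnedCases, 1, 12))
  setdiff(requested, foundPatients)
}

# Toy example; the aliquot suffixes below are made up for illustration
requested <- c("TCGA-D6-6516", "TCGA-T2-A6X2", "TCGA-BA-4077")
returned <- c("TCGA-D6-6516-01A-11R-0000-07", "TCGA-BA-4077-01B-01R-0000-07")
notFound <- missingPatients(requested, returned)
print(notFound)

# With a real query (assumes a 'cases' column in the query results):
# missingPatients(listSamples, getResults(query)$cases)
```

An empty result means every requested patient has at least one file in the query.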
Well, you can try it directly in the data portal UI.
See my issue here: https://github.com/BioinformaticsFMRP/TCGAbiolinks/issues/578
Maybe the code from @tiagochst will help you.
Thank you very much for the replies. Yes, the code works! Thank you once again and wishing you a good day ahead.
Hi,
My R version is 4.3.0 and my libraries are up to date. When I try to use GDCquery to look for data, it fails with an error opening the connection. The error log is printed below. It worked fine last week, but now it no longer works and I cannot proceed with the TCGA dataset. Is there any way I can fix this? Thank you in advance.
Below are the code I am trying to run and the error log:
listSamples <- c('TCGA-D6-6516', 'TCGA-T2-A6X2', 'TCGA-BA-4077', 'TCGA-BA-A6DJ', 'TCGA-CV-5442', 'TCGA-CQ-5327', 'TCGA-CX-7086', 'TCGA-CV-6441', 'TCGA-CV-5970', 'TCGA-CV-6952', 'TCGA-T3-A92N', 'TCGA-CR-7390', 'TCGA-CV-7102', 'TCGA-QK-A8Z7', 'TCGA-4P-AA8J', 'TCGA-CR-6491', 'TCGA-BA-6871', 'TCGA-CV-6951', 'TCGA-IQ-A61G', 'TCGA-CN-6020', 'TCGA-CV-6954', 'TCGA-CV-6953', 'TCGA-C9-A47Z', 'TCGA-CV-5439', 'TCGA-QK-AA3K', 'TCGA-CV-5436', 'TCGA-CV-6003', 'TCGA-CV-5977', 'TCGA-CV-6436', 'TCGA-CV-5976', 'TCGA-F7-A61W', 'TCGA-CV-5973', 'TCGA-C9-A480', 'TCGA-CV-6433', 'TCGA-CQ-7065', 'TCGA-CV-5979', 'TCGA-CQ-A4CH', 'TCGA-CV-5971')
TCGA <- GDCquery(project = 'TCGA-HNSC', data.category = 'Gene expression', data.type = 'Gene expression quantification', platform = 'Illumina HiSeq', file.type = 'results', barcode = listSamples, legacy = TRUE )
Error log:
o GDCquery: Searching in GDC database
Genome of reference: hg19
Error in open.connection(con, "rb") :
  cannot open the connection to 'https://api.gdc.cancer.gov/legacy/projects/TCGA-HNSC?expand=summary,summary.data_categories&pretty=true'
In addition: Warning message:
In open.connection(con, "rb") :
  cannot open URL 'https://api.gdc.cancer.gov/legacy/projects/TCGA-HNSC?expand=summary,summary.data_categories&pretty=true': HTTP status was '410 Gone'