BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Error with GDCquery for CPTAC-3 #569

Closed: SuiYinG2000 closed this issue 1 year ago

SuiYinG2000 commented 1 year ago

Hi, I am trying to download RNA-seq data from the CPTAC-3 project with the code below, using the latest version of TCGAbiolinks (2.27.2), but the same error occurs every time. When I run the same code for another project, such as CPTAC-2 or TCGA-GBM, the error does not occur.

CODE:

    query <- GDCquery(
        project = "CPTAC-3",
        data.category = "Transcriptome Profiling",
        data.type = "Gene Expression Quantification",
        experimental.strategy = "RNA-Seq",
        workflow.type = "STAR - Counts"
    )

ERROR:

--------------------------------------

o GDCquery: Searching in GDC database

--------------------------------------

Genome of reference: hg38

--------------------------------------------

oo Accessing GDC. This might take a while...

--------------------------------------------

ooo Project: CPTAC-3

Error: Error in getURL(url, fromJSON, timeout(600), simplifyDataFrame = TRUE): 'getURL()' failed: URL: https://api.gdc.cancer.gov/files/?pretty=true&expand=cases,cases.samples.portions.analytes.aliquots,cases.project,center,analysis,cases.samples&size=7266&filters=%7B%22op%22:%22and%22,%22content%22:[%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22cases.project.project_id%22,%22value%22:[%22CPTAC-3%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.experimental_strategy%22,%22value%22:[%22RNA-Seq%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.data_category%22,%22value%22:[%22Transcriptome%20Profiling%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.data_type%22,%22value%22:[%22Gene%20Expression%20Quantification%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.analysis.workflow_type%22,%22value%22:[%22STAR%20-%20Counts%22]%7D%7D]%7D&format=JSON error: cannot open the connection

We will retry to access GDC!

--------------------

oo Filtering results

--------------------

ooo By experimental.strategy

ooo By data.type

ooo By workflow.type

----------------

oo Checking data

----------------

ooo Checking if there are duplicated cases

ooo Checking if there are results for the query

-------------------

o Preparing output

-------------------

Warning messages:

1: In open.connection(con, "rb") : InternetOpenUrl failed: 'Operation timeout'

2: In open.connection(con, "rb") : InternetOpenUrl failed: 'Operation timeout'

3: In open.connection(con, "rb") : InternetOpenUrl failed: 'Operation timeout'

THANKS!!!

Nshriwash commented 1 year ago

This error is caused by a timeout while connecting to the GDC URL, as the warnings at the end of the output show. However, GDCquery automatically retries the connection (see "We will retry to access GDC!" in the log). Reaching "Preparing output" means the query returned results, so in this case it worked. As far as I know, there is nothing to worry about here.
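If you want to confirm this yourself instead of relying on the log, a minimal sketch along these lines might help. It assumes that GDCquery raises a regular R error when all of its own retries fail (I have not verified that), and `with_retry` and `query_cptac3_counts` are hypothetical helper names, not part of TCGAbiolinks. Only `GDCquery` and `getResults` come from the package.

```r
# Generic retry helper (hypothetical name, not part of TCGAbiolinks).
# Calls f() up to max_tries times, sleeping wait_sec seconds between
# failed attempts, and re-raises the last error if every attempt fails.
with_retry <- function(f, max_tries = 3, wait_sec = 5) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      list(ok = TRUE, value = f()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) return(result$value)
    last_error <- result$error
    message(sprintf("Attempt %d/%d failed: %s",
                    attempt, max_tries, conditionMessage(last_error)))
    if (attempt < max_tries) Sys.sleep(wait_sec)
  }
  stop(last_error)
}

# Runs the query from this issue and reports how many files matched.
# Defined but not called here, because it needs TCGAbiolinks and
# network access to the GDC API.
query_cptac3_counts <- function() {
  query <- with_retry(function() {
    TCGAbiolinks::GDCquery(
      project = "CPTAC-3",
      data.category = "Transcriptome Profiling",
      data.type = "Gene Expression Quantification",
      experimental.strategy = "RNA-Seq",
      workflow.type = "STAR - Counts"
    )
  })
  # getResults() returns the matched files as a data frame.
  files <- TCGAbiolinks::getResults(query)
  message(sprintf("Query matched %d files", nrow(files)))
  query
}
```

Calling `query <- query_cptac3_counts()` should then print the number of matched files. A non-zero count is a more direct sign that the query succeeded than the "Preparing output" banner alone.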