Bioconductor / GenomicDataCommons

Provide R access to the NCI Genomic Data Commons portal.
http://bioconductor.github.io/GenomicDataCommons/

Timeout was reached: [api.gdc.cancer.gov] #113

Closed by mansi-aggarwal-2504 1 year ago

mansi-aggarwal-2504 commented 1 year ago

Hi,

I have been unable to connect to the GDC server for a few hours now.

My query:

query <- GDCquery(project = "TCGA-SKCM",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  experimental.strategy = "RNA-Seq")

Errors that I have been getting:

[Screenshot of error, image not preserved: "Screen Shot 2023-10-04 at 8 29 51 pm"]

or

[Screenshot of error, image not preserved: "Screen Shot 2023-10-04 at 7 06 30 pm"]

My session info:

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.54.1              stringdist_0.9.10           stringr_1.5.0              
 [4] dplyr_1.1.3                 SummarizedExperiment_1.28.0 Biobase_2.58.0             
 [7] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9         IRanges_2.32.0             
[10] S4Vectors_0.36.2            BiocGenerics_0.44.0         MatrixGenerics_1.10.0      
[13] matrixStats_1.0.0           DT_0.29                     TCGAbiolinks_2.29.6        
[16] BiocManager_1.30.22        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7                fs_1.6.3                    usethis_2.2.2              
 [4] devtools_2.4.5              bit64_4.0.5                 filelock_1.0.2             
 [7] progress_1.2.2              httr_1.4.7                  tools_4.2.1                
[10] profvis_0.3.8               utf8_1.2.3                  R6_2.5.1                   
[13] DBI_1.1.3                   colorspace_2.1-0            urlchecker_1.0.1           
[16] tidyselect_1.2.0            prettyunits_1.1.1           processx_3.8.2             
[19] curl_5.0.2                  bit_4.0.5                   compiler_4.2.1             
[22] rvest_1.0.3                 cli_3.6.1                   xml2_1.3.5                 
[25] DelayedArray_0.24.0         scales_1.2.1                readr_2.1.4                
[28] callr_3.7.3                 rappdirs_0.3.3              digest_0.6.33              
[31] rmarkdown_2.24              XVector_0.38.0              pkgconfig_2.0.3            
[34] htmltools_0.5.6             sessioninfo_1.2.2           dbplyr_2.3.3               
[37] fastmap_1.1.1               htmlwidgets_1.6.2           rlang_1.1.1                
[40] rstudioapi_0.15.0           RSQLite_2.3.1               shiny_1.7.5                
[43] generics_0.1.3              jsonlite_1.8.7              RCurl_1.98-1.12            
[46] magrittr_2.0.3              GenomeInfoDbData_1.2.9      Matrix_1.5-4               
[49] Rcpp_1.0.11                 munsell_0.5.0               fansi_1.0.4                
[52] lifecycle_1.0.3             stringi_1.7.12              yaml_2.3.7                 
[55] zlibbioc_1.44.0             plyr_1.8.8                  BiocFileCache_2.6.1        
[58] pkgbuild_1.4.2              grid_4.2.1                  blob_1.2.4                 
[61] parallel_4.2.1              promises_1.2.1              crayon_1.5.2               
[64] miniUI_0.1.1.1              lattice_0.21-8              Biostrings_2.66.0          
[67] KEGGREST_1.38.0             hms_1.1.3                   knitr_1.43                 
[70] ps_1.7.5                    pillar_1.9.0                TCGAbiolinksGUI.data_1.15.1
[73] pkgload_1.3.2.1             XML_3.99-0.14               glue_1.6.2                 
[76] evaluate_0.21               downloader_0.4              data.table_1.14.8          
[79] remotes_2.4.2.1             png_0.1-8                   vctrs_0.6.3                
[82] tzdb_0.4.0                  httpuv_1.6.11               tidyr_1.3.0                
[85] gtable_0.3.4                purrr_1.0.2                 cachem_1.0.8               
[88] ggplot2_3.4.3               xfun_0.40                   mime_0.12                  
[91] xtable_1.8-4                later_1.3.1                 tibble_3.2.1               
[94] AnnotationDbi_1.60.2        memoise_2.0.1               ellipsis_0.3.2           

I have tried reloading the package again as suggested here: https://www.biostars.org/p/321591/ and increasing R session timeout as suggested here: https://www.biostars.org/p/9542366/
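Editorial note: the session timeout mentioned above is controlled by base R's timeout option, whose default of 60 seconds matches the warning quoted later in this thread. The following is a minimal sketch of raising it and probing the GDC status endpoint directly. The helper gdc_status_text is hypothetical and written only for illustration; it is not part of TCGAbiolinks or GenomicDataCommons, and whether TCGAbiolinks honours the option internally is an assumption.

```r
# Raise the timeout (in seconds) that base R applies to some internet operations.
# The previous settings are kept so they can be restored with options(old_opts).
old_opts <- options(timeout = 300)

# Hypothetical helper: return the raw body served at the GDC status URL,
# or NA if the request fails, warns, or times out.
gdc_status_text <- function(status_url = "https://api.gdc.cancer.gov/status") {
  con <- url(status_url)
  on.exit(close(con), add = TRUE)
  tryCatch(
    paste(readLines(con, warn = FALSE), collapse = "\n"),
    error = function(e) NA_character_,
    warning = function(w) NA_character_
  )
}

# Usage (needs network access, so it is left commented out):
# gdc_status_text()
```

An NA result from this check would point to the network path or the server rather than to the query itself.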

Also, the data has already been downloaded to my local machine, and I want to read it in the desired format using GDCprepare. Is there a way to do that without connecting to the API again?

Thanks, Mansi

vjcitn commented 1 year ago

Thanks for your note. The package in question is TCGAbiolinks. I just tried your query without difficulty:

> query <- GDCquery(project = "TCGA-SKCM",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  experimental.strategy = "RNA-Seq")
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-SKCM
--------------------
oo Filtering results
--------------------
ooo By experimental.strategy
ooo By data.type
----------------
oo Checking data
----------------
ooo Checking if there are duplicated cases
ooo Checking if there are results for the query
-------------------
o Preparing output
-------------------

It seems you have a problem with network connectivity, or bad luck with server downtime. Could you get the relevant information using curatedTCGAData package? Apropos GDCprepare: Please contact the author of TCGAbiolinks if the documentation does not address your concern.

mansi-aggarwal-2504 commented 1 year ago

Hello,

I tried the same query today and it ran without errors. But now preparing the data gives me a server downtime error:

> rna_data <- GDCprepare(query = query, directory = "GDCdata_RNA")
Warning: URL 'https://api.gdc.cancer.gov/status': Timeout of 60 seconds was reached
Error in value[[3L]](cond) : 
  GDC server down, try to use this package later
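
Editorial note: the thread suggests the failure is transient, since the same query succeeded a day later. One generic way to cope is to retry the call a few times with a pause in between. The sketch below assumes that approach; with_retry is a hypothetical helper, not a TCGAbiolinks function, and it only helps if the server recovers within the retry window.

```r
# Hypothetical helper: call fn() up to `attempts` times, pausing `wait_sec`
# seconds between failed attempts, and re-raise the last error if all fail.
with_retry <- function(fn, attempts = 3, wait_sec = 30) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    # Capture an error as a value instead of aborting the loop.
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message(sprintf("Attempt %d of %d failed: %s",
                    i, attempts, conditionMessage(result)))
    if (i < attempts) Sys.sleep(wait_sec)
  }
  stop(last_error)
}

# Usage with the call from this thread (needs TCGAbiolinks, the `query`
# object, and network access, so it is left commented out):
# rna_data <- with_retry(function() {
#   GDCprepare(query = query, directory = "GDCdata_RNA")
# })
```

Wrapping the call in a function delays its evaluation, so each attempt re-runs it from scratch.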

LiNk-NY commented 1 year ago

Hi Mansi, @mansi-aggarwal-2504

Please direct your questions to the appropriate repository: https://github.com/BioinformaticsFMRP/TCGAbiolinks

Best regards, Marcel