BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Timeout was reached: [api.gdc.cancer.gov] #604

Closed. mansi-aggarwal-2504 closed this issue 1 year ago

mansi-aggarwal-2504 commented 1 year ago

Hi,

I have been unable to connect to the GDC server for a few hours now.

My query:

query <- GDCquery(project = "TCGA-SKCM",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  experimental.strategy = "RNA-Seq")

Errors that I have been getting:

[Two screenshots of the error messages, taken 2023-10-04; images not available]

My session info:

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.54.1              stringdist_0.9.10           stringr_1.5.0              
 [4] dplyr_1.1.3                 SummarizedExperiment_1.28.0 Biobase_2.58.0             
 [7] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9         IRanges_2.32.0             
[10] S4Vectors_0.36.2            BiocGenerics_0.44.0         MatrixGenerics_1.10.0      
[13] matrixStats_1.0.0           DT_0.29                     TCGAbiolinks_2.29.6        
[16] BiocManager_1.30.22        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7                fs_1.6.3                    usethis_2.2.2              
 [4] devtools_2.4.5              bit64_4.0.5                 filelock_1.0.2             
 [7] progress_1.2.2              httr_1.4.7                  tools_4.2.1                
[10] profvis_0.3.8               utf8_1.2.3                  R6_2.5.1                   
[13] DBI_1.1.3                   colorspace_2.1-0            urlchecker_1.0.1           
[16] tidyselect_1.2.0            prettyunits_1.1.1           processx_3.8.2             
[19] curl_5.0.2                  bit_4.0.5                   compiler_4.2.1             
[22] rvest_1.0.3                 cli_3.6.1                   xml2_1.3.5                 
[25] DelayedArray_0.24.0         scales_1.2.1                readr_2.1.4                
[28] callr_3.7.3                 rappdirs_0.3.3              digest_0.6.33              
[31] rmarkdown_2.24              XVector_0.38.0              pkgconfig_2.0.3            
[34] htmltools_0.5.6             sessioninfo_1.2.2           dbplyr_2.3.3               
[37] fastmap_1.1.1               htmlwidgets_1.6.2           rlang_1.1.1                
[40] rstudioapi_0.15.0           RSQLite_2.3.1               shiny_1.7.5                
[43] generics_0.1.3              jsonlite_1.8.7              RCurl_1.98-1.12            
[46] magrittr_2.0.3              GenomeInfoDbData_1.2.9      Matrix_1.5-4               
[49] Rcpp_1.0.11                 munsell_0.5.0               fansi_1.0.4                
[52] lifecycle_1.0.3             stringi_1.7.12              yaml_2.3.7                 
[55] zlibbioc_1.44.0             plyr_1.8.8                  BiocFileCache_2.6.1        
[58] pkgbuild_1.4.2              grid_4.2.1                  blob_1.2.4                 
[61] parallel_4.2.1              promises_1.2.1              crayon_1.5.2               
[64] miniUI_0.1.1.1              lattice_0.21-8              Biostrings_2.66.0          
[67] KEGGREST_1.38.0             hms_1.1.3                   knitr_1.43                 
[70] ps_1.7.5                    pillar_1.9.0                TCGAbiolinksGUI.data_1.15.1
[73] pkgload_1.3.2.1             XML_3.99-0.14               glue_1.6.2                 
[76] evaluate_0.21               downloader_0.4              data.table_1.14.8          
[79] remotes_2.4.2.1             png_0.1-8                   vctrs_0.6.3                
[82] tzdb_0.4.0                  httpuv_1.6.11               tidyr_1.3.0                
[85] gtable_0.3.4                purrr_1.0.2                 cachem_1.0.8               
[88] ggplot2_3.4.3               xfun_0.40                   mime_0.12                  
[91] xtable_1.8-4                later_1.3.1                 tibble_3.2.1               
[94] AnnotationDbi_1.60.2        memoise_2.0.1               ellipsis_0.3.2          

I have tried reloading the package again as suggested here: https://www.biostars.org/p/321591/ and increasing R session timeout as suggested here: https://www.biostars.org/p/9542366/
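
For readers following along: the timeout increase mentioned above is presumably R's global `timeout` option, which defaults to 60 seconds (the linked post's exact advice is not reproduced here, so this is an assumption). A minimal sketch that raises it only for the duration of one call; `with_timeout` is a hypothetical helper, not part of TCGAbiolinks, and whether every request TCGAbiolinks makes honors this option is also an assumption:

```r
# Hypothetical helper: temporarily raise R's global "timeout" option,
# evaluate an expression, then restore the previous value.
with_timeout <- function(seconds, expr) {
  old <- options(timeout = seconds)  # options() returns the old settings
  on.exit(options(old), add = TRUE)  # restore even if expr fails
  force(expr)                        # expr is lazy, so it runs under the new timeout
}

# The option is raised inside the call and restored afterwards.
inside <- with_timeout(600, getOption("timeout"))
print(inside)
```

In practice the expression would be the network call itself, for example `with_timeout(600, GDCquery(...))` with the arguments from the query above.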

Also, the data has already been downloaded on my local machine and I want to read it in the desired format using GDCprepare. Is there any other way to do that without connecting to API again?
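
One possible workaround, sketched here under assumptions rather than confirmed by this thread: save the query object to disk the first time `GDCquery` succeeds, and reload it on later runs so that step does not need the API again. `cached_object` and the file name below are hypothetical; whether `GDCprepare` itself then runs without any network access is not established here.

```r
# Hypothetical helper: return the object stored at `path` if it exists,
# otherwise build it once with `build()` and store it for next time.
cached_object <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  obj <- build()
  saveRDS(obj, path)
  obj
}

# Intended use (not run here; requires TCGAbiolinks and, the first time, the API):
# query <- cached_object("skcm_query.rds", function() {
#   GDCquery(project = "TCGA-SKCM",
#            data.category = "Transcriptome Profiling",
#            data.type = "Gene Expression Quantification",
#            experimental.strategy = "RNA-Seq")
# })
# data <- GDCprepare(query)
```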

Thanks, Mansi

tiagochst commented 1 year ago

It is working on my side. Do you still have the issue? It could be that GDC was down for maintenance, or that a firewall is blocking access.

[Screenshot of the query running successfully, taken 2023-10-05; image not available]

mansi-aggarwal-2504 commented 1 year ago

It was quite strange as https://api.gdc.cancer.gov/status said that the status was OK. It surprisingly worked the next morning.

Thanks!
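
The status endpoint mentioned above can also be checked from inside the same R session, which tests the path R itself uses (its libcurl build, proxy settings, firewall rules) rather than the browser's. A minimal sketch using only base R; `check_endpoint` is a hypothetical helper, not a TCGAbiolinks function:

```r
# Hypothetical helper: try to read a URL and report success or the
# warning/error text instead of stopping.
check_endpoint <- function(endpoint, timeout_sec = 10) {
  old <- options(timeout = timeout_sec)
  on.exit(options(old), add = TRUE)
  tryCatch({
    con <- url(endpoint, open = "r")
    on.exit(close(con), add = TRUE)
    body <- readLines(con, warn = FALSE)
    list(ok = TRUE, detail = paste(body, collapse = "\n"))
  },
  # R usually raises a warning with the underlying reason before the error.
  warning = function(w) list(ok = FALSE, detail = conditionMessage(w)),
  error = function(e) list(ok = FALSE, detail = conditionMessage(e)))
}

status <- check_endpoint("https://api.gdc.cancer.gov/status")
print(status$ok)
print(status$detail)
```

If `ok` is `FALSE` here while the same URL loads in a browser, the problem is more likely local to the R session than on the GDC side.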

she3o commented 2 weeks ago

Hi, I think this issue should be reopened. I consistently get "SSL"-related errors. For example, just now:

... {targets} related output ...
... btw I also get errors without {targets} ...
URL 'https://api.gdc.cancer.gov/cases/?pretty=true&expand=samples,project,diagnoses,diagnoses.treatments,annotations,family_histories,demographic,exposures&size=10&filters=%7B%22op%22:%22or%22,%22content%22:[%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22cases.submitter_id%22,%22value%22:["TCGA-5L-AAT0-01A","TCGA-A2-A04U-01A","TCGA-AN-A04A-01A","TCGA-A7-A13D-01A","TCGA-BH-A201-01A","TCGA-BH-A0H6-01A","TCGA-A2-A0YL-01A","TCGA-A2-A04R-01A","TCGA-AN-A03X-01A","TCGA-AC-A3EH-01A%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22submitter_sample_ids%22,%22value%22:["TCGA-5L-AAT0-01A","TCGA-A2-A04U-01A","TCGA-AN-A04A-01A","TCGA-A7-A13D-01A","TCGA-BH-A201-01A","TCGA-BH-A0H6-01A","TCGA-A2-A0YL-01A","TCGA-A2-A04R-01A","TCGA-AN-A03X-01A","TCGA-AC-A3EH-01A%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22submitter_aliquot_ids%22,%22value%22:["TCGA-5L-AAT0-01A","TCGA-A2-A04U-01A","TCGA-AN-A04A-01A","TCGA-A7-A13D-01A","TCGA-BH-A201-01A","TCGA-BH-A0H6-01A","TCGA-A2-A0 [... truncated]
3: URL 'https://api.gdc.cancer.gov/status': status was 'SSL connect error'
... {targets} related output ...

A project I was working on used to work fine, then I started getting "SSL" errors. I didn't report it for a long time because I assumed that a system upgrade had also upgraded OpenSSL and somehow broken linking, and I couldn't imagine the GDC server being down so often. However, I recently reproduced these errors in a Docker container, which I assume downloads its own system libraries. So I can only guess the cause is a firewall, the GDC server, or TCGAbiolinks itself.

Anyway, regardless of the cause, one thing that could be improved is for GDCprepare to log information that helps users diagnose the problem.

I will try to look into this and share anything interesting.
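
As a rough illustration of the kind of diagnostic information that could accompany an SSL-related report, the sketch below collects the R version and the libcurl/SSL backend that R is linked against, using base R only. `collect_ssl_info` is a hypothetical helper; it is not something TCGAbiolinks or `GDCprepare` currently provides.

```r
# Hypothetical helper: gather version details relevant to SSL/TLS failures.
collect_ssl_info <- function() {
  lc <- libcurlVersion()  # base R; carries attributes describing the build
  list(
    r_version       = R.version.string,
    libcurl         = as.character(lc),
    ssl_backend     = attr(lc, "ssl_version"),
    https_supported = "https" %in% attr(lc, "protocols")
  )
}

str(collect_ssl_info())
```

Comparing this output between a working and a failing environment (for example, host versus container) may help narrow down whether the SSL library is involved.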