lawremi / rtracklayer

R interface to genome annotation files and the UCSC genome browser

getTable() not working with hg38 UCSC session #105

Closed kkmarkel closed 7 months ago

kkmarkel commented 7 months ago

Hi!

I want to get the gap track from UCSC for the hg38 assembly. Everything worked fine a week ago, but since yesterday I keep getting an error:

mySession <- browserSession()
genome(mySession) <- 'hg38'
gaps <- getTable(ucscTableQuery(mySession, table="gap"))
>> Error in errorHandler(responseError) : Internal Server Error

The error persists if I try with a different track, e.g.:

gaps <- getTable(ucscTableQuery(mySession, table="tRNAs"))
>> Error in errorHandler(responseError) : Internal Server Error

The session itself seems fine, though, and trackNames() still works:

trackNames(ucscTableQuery(mySession))
>> 1000 Genomes Trios 
>> "tgpTrios" 
>> 1000G Ph3 Vars 
>> "tgpPhase3"
...

I tried the hg19 assembly and it worked fine:

genome(mySession) <- 'hg19'
gaps <- getTable(ucscTableQuery(mySession, table="gap"))
head(gaps)
# works

Not sure what to do with this...

Here is my sessionInfo():

R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /opt/miniconda3/envs/fragmentome/lib/libopenblasp-r0.3.23.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Europe/Moscow
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] biovizBase_1.50.0                      
 [2] devtools_2.4.5                         
 [3] usethis_2.2.2                          
 [4] Rsamtools_2.18.0                       
 [5] BSgenome.Hsapiens.UCSC.hg38_1.4.5      
 [6] BSgenome_1.70.1                        
 [7] BiocIO_1.12.0                          
 [8] Biostrings_2.70.1                      
 [9] XVector_0.42.0                         
[10] Homo.sapiens_1.3.1                     
[11] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[12] org.Hs.eg.db_3.18.0                    
[13] GO.db_3.18.0                           
[14] OrganismDbi_1.44.0                     
[15] GenomicFeatures_1.54.1                 
[16] AnnotationDbi_1.64.1                   
[17] Biobase_2.62.0                         
[18] rtracklayer_1.62.0                     
[19] GenomicRanges_1.54.1                   
[20] GenomeInfoDb_1.38.1                    
[21] IRanges_2.36.0                         
[22] S4Vectors_0.40.1                       
[23] BiocGenerics_0.48.1                    

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3          rstudioapi_0.15.0          
  [3] magrittr_2.0.3              rmarkdown_2.25             
  [5] fs_1.6.3                    zlibbioc_1.48.0            
  [7] vctrs_0.6.4                 memoise_2.0.1              
  [9] RCurl_1.98-1.13             base64enc_0.1-3            
 [11] htmltools_0.5.7             S4Arrays_1.2.0             
 [13] progress_1.2.2              curl_5.0.1                 
 [15] SparseArray_1.2.2           Formula_1.2-5              
 [17] htmlwidgets_1.6.2           cachem_1.0.8               
 [19] GenomicAlignments_1.38.0    mime_0.12                  
 [21] lifecycle_1.0.4             pkgconfig_2.0.3            
 [23] Matrix_1.6-3                R6_2.5.1                   
 [25] fastmap_1.1.1               GenomeInfoDbData_1.2.11    
 [27] MatrixGenerics_1.14.0       shiny_1.7.5.1              
 [29] digest_0.6.33               colorspace_2.1-0           
 [31] ps_1.7.5                    pkgload_1.3.3              
 [33] Hmisc_5.1-1                 RSQLite_2.3.3              
 [35] filelock_1.0.2              fansi_1.0.5                
 [37] httr_1.4.7                  abind_1.4-5                
 [39] compiler_4.3.1              remotes_2.4.2.1            
 [41] bit64_4.0.5                 backports_1.4.1            
 [43] htmlTable_2.4.2             BiocParallel_1.36.0        
 [45] DBI_1.1.3                   pkgbuild_1.4.2             
 [47] biomaRt_2.58.0              rappdirs_0.3.3             
 [49] DelayedArray_0.28.0         sessioninfo_1.2.2          
 [51] rjson_0.2.21                tools_4.3.1                
 [53] foreign_0.8-85              httpuv_1.6.12              
 [55] nnet_7.3-19                 glue_1.6.2                 
 [57] restfulr_0.0.15             callr_3.7.3                
 [59] promises_1.2.1              grid_4.3.1                 
 [61] checkmate_2.3.0             cluster_2.1.4              
 [63] generics_0.1.3              gtable_0.3.4               
 [65] ensembldb_2.26.0            data.table_1.14.8          
 [67] hms_1.1.3                   xml2_1.3.5                 
 [69] utf8_1.2.4                  pillar_1.9.0               
 [71] stringr_1.5.1               later_1.3.1                
 [73] dplyr_1.1.3                 BiocFileCache_2.10.1       
 [75] lattice_0.22-5              bit_4.0.5                  
 [77] tidyselect_1.2.0            RBGL_1.78.0                
 [79] miniUI_0.1.1.1              knitr_1.45                 
 [81] gridExtra_2.3               ProtGenerics_1.34.0        
 [83] SummarizedExperiment_1.32.0 xfun_0.41                  
 [85] matrixStats_1.1.0           stringi_1.8.1              
 [87] lazyeval_0.2.2              yaml_2.3.7                 
 [89] evaluate_0.23               codetools_0.2-19           
 [91] tibble_3.2.1                BiocManager_1.30.22        
 [93] graph_1.80.0                cli_3.6.1                  
 [95] rpart_4.1.21                xtable_1.8-4               
 [97] munsell_0.5.0               processx_3.8.2             
 [99] dichromat_2.0-0.1           Rcpp_1.0.11                
[101] dbplyr_2.4.0                png_0.1-8                  
[103] XML_3.99-0.15               parallel_4.3.1             
[105] ellipsis_0.3.2              ggplot2_3.4.4              
[107] blob_1.2.4                  prettyunits_1.2.0          
[109] profvis_0.3.8               AnnotationFilter_1.26.0    
[111] urlchecker_1.0.1            bitops_1.0-7               
[113] VariantAnnotation_1.48.0    scales_1.2.1               
[115] purrr_1.0.2                 crayon_1.5.2               
[117] rlang_1.1.2                 KEGGREST_1.42.0

sanchit-saini commented 7 months ago

Workaround

library(rtracklayer)
mySession <- browserSession()
genome(mySession) <- 'hg38'
q <- ucscTableQuery(mySession, table="gap", url="https://genome-asia.ucsc.edu/cgi-bin/")
gaps <- getTable(q)

Cause

library(rtracklayer)
mySession <- browserSession()
genome(mySession) <- 'hg38'
q <- ucscTableQuery(mySession, table="gap")  # default URL, to reproduce the failing request

options(verbose = TRUE)
gaps <- getTable(q)
# READ: https://genome.ucsc.edu/cgi-bin/hubApi/getData/track?genome=hg38&track=gap

Opening the API request URL above shows the following error:

{ "downloadTime": "2023:11:24T09:44:32Z", "downloadTimeStamp": 1700819072, "error": "can not find schema definition for table 'gap', genome: 'hg38'", "statusCode": 500, "statusMessage": "Internal Server Error" }
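
For anyone who wants to inspect that response from R rather than a browser, here is a minimal offline sketch. It assumes the jsonlite package is installed; `hub_api_url()` and `hub_api_error()` are hypothetical helper names, not part of rtracklayer. The URL pattern and the JSON body are copied from the request and response quoted above.

```r
library(jsonlite)  # assumption: jsonlite is installed

# Hypothetical helper: build the hubApi "getData/track" URL for one host.
hub_api_url <- function(host, genome, track) {
  sprintf("https://%s/cgi-bin/hubApi/getData/track?genome=%s&track=%s",
          host, genome, track)
}

# Hypothetical helper: return the "error" field of a hubApi JSON body,
# or NULL when the body has no such field.
hub_api_error <- function(json_text) {
  body <- fromJSON(json_text)
  body[["error"]]
}

# The response body quoted above, pasted verbatim (raw string, R >= 4.0).
failed <- r"({ "downloadTime": "2023:11:24T09:44:32Z", "downloadTimeStamp": 1700819072, "error": "can not find schema definition for table 'gap', genome: 'hg38'", "statusCode": 500, "statusMessage": "Internal Server Error"})"

request_url <- hub_api_url("genome.ucsc.edu", "hg38", "gap")
msg <- hub_api_error(failed)

print(request_url)
print(msg)
```

To check a live server, the same `hub_api_error()` could be fed the text fetched from `request_url`; that step needs network access and is left out here.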

I am not sure why the table is missing on the Europe mirror. However, switching to the Asia mirror works, because the gap table is available there.

https://genome.ucsc.edu/cgi-bin/hubApi/getData/track?genome=hg38&track=gap
https://genome-asia.ucsc.edu/cgi-bin/hubApi/getData/track?genome=hg38&track=gap

Note: I took the mirror information from https://genome.ucsc.edu/mirror.html
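
If you would rather not hard-code a single mirror, one option is to try several in turn and keep the first one that answers. This is only a sketch: `first_working_mirror()` is a hypothetical helper, not an rtracklayer function, and the mirror list contains just the two base URLs mentioned in this thread. The rtracklayer call is commented out because it needs network access.

```r
# Hypothetical helper: call fetch(url) for each mirror in order and return
# the first result that does not raise an error.
first_working_mirror <- function(urls, fetch) {
  for (u in urls) {
    result <- tryCatch(fetch(u), error = function(e) e)
    if (!inherits(result, "error")) {
      return(list(url = u, value = result))
    }
    message("Mirror failed: ", u, " (", conditionMessage(result), ")")
  }
  stop("No mirror returned the requested table")
}

# Assumption: these two base URLs, taken from the thread above.
mirrors <- c(
  "https://genome.ucsc.edu/cgi-bin/",
  "https://genome-asia.ucsc.edu/cgi-bin/"
)

# Usage with rtracklayer (requires network access):
# library(rtracklayer)
# mySession <- browserSession()
# genome(mySession) <- "hg38"
# hit <- first_working_mirror(mirrors, function(u) {
#   getTable(ucscTableQuery(mySession, table = "gap", url = u))
# })
# head(hit$value)
```

Passing the fetch step in as a function keeps the retry logic separate from rtracklayer, so it can be exercised with a stub before pointing it at the real servers.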

kkmarkel commented 7 months ago

@sanchit-saini Thank you so much! Changing the Mirror to Asia worked well.