lawremi / rtracklayer

R interface to genome annotation files and the UCSC genome browser

genome(session) <- "hs1" doesn't work #84

Closed hpages closed 11 months ago

hpages commented 1 year ago
library(rtracklayer)

session <- browserSession()
genome(session)
# [1] "hg38"

"mm10" %in% ucscGenomes()$db
# [1] TRUE

genome(session) <- "mm10"
genome(session)
# [1] "mm10"

"hs1" %in% ucscGenomes()$db
# [1] TRUE

genome(session) <- "hs1"
# Error in `genome<-`(`*tmp*`, value = "hs1") : 
#   Failed to set session genome to 'hs1'

Also note that something strange happened to session:

genome(session)
# [1] "hub_3671779_hs1"

sessionInfo():

R Under development (unstable) (2023-02-26 r83908)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-4.3.r83908/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] rtracklayer_1.59.1   GenomicRanges_1.51.4 GenomeInfoDb_1.35.16
[4] IRanges_2.33.0       S4Vectors_0.37.4     BiocGenerics_0.45.2 

loaded via a namespace (and not attached):
 [1] crayon_1.5.2                DelayedArray_0.25.0        
 [3] SummarizedExperiment_1.29.1 GenomicAlignments_1.35.1   
 [5] rjson_0.2.21                RCurl_1.98-1.10            
 [7] Biostrings_2.67.0           XML_3.99-0.13              
 [9] MatrixGenerics_1.11.0       Biobase_2.59.0             
[11] grid_4.3.0                  restfulr_0.0.15            
[13] bitops_1.0-7                yaml_2.3.7                 
[15] compiler_4.3.0              codetools_0.2-19           
[17] XVector_0.39.0              BiocParallel_1.33.9        
[19] lattice_0.20-45             BiocIO_1.9.2               
[21] parallel_4.3.0              GenomeInfoDbData_1.2.9     
[23] Matrix_1.5-3                tools_4.3.0                
[25] matrixStats_0.63.0          Rsamtools_2.15.1           
[27] zlibbioc_1.45.0            

hpages commented 11 months ago

@lawremi @sanchit-saini

Looks like #93 somehow addresses this, with the following gotcha:

library(rtracklayer)
session <- browserSession()
genome(session) <- "hs1"

genome(session)
# [1] "hub_3671779_hs1"

So this sets the precedent that the genome() getter doesn't bring back the genome supplied by the user.

That would be ok if the mysterious genome name was considered valid by the setter, but that doesn't seem to be the case. In particular, there's the strong expectation that something like genome(session) <- genome(session) will always work and be a no-op:

genome(session) <- genome(session)
# Error in `genome<-`(`*tmp*`, value = "hub_3671779_hs1") : 
#   Failed to set session genome to 'hub_3671779_hs1'

Would you guys consider having the getter pass the internal genome name through sub(".*_", "", genome) before returning it to the user?
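For illustration, here is what that sub() call does to the names seen in this thread. This is a base-R sketch only and does not touch rtracklayer. The anchored pattern and the name "my_asm" are hypothetical: the pattern assumes internal names always look like hub_<digits>_<genome>, which this thread does not confirm.

```r
# Names as returned by genome(session) in the examples above.
internal <- c("hub_3671779_hs1", "hg38", "mm10")

# Proposed normalization: greedy match, drops everything up to the LAST underscore.
stripped <- sub(".*_", "", internal)
identical(stripped, c("hs1", "hg38", "mm10"))

# Names without an underscore pass through unchanged, so the no-op holds for them.
identical(sub(".*_", "", "mm10"), "mm10")

# Caveat: a genome name that itself contains an underscore would be truncated.
# "my_asm" is a made-up name used only to show the effect.
greedy <- sub(".*_", "", "my_asm")                 # "asm"

# A stricter variant (assumed prefix format) leaves such names alone.
anchored <- sub("^hub_[0-9]+_", "", c("hub_3671779_hs1", "my_asm"))
identical(anchored, c("hs1", "my_asm"))
```

The greedy pattern is the simplest option, but it is only safe if no valid genome name contains an underscore.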

Thanks!

lawremi commented 11 months ago

Actually, I think just making the genome(x) return value match the user's input is sufficient, but I haven't done much testing. Will push that now.
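A toy sketch of that idea, to make the round-trip expectation concrete. It uses a plain list rather than rtracklayer's classes, and every name here (new_toy_session, set_toy_genome, get_toy_genome, the hard-coded mapping of "hs1") is hypothetical; it is not the change that was pushed.

```r
# Toy stand-in for a browser session; NOT rtracklayer's actual implementation.
new_toy_session <- function() {
  list(internal = "hg38", requested = "hg38")
}

# Hypothetical setter: pretend the server renames "hs1" to a hub-prefixed
# internal name, and remember what the user actually asked for.
set_toy_genome <- function(session, value) {
  session$internal <- if (identical(value, "hs1")) "hub_3671779_hs1" else value
  session$requested <- value
  session
}

# Hypothetical getter: return the user's input, not the internal name.
get_toy_genome <- function(session) session$requested

s <- set_toy_genome(new_toy_session(), "hs1")
identical(get_toy_genome(s), "hs1")

# The round trip genome(session) <- genome(session) is now a no-op.
s2 <- set_toy_genome(s, get_toy_genome(s))
identical(s, s2)
```

Storing the requested name separately avoids having to parse it back out of the internal name, so it does not depend on how hub prefixes are formatted.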