databio / GenomicDistributions

Calculate and plot distributions of genomic ranges
http://code.databio.org/GenomicDistributions

mm10 genome not available in GenomicDistributions or ExperimentHub #196

Closed rrashford closed 2 years ago

rrashford commented 2 years ago

Hello! I am trying to use one of the built-in reference genomes, mm10, from the GenomicDistributionsData package to make plots with the GenomicDistributions package for my own analysis. When I tested the example shown here (https://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicDistributions/inst/doc/intro.html#loading-genomic-range-data), everything worked fine with the hg19 genome. The problem appeared when I changed the genome to "mm10".

Below is what happens when I run calcChromBinsRef.

> ## load necessary packages
> library("GenomeInfoDb")
> library("GenomicDistributions")
> library("GenomicDistributionsData")
> library("ExperimentHub")
> library("BSgenome")
> library("GenomicRanges")
> 
> 
> ## from Nate Sheffield's example (https://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicDistributions/inst/doc/intro.html#custom-features-partitions)
> queryFile = system.file("extdata", "vistaEnhancers.bed.gz", package="GenomicDistributions")
> query = rtracklayer::import(queryFile)
> 
> # calculate the distribution:
> x = calcChromBinsRef(query, "mm10")
Error in getReferenceData(refAssembly, tagline = "chromSizes_") : 
  chromSizes_mm10 not available in GenomicDistributions and GenomicDistributionsData packages

After asking about this on the Bioconductor support site, it seems that the mm10 genome isn't available through the ExperimentHub package. Am I loading the mm10 genome incorrectly? Or, if the genome isn't available in ExperimentHub yet, when can it be expected?

The two main plots I'm trying to make are the chromosome distribution plot and the partition distribution plot. Is there a workaround I could use while waiting for the genome to be added to the package(s)?

Thanks!

session info:

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome_1.60.0                rtracklayer_1.52.1             Biostrings_2.60.2             
 [4] XVector_0.32.0                 ExperimentHub_2.0.0            AnnotationHub_3.0.2           
 [7] BiocFileCache_2.0.0            dbplyr_2.2.0                   GenomicDistributionsData_1.0.0
[10] GenomicDistributions_1.0.0     GenomicRanges_1.44.0           GenomeInfoDb_1.28.4           
[13] IRanges_2.26.0                 S4Vectors_0.30.2               BiocGenerics_0.38.0           

loaded via a namespace (and not attached):
 [1] ProtGenerics_1.24.0           bitops_1.0-7                  matrixStats_0.62.0           
 [4] bit64_4.0.5                   progress_1.2.2                filelock_1.0.2               
 [7] httr_1.4.3                    tools_4.1.0                   utf8_1.2.2                   
[10] R6_2.5.1                      lazyeval_0.2.2                DBI_1.1.3                    
[13] colorspace_2.0-3              prettyunits_1.1.1             tidyselect_1.1.2             
[16] bit_4.0.4                     curl_4.3.2                    compiler_4.1.0               
[19] cli_3.3.0                     Biobase_2.52.0                xml2_1.3.3                   
[22] DelayedArray_0.18.0           scales_1.2.0                  rappdirs_0.3.3               
[25] stringr_1.4.0                 digest_0.6.29                 Rsamtools_2.8.0              
[28] rmarkdown_2.14                pkgconfig_2.0.3               htmltools_0.5.2              
[31] MatrixGenerics_1.4.3          ensembldb_2.16.4              fastmap_1.1.0                
[34] rlang_1.0.2                   rstudioapi_0.13               RSQLite_2.2.14               
[37] shiny_1.7.1                   farver_2.1.0                  BiocIO_1.2.0                 
[40] generics_0.1.2                BiocParallel_1.26.2           dplyr_1.0.9                  
[43] RCurl_1.98-1.7                magrittr_2.0.3                GenomeInfoDbData_1.2.6       
[46] Matrix_1.4-1                  Rcpp_1.0.8.3                  munsell_0.5.0                
[49] fansi_1.0.3                   lifecycle_1.0.1               stringi_1.7.6                
[52] yaml_2.3.5                    SummarizedExperiment_1.22.0   zlibbioc_1.38.0              
[55] plyr_1.8.7                    grid_4.1.0                    blob_1.2.3                   
[58] promises_1.2.0.1              crayon_1.5.1                  lattice_0.20-45              
[61] GenomicFeatures_1.44.2        hms_1.1.1                     KEGGREST_1.32.0              
[64] knitr_1.39                    pillar_1.7.0                  rjson_0.2.21                 
[67] biomaRt_2.48.3                reshape2_1.4.4                XML_3.99-0.10                
[70] glue_1.6.2                    BiocVersion_3.13.1            evaluate_0.15                
[73] data.table_1.14.2             BiocManager_1.30.18           png_0.1-7                    
[76] vctrs_0.4.1                   httpuv_1.6.5                  gtable_0.3.0                 
[79] purrr_0.3.4                   assertthat_0.2.1              cachem_1.0.6                 
[82] ggplot2_3.3.6                 xfun_0.31                     mime_0.12                    
[85] xtable_1.8-6                  AnnotationFilter_1.16.0       restfulr_0.0.15              
[88] later_1.3.0                   tibble_3.1.7                  AnnotationDbi_1.54.1         
[91] GenomicAlignments_1.28.0      memoise_2.0.1                 ellipsis_0.3.2               
[94] interactiveDisplayBase_1.30.0

kkupkova commented 2 years ago

Hi!

We will work on this issue, which started with the new Bioconductor version. In the meantime, you can try installing the GenomicDistributionsData package from our local directory with the following command: install.packages("http://big.databio.org/GenomicDistributionsData/GenomicDistributionsData_0.0.2.tar.gz", repos=NULL)

The data package is the same version as the one hosted on Bioconductor ExperimentHub, and installing it this way has solved similar issues in the past.

Please let us know whether this solves your issue.
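
For the chromosome distribution plot specifically, another possible stopgap is to skip the built-in "mm10" reference and pass chromosome sizes in directly. The sketch below is illustrative only and was not part of the original reply: binOnCustomGenome is a hypothetical helper name, and pulling the mm10 sizes from UCSC with GenomeInfoDb::getChromInfoFromUCSC (which needs internet access) is an assumption; any named vector of chromosome sizes should work. The bundled vistaEnhancers file is used only as stand-in query data, as in the example above.

```r
# Hypothetical helper (not part of GenomicDistributions): bin a query
# against user-supplied chromosome sizes instead of a built-in reference.
binOnCustomGenome <- function(query, chromSizes, binCount = 10000) {
  # chromSizes: named numeric vector, e.g. c(chr1 = ..., chr2 = ...)
  bins <- GenomicDistributions::getGenomeBins(chromSizes, binCount = binCount)
  GenomicDistributions::calcChromBins(query, bins)
}

# Usage, run only when the required packages are installed.
if (requireNamespace("GenomicDistributions", quietly = TRUE) &&
    requireNamespace("GenomeInfoDb", quietly = TRUE) &&
    requireNamespace("rtracklayer", quietly = TRUE)) {
  # Assumption: fetch mm10 chromosome sizes from UCSC (requires internet).
  chromInfo <- GenomeInfoDb::getChromInfoFromUCSC(
    "mm10", assembled.molecules.only = TRUE)
  chromSizes <- setNames(chromInfo$size, chromInfo$chrom)

  # Same example query file as in the original post.
  queryFile <- system.file("extdata", "vistaEnhancers.bed.gz",
                           package = "GenomicDistributions")
  query <- rtracklayer::import(queryFile)

  x <- binOnCustomGenome(query, chromSizes)
  print(GenomicDistributions::plotChromBins(x))
}
```

This only covers the chromosome distribution plot; the partition distribution plot also needs gene models, which this sketch does not provide.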

rrashford commented 2 years ago

Yes, downloading from the local directory worked. Thanks so much!

kkupkova commented 2 years ago

I just released a new version of GenomicDistributions (1.5.2 on GitHub; 1.4.3 and 1.5.2 (dev) will be available via Bioconductor in a day or two) that should take care of this issue even with GenomicDistributionsData installed via Bioconductor. Thank you for your input!
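
To check whether an installed copy of GenomicDistributions is at or past the versions mentioned above, something like the following could be used. This is a minimal sketch: hasMm10Fix is a hypothetical helper, not a package function, and it assumes that releases after 1.5.2 keep the fix.

```r
# Hypothetical helper: TRUE if a GenomicDistributions version string is
# at or past the versions named above (1.4.3 release, 1.5.2 devel).
hasMm10Fix <- function(version) {
  v <- package_version(version)
  isFixedRelease <- v >= "1.4.3" && v < "1.5.0"  # 1.4.x release branch
  isFixedDevel <- v >= "1.5.2"                   # devel branch (assumed: and later)
  isFixedRelease || isFixedDevel
}

# Report on the installed package, if there is one.
if (requireNamespace("GenomicDistributions", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("GenomicDistributions"))
  message("GenomicDistributions ", installed,
          if (hasMm10Fix(installed)) " should include" else " predates",
          " the mm10 fix")
}
```

If the installed version predates the fix, updating through BiocManager::install("GenomicDistributions") is one option once the new build is available on Bioconductor.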