Content for the OSCA Book.
Problem in downloading LunSpikeInData #63

MAmin99m commented 2 years ago

Hi, when I run this code:


it gives me the output below:

snapshotDate(): 2022-04-26
see ?scRNAseq and browseVignettes('scRNAseq') for documentation
loading from cache
Error: failed to load resource
  name: EH2674
  title: Lun 416B plus spike-in counts
  reason: error reading from connection

I tried other functions from the scRNAseq package and could download their data, but with LunSpikeInData() the download fails.

PeteHaitch commented 2 years ago

What's the output of running BiocManager::valid()? I just tested this and it worked for me on the following system (this is the current release version of Bioconductor albeit with some slightly out-of-date packages):

> BiocManager::valid()
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:

* sessionInfo()

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/

 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8       
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

attached base packages:
[1] stats4    stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] ensembldb_2.20.1            AnnotationFilter_1.20.0     GenomicFeatures_1.48.1     
 [4] AnnotationDbi_1.58.0        scRNAseq_2.10.0             SingleCellExperiment_1.18.0
 [7] SummarizedExperiment_1.26.1 Biobase_2.56.0              GenomicRanges_1.48.0       
[10] GenomeInfoDb_1.32.2         IRanges_2.30.0              S4Vectors_0.34.0           
[13] BiocGenerics_0.42.0         MatrixGenerics_1.8.0        matrixStats_0.62.0         

loaded via a namespace (and not attached):
 [1] ProtGenerics_1.28.0           bitops_1.0-7                  bit64_4.0.5                  
 [4] filelock_1.0.2                progress_1.2.2                httr_1.4.3                   
 [7] tools_4.2.1                   utf8_1.2.2                    R6_2.5.1                     
[10] DBI_1.1.3                     lazyeval_0.2.2                tidyselect_1.1.2             
[13] prettyunits_1.1.1             bit_4.0.4                     curl_4.3.2                   
[16] compiler_4.2.1                cli_3.3.0                     xml2_1.3.3                   
[19] DelayedArray_0.22.0           rtracklayer_1.56.0            rappdirs_0.3.3               
[22] stringr_1.4.0                 digest_0.6.29                 Rsamtools_2.12.0             
[25] bspm_0.3.9                    XVector_0.36.0                pkgconfig_2.0.3              
[28] htmltools_0.5.2               dbplyr_2.2.1                  fastmap_1.1.0                
[31] rlang_1.0.4                   rstudioapi_0.13               RSQLite_2.2.14               
[34] shiny_1.7.1                   BiocIO_1.6.0                  generics_0.1.3               
[37] BiocParallel_1.30.0           dplyr_1.0.9                   RCurl_1.98-1.7               
[40] magrittr_2.0.3                GenomeInfoDbData_1.2.8        Matrix_1.4-1                 
[43] Rcpp_1.0.9                    fansi_1.0.3                   lifecycle_1.0.1              
[46] stringi_1.7.8                 yaml_2.3.5                    zlibbioc_1.42.0              
[49] BiocFileCache_2.4.0           AnnotationHub_3.4.0           grid_4.2.1                   
[52] blob_1.2.3                    parallel_4.2.1                promises_1.2.0.1             
[55] ExperimentHub_2.4.0           crayon_1.5.1                  lattice_0.20-45              
[58] Biostrings_2.64.0             hms_1.1.1                     KEGGREST_1.36.0              
[61] pillar_1.7.0                  rjson_0.2.21                  biomaRt_2.52.0               
[64] XML_3.99-0.9                  glue_1.6.2                    BiocVersion_3.15.2           
[67] BiocManager_1.30.18           png_0.1-7                     vctrs_0.4.1                  
[70] httpuv_1.6.5                  purrr_0.3.4                   assertthat_0.2.1             
[73] cachem_1.0.6                  mime_0.12                     xtable_1.8-4                 
[76] restfulr_0.0.13               later_1.3.0                   tibble_3.1.7                 
[79] GenomicAlignments_1.32.0      memoise_2.0.1                 ellipsis_0.3.2               
[82] interactiveDisplayBase_1.34.0

Bioconductor version '3.15'

  * 35 packages out-of-date
  * 0 packages too new

create a valid installation with

    "BiocParallel", "edgeR", "ensembldb", "GenomicFeatures", "gtools", "HDF5Array", "htmltools", "igraph",
    "KEGGREST", "limma", "lme4", "locfit", "MatrixGenerics", "pillar", "R.methodsS3", "R.oo", "R.utils",
    "RcppHNSW", "restfulr", "reticulate", "rgl", "Rhdf5lib", "RSQLite", "rtracklayer", "scuttle", "shiny",
    "shinyWidgets", "tidyverse", "XML"
  ), update = TRUE, ask = FALSE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

MAmin99m commented 2 years ago

This is the output of BiocManager::valid():

> BiocManager::valid()
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:

* sessionInfo()

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] scRNAseq_2.10.0             SingleCellExperiment_1.18.0 SummarizedExperiment_1.26.1
 [4] Biobase_2.56.0              GenomicRanges_1.48.0        GenomeInfoDb_1.32.2        
 [7] IRanges_2.30.0              MatrixGenerics_1.8.1        matrixStats_0.62.0         
[10] BiocFileCache_2.4.0         dbplyr_2.2.1                S4Vectors_0.34.0           
[13] BiocGenerics_0.42.0        

loaded via a namespace (and not attached):
 [1] ProtGenerics_1.28.0           bitops_1.0-7                  bit64_4.0.5                  
 [4] filelock_1.0.2                progress_1.2.2                httr_1.4.3                   
 [7] tools_4.2.1                   utf8_1.2.2                    R6_2.5.1                     
[10] DBI_1.1.3                     lazyeval_0.2.2                tidyselect_1.1.2             
[13] prettyunits_1.1.1             bit_4.0.4                     curl_4.3.2                   
[16] compiler_4.2.1                cli_3.3.0                     xml2_1.3.3                   
[19] DelayedArray_0.22.0           rtracklayer_1.56.1            rappdirs_0.3.3               
[22] stringr_1.4.0                 digest_0.6.29                 Rsamtools_2.12.0             
[25] XVector_0.36.0                pkgconfig_2.0.3               htmltools_0.5.3              
[28] fastmap_1.1.0                 ensembldb_2.20.2              rlang_1.0.4                  
[31] rstudioapi_0.13               RSQLite_2.2.15                shiny_1.7.2                  
[34] BiocIO_1.6.0                  generics_0.1.3                BiocParallel_1.30.3          
[37] dplyr_1.0.9                   RCurl_1.98-1.7                magrittr_2.0.3               
[40] GenomeInfoDbData_1.2.8        Matrix_1.4-1                  Rcpp_1.0.9                   
[43] fansi_1.0.3                   lifecycle_1.0.1               stringi_1.7.8                
[46] yaml_2.3.5                    zlibbioc_1.42.0               AnnotationHub_3.4.0          
[49] grid_4.2.1                    blob_1.2.3                    parallel_4.2.1               
[52] promises_1.2.0.1              ExperimentHub_2.4.0           crayon_1.5.1                 
[55] lattice_0.20-45               Biostrings_2.64.0             GenomicFeatures_1.48.3       
[58] hms_1.1.1                     KEGGREST_1.36.3               pillar_1.8.0                 
[61] rjson_0.2.21                  codetools_0.2-18              biomaRt_2.52.0               
[64] XML_3.99-0.10                 glue_1.6.2                    BiocVersion_3.15.2           
[67] BiocManager_1.30.18           png_0.1-7                     vctrs_0.4.1                  
[70] httpuv_1.6.5                  purrr_0.3.4                   assertthat_0.2.1             
[73] cachem_1.0.6                  mime_0.12                     xtable_1.8-4                 
[76] restfulr_0.0.15               AnnotationFilter_1.20.0       later_1.3.0                  
[79] tibble_3.1.7                  GenomicAlignments_1.32.0      AnnotationDbi_1.58.0         
[82] memoise_2.0.1                 ellipsis_0.3.2                interactiveDisplayBase_1.34.0

Bioconductor version '3.15'

  * 1 packages out-of-date
  * 0 packages too new

create a valid installation with

  BiocManager::install("igraph", update = TRUE, ask = FALSE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

Warning message:
1 packages out-of-date; 0 packages too new

PeteHaitch commented 2 years ago

Hmm, nothing obviously wrong there. I suspect it might be due to a problem with your ExperimentHub installation (ExperimentHub is the underlying package responsible for downloading and caching data). Let's try accessing that resource directly from ExperimentHub. What does this code return on your machine?

suppressPackageStartupMessages(library(ExperimentHub))
eh <- ExperimentHub(ask = FALSE)
#> snapshotDate(): 2022-04-26
eh["EH2674"]
#> ExperimentHub with 1 record
#> # snapshotDate(): 2022-04-26
#> # names(): EH2674
#> # package(): scRNAseq
#> # $dataprovider: ArrayExpress
#> # $species: Mus musculus
#> # $rdataclass: matrix
#> # $rdatadateadded: 2019-07-01
#> # $title: Lun 416B plus spike-in counts
#> # $description: Count matrix for the Lun 416B (plus spike-ins) single-cell R...
#> # $taxonomyid: 10090
#> # $genome: mm10
#> # $sourcetype: TSV
#> # $sourceurl:
#> # $sourcesize: NA
#> # $tags: c("ExperimentHub", "ExperimentData", "ExpressionData",
#> #   "SequencingData", "RNASeqData") 
#> # retrieve record with 'object[["EH2674"]]'
str(eh[["EH2674"]])
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
#>  int [1:46703, 1:192] 0 0 0 0 0 0 0 0 0 0 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:46703] "ENSMUSG00000102693" "ENSMUSG00000064842" "ENSMUSG00000051951" "ENSMUSG00000102851" ...
#>   ..$ : chr [1:192] "SLX-9555.N701_S502.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S503.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S504.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S505.C89V9ANXX.s_1.r_1" ...

Created on 2022-07-21 by the reprex package (v2.0.1)

MAmin99m commented 2 years ago

I think you're right, because after removing the contents of .cache/R/ExperimentHub I can use LunSpikeInData() without any problem. Thanks for your help, Peter.

> suppressPackageStartupMessages(library(ExperimentHub))
> eh <- ExperimentHub(ask = FALSE)
snapshotDate(): 2022-04-26
> #> snapshotDate(): 2022-04-26
> eh["EH2674"]
ExperimentHub with 1 record
# snapshotDate(): 2022-04-26
# names(): EH2674
# package(): scRNAseq
# $dataprovider: ArrayExpress
# $species: Mus musculus
# $rdataclass: matrix
# $rdatadateadded: 2019-07-01
# $title: Lun 416B plus spike-in counts
# $description: Count matrix for the Lun 416B (plus spike-ins) single-cell RNA-seq dataset
# $taxonomyid: 10090
# $genome: mm10
# $sourcetype: TSV
# $sourceurl:
# $sourcesize: NA
# $tags: c("ExperimentHub", "ExperimentData", "ExpressionData", "SequencingData",
#   "RNASeqData") 
# retrieve record with 'object[["EH2674"]]' 
> str(eh[["EH2674"]])
see ?scRNAseq and browseVignettes('scRNAseq') for documentation
loading from cache
 int [1:46703, 1:192] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:46703] "ENSMUSG00000102693" "ENSMUSG00000064842" "ENSMUSG00000051951" "ENSMUSG00000102851" ...
  ..$ : chr [1:192] "SLX-9555.N701_S502.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S503.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S504.C89V9ANXX.s_1.r_1" "SLX-9555.N701_S505.C89V9ANXX.s_1.r_1" ...
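
The manual cleanup described above can also be done from within R. The following is a minimal sketch, assuming the default cache settings: `getExperimentHubOption("CACHE")` reports where ExperimentHub keeps its files on the current machine, which may differ from the path mentioned in this thread. The `unlink()` call is destructive, so it is worth checking the listing first.

```r
library(ExperimentHub)

# Where ExperimentHub stores downloaded resources and its sqlite index.
cache_dir <- getExperimentHubOption("CACHE")
print(cache_dir)

# Inspect the cached files before deleting anything.
print(list.files(cache_dir))

# Destructive: removes the whole cache directory, so every resource is
# downloaded again the next time it is requested.
unlink(cache_dir, recursive = TRUE)
```

After this, calling `ExperimentHub(ask = FALSE)` followed by `LunSpikeInData()` should repopulate the cache from scratch.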

PeteHaitch commented 2 years ago

Great, happy to help @mamalek99

LTLA commented 2 years ago

FYI you can force re-download of potentially corrupted cache elements with

library(ExperimentHub)
ehub <- ExperimentHub()
ehub[["EH2674", force=TRUE]]

This assumes that the sqlite file itself is not corrupted. If it is, nuking the entire cache directory is the only way.
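
The two suggestions can be combined so that a forced re-download only happens when the cached copy fails to load. This is a rough sketch rather than an official recipe: `fetch_resource()` is a hypothetical helper name, and it relies only on the `force=TRUE` argument shown above. It does not help if the sqlite file itself is damaged.

```r
library(ExperimentHub)

# Hypothetical helper: try the cached copy first, and force a fresh
# download only if loading it raises an error.
fetch_resource <- function(id) {
  ehub <- ExperimentHub(ask = FALSE)
  tryCatch(
    ehub[[id]],
    error = function(e) {
      message("Cached copy failed to load (", conditionMessage(e),
              "); forcing a re-download.")
      ehub[[id, force = TRUE]]
    }
  )
}

counts <- fetch_resource("EH2674")
print(dim(counts))
```

Trying the cache first avoids an unnecessary download in the common case where nothing is corrupted.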

MAmin99m commented 2 years ago

Thank you! @LTLA