SingleR-inc / celldex

Collection of cell type reference datasets.
https://bioconductor.org/packages/devel/data/experiment/html/celldex.html

Error: failed to load resource (but HumanPrimaryCellAtlasData is working) #6

Closed hypaik closed 3 years ago

hypaik commented 3 years ago

Hi, developer of celldex

I've installed celldex (BiocManager 3.12 and R version 4.0.xx) to use SingleR.

However, I can only load HumanPrimaryCellAtlasData. Other reference data such as BlueprintEncodeData, MonacoImmuneData, and DatabaseImmuneCellExpressionData fail with "failed to load resource".

Here is the error message I got.

library('SingleR') #dont use same with cell dex
library('celldex')
library('Seurat')

hpca.se<-HumanPrimaryCellAtlasData()
snapshotDate(): 2021-03-04
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
see ?celldex and browseVignettes('celldex') for documentation
loading from cache

enco.bl <-BlueprintEncodeData(ensembl = TRUE) #fail load
snapshotDate(): 2021-03-04
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3486
  title: Blueprint/Encode RNA-seq logcounts
  reason: unknown input format

enco.bl <-BlueprintEncodeData()
snapshotDate(): 2021-03-04
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3486
  title: Blueprint/Encode RNA-seq logcounts
  reason: unknown input format

monaco <-MonacoImmuneData() #fail load
snapshotDate(): 2021-03-04
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3496
  title: Monaco Immune Cell RNA-seq logcounts
  reason: unknown input format

himmune<-DatabaseImmuneCellExpressionData()
snapshotDate(): 2021-03-04
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3488
  title: DICE RNA-seq logcounts
  reason: unknown input format

hpca.se
class: SummarizedExperiment
dim: 19363 713
metadata(0):
assays(1): logcounts
rownames(19363): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
rowData names(0):
colnames(713): GSM112490 GSM112491 ... GSM92233 GSM92234
colData names(3): label.main label.fine label.ont

enco.bl
Error: object 'enco.bl' not found
monaco
Error: object 'monaco' not found
himmune
Error: object 'himmune' not found

LTLA commented 3 years ago

Works fine for me. I note that your snapshot date indicates that you're somehow connecting to the devel version of ExperimentHub; my snapshot date on BioC-release (3.12) is 2020-10-27, while the snapshot date on BioC-devel (3.13) is 2021-03-04.

You should probably try to figure out what's going on there; seems like you're mixing BioC-devel and -release packages.

hypaik commented 3 years ago

Thank you for your prompt response. I removed the ExperimentHub package and reinstalled it from BioC-release (3.12), but the problem persists.

Here are the code and errors.

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.12")
BiocManager::install("celldex")
BiocManager::install("SingleR")
BiocManager::install("scRNAseq")

library('SingleR')
library('celldex')

hpca.se<-HumanPrimaryCellAtlasData()
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
see ?celldex and browseVignettes('celldex') for documentation
loading from cache

enco.bl <-BlueprintEncodeData()
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3486
  title: Blueprint/Encode RNA-seq logcounts
  reason: unknown input format

monaco <-MonacoImmuneData() #fail load
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3496
  title: Monaco Immune Cell RNA-seq logcounts
  reason: unknown input format

himmune<-DatabaseImmuneCellExpressionData()
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
Error: failed to load resource
  name: EH3488
  title: DICE RNA-seq logcounts
  reason: unknown input format

LTLA commented 3 years ago

  1. Show your sessionInfo().
  2. Fix any problems raised by BiocManager::valid().

hypaik commented 3 years ago

I can't find any problems raised by BiocManager::valid()

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] celldex_1.0.0 SingleR_1.4.1 SummarizedExperiment_1.20.0 Biobase_2.50.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.2 IRanges_2.24.1
[8] S4Vectors_0.28.1 BiocGenerics_0.36.0 MatrixGenerics_1.2.1 matrixStats_0.58.0

loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 rsvd_1.0.3 lattice_0.20-41 digest_0.6.27 assertthat_0.2.1 utf8_1.1.4 mime_0.10
[8] BiocFileCache_1.14.0 R6_2.5.0 RSQLite_2.2.3 httr_1.4.2 pillar_1.5.1 sparseMatrixStats_1.2.1 zlibbioc_1.36.0
[15] rlang_0.4.10 curl_4.3 irlba_2.3.3 blob_1.2.1 Matrix_1.3-2 BiocNeighbors_1.8.2 BiocParallel_1.24.1
[22] AnnotationHub_2.22.0 RCurl_1.98-1.2 bit_4.0.4 beachmat_2.6.4 shiny_1.6.0 DelayedArray_0.16.2 httpuv_1.5.5
[29] compiler_4.0.3 BiocSingular_1.6.0 pkgconfig_2.0.3 htmltools_0.5.1.1 tidyselect_1.1.0 interactiveDisplayBase_1.28.0 tibble_3.1.0
[36] GenomeInfoDbData_1.2.4 fansi_0.4.2 withr_2.4.1 later_1.1.0.1 crayon_1.4.1 dplyr_1.0.5 dbplyr_2.1.0
[43] bitops_1.0-6 rappdirs_0.3.3 grid_4.0.3 xtable_1.8-4 lifecycle_1.0.0 DBI_1.1.1 magrittr_2.0.1
[50] cachem_1.0.4 XVector_0.30.0 promises_1.2.0.1 DelayedMatrixStats_1.12.3 ellipsis_0.3.1 generics_0.1.0 vctrs_0.3.6
[57] tools_4.0.3 bit64_4.0.5 glue_1.4.2 BiocVersion_3.12.0 purrr_0.3.4 fastmap_1.1.0 yaml_2.2.1
[64] AnnotationDbi_1.52.0 ExperimentHub_1.16.0 BiocManager_1.30.10 memoise_2.0.0

BiocManager::valid()
[1] TRUE

hypaik commented 3 years ago

In addition, I tried the same code with another lab member on her computer (Windows). It failed there too: she can't load any data with HumanPrimaryCellAtlasData(), BlueprintEncodeData(), or MonacoImmuneData().

LTLA commented 3 years ago

¯\_(ツ)_/¯

I have no idea why. I've never seen this error before. Best guess is that the RDS file is corrupted or the wrong file was downloaded somehow; I'm assuming that the error is being thrown from readRDS() (see here).

I don't know why this would happen, but you may have to destroy your cache directory and redownload the files.

library(ExperimentHub)
hubCache(ExperimentHub()) # delete this path

Perhaps @lshep may have seen this before.
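
For readers following along, the "delete this path" step can be sketched in base R. This is only an illustration: `remove_cache_dir` is a hypothetical helper, not part of ExperimentHub, and it assumes you pass in whatever path `hubCache(ExperimentHub())` printed on your machine.

```r
# Hypothetical helper (not part of ExperimentHub): remove a cache directory.
# `cache_dir` is assumed to be the path reported by hubCache(ExperimentHub()).
remove_cache_dir <- function(cache_dir) {
  cache_dir <- path.expand(cache_dir)
  if (!dir.exists(cache_dir)) {
    # Nothing to delete.
    return(invisible(FALSE))
  }
  unlink(cache_dir, recursive = TRUE)
  # TRUE if the directory is really gone afterwards.
  invisible(!dir.exists(cache_dir))
}

# Example usage (path is whatever your own session reports):
# remove_cache_dir("~/Library/Caches/ExperimentHub")
```

The next call to a celldex function should then offer to recreate the directory and download the files again.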

hypaik commented 3 years ago

Dear Aaron

I already tried removing the caches, sigh... Now celldex can't load anything. All of the cache files in the hubCache directory are too small (just 640 B). As you mentioned, I agree that something went wrong in ExperimentHub.

So I tested the following code, and found that ExperimentHub doesn't work.

library('celldex')

hpca.se<-celldex::HumanPrimaryCellAtlasData() #OK
~/Library/Caches/ExperimentHub does not exist, create directory? (yes/no): yes
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
downloading 1 resources
retrieving 1 resource
Downloading: 640 B
loading from cache
Error: failed to load resource
  name: EH3492
  title: Human Primary Cell Atlas logcounts
  reason: unknown input format

enco.bl <-celldex::BlueprintEncodeData() #fail load
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
downloading 1 resources
retrieving 1 resource
Downloading: 640 B
loading from cache
Error: failed to load resource
  name: EH3486
  title: Blueprint/Encode RNA-seq logcounts
  reason: unknown input format

monaco <-celldex::MonacoImmuneData() #fail load
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
downloading 1 resources
retrieving 1 resource
Downloading: 640 B
loading from cache
Error: failed to load resource
  name: EH3496
  title: Monaco Immune Cell RNA-seq logcounts
  reason: unknown input format

himmune<-celldex::DatabaseImmuneCellExpressionData() #fail load
snapshotDate(): 2020-10-27
see ?celldex and browseVignettes('celldex') for documentation
downloading 1 resources
retrieving 1 resource
Downloading: 640 B
loading from cache
Error: failed to load resource
  name: EH3488
  title: DICE RNA-seq logcounts
  reason: unknown input format

library(ExperimentHub)
Loading required package: AnnotationHub
Loading required package: BiocFileCache
Loading required package: dbplyr

Attaching package: ‘AnnotationHub’

The following object is masked from ‘package:Biobase’:

    cache

eh=ExperimentHub()
snapshotDate(): 2020-10-27

query(eh, "EH3492")
ExperimentHub with 0 records
snapshotDate(): 2020-10-27

query(eh, "EH1")
ExperimentHub with 0 records
snapshotDate(): 2020-10-27

myfiles <- query(eh, "PACKAGENAME")
myfiles[[1]]
Error in .local(x, i, j = j, ...) : 'i' must be length 1
myfiles
ExperimentHub with 0 records
snapshotDate(): 2020-10-27
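
The tiny 640 B cache files described in this comment can be spotted with a short base-R check. This is a rough sketch: `find_small_cache_files` and its `max_bytes` threshold are hypothetical names chosen for illustration, and the 1024-byte cutoff is an arbitrary assumption, not something ExperimentHub defines.

```r
# Hypothetical helper: list files under a cache directory that look too
# small to be real datasets. The default threshold is an arbitrary assumption.
find_small_cache_files <- function(cache_dir, max_bytes = 1024) {
  files <- list.files(cache_dir, full.names = TRUE, recursive = TRUE)
  sizes <- file.size(files)
  keep <- !is.na(sizes) & sizes <= max_bytes
  data.frame(file = files[keep], bytes = sizes[keep], stringsAsFactors = FALSE)
}

# Example usage (cache path taken from the session above):
# find_small_cache_files("~/Library/Caches/ExperimentHub")
```

Any reference dataset that shows up in this list almost certainly did not download correctly.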

LTLA commented 3 years ago

I'll bet that those 640 bytes contain an error message with the real problem.
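
One way to test that guess is to look at the first bytes of a cached file. The sketch below assumes the resource was written by saveRDS() with its default gzip compression, so a valid file should start with the gzip magic bytes 1f 8b; anything else is decoded as text so you can read it. `peek_cached_file` is a hypothetical helper, and files saved with a different compression setting would be misreported as non-gzip.

```r
# Hypothetical helper: report whether a file starts with the gzip magic
# bytes (as a default saveRDS() file does), and otherwise return its
# first bytes as text so any embedded message can be read.
peek_cached_file <- function(path, n_bytes = 640L) {
  raw_head <- readBin(path, what = "raw", n = n_bytes)
  gzip_magic <- as.raw(c(0x1f, 0x8b))
  is_gzip <- length(raw_head) >= 2L && identical(raw_head[1:2], gzip_magic)
  # Only decode as text when the file is not gzip; drop NUL bytes so that
  # rawToChar() does not fail on them.
  text <- if (is_gzip) {
    NA_character_
  } else {
    rawToChar(raw_head[raw_head != as.raw(0)])
  }
  list(is_gzip = is_gzip, text = text)
}

# Example usage (substitute the path of one of the 640 B files):
# cat(peek_cached_file("path/to/cached/file")$text)
```

If `is_gzip` is FALSE and `text` looks like HTML or XML, the download was probably intercepted or rejected before it reached the real file.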

hypaik commented 3 years ago

Dear developer,

I tried another lab member's computer, and he can load the resource without any problem. This morning I tried a few other things, and now I can load the resource too. Here is what I did:

  1. Deleted celldex and SingleR, and reinstalled everything.
  2. Destroyed the ExperimentHub cache directory.
  3. Deleted the .RData and .Rhistory files, and used a different working directory.
  4. Switched off the ethernet cable and activated the wifi connection of my computer. (My institute blocked AWS for security.)
  5. celldex worked properly.

So I think this issue is associated with the firewall of my institute. However, it's still unclear how another lab member can load the resource via ethernet cable. I think I can figure it out with the IT administrator of my institute.

Now I have a quick question. Do I need to save the downloaded resource as an RData? Because the Wi-Fi at my institute is slow, it would be inconvenient to use it for every analysis.

LTLA commented 3 years ago

> My institute blocked AWS for security.

I daresay this is the biggest problem. The reinstallations won't have any effect. The deletion of the RData/Rhistory is highly unlikely to have an effect. The deletion of the caches only has an effect with respect to getting rid of the weird 640 byte downloads that have contaminated your existing cache.

I can't comprehend why your institute thinks that blocking read access to AWS is a sensible thing to do. (ENCODE hosts all of their datasets on S3, for example.) That's a big world of data that's being blocked.

> Now I have a quick question. Do I need to save the downloaded resource as an RData?

No. Once successfully downloaded, files are automatically cached. You won't have to download them again.

hypaik commented 3 years ago

Thank you for your response.

I totally agree that blocking AWS is the big issue. There are several ways to bypass the firewall. Based on your answer, I will not save the data as RData.

So my problem is solved, and I will close this issue.

Regards, HP