LTLA / scRNAseq

Clone of the Bioconductor repository for the scRNAseq package.
http://bioconductor.org/packages/devel/data/experiment/html/scRNAseq.html

'Invalid cross-device link' when loading data using gypsum backend #46

Closed grimbough closed 8 months ago

grimbough commented 8 months ago

While testing my changes to rhdf5 for #44, I found that I can't run the example code to load PaulHSCData on my machine.

> z = PaulHSCData(legacy=FALSE)
HDF5: unable to open dataset
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'seed' in selecting a method for function 'DelayedArray': Error in h5checktype(). Argument not of class H5IdComponent.
In addition: There were 11 warnings (use warnings() to see them)

I presume the warnings are important here, as they all relate to an 'Invalid cross-device link'. I've edited the output below for brevity; the omitted warnings are all similar and just reference different files.

> warnings()
Warning messages:
1: In file.rename(tmp, destination) :
  cannot rename file '/tmp/RtmpEvidZm/file62cba4b872a3' to '/home/msmith/.cache/R/gypsum/bucket/scRNAseq/paul-hsc-2015/2023-12-20/..manifest', reason 'Invalid cross-device link'
...
11: In file.rename(tmp, destination) :
  cannot rename file '/tmp/RtmpEvidZm/file62cba63594e7c' to '/home/msmith/.cache/R/gypsum/bucket/scRNAseq/paul-hsc-2015/2023-12-20/row_data/basic_columns.h5', reason 'Invalid cross-device link'

On my system the default cache location and /tmp are on different partitions, which feels like it might be related.

-> % df -h /home/msmith/ /tmp/
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p7  153G   93G   53G  65% /home
/dev/nvme0n1p6  130G   86G   38G  70% /

If I explicitly choose the gypsum cache directory to be on the same drive as /tmp/ then it seems to work fine:

gypsum::cacheDirectory("/tmp/cache/")
z = PaulHSCData(legacy = FALSE)

Here's the session info if that's helpful:

> sessionInfo()
R Under development (unstable) (2024-03-04 r86048)
Platform: x86_64-pc-linux-gnu
Running under: Linux Mint 21.3

Matrix products: default
BLAS:   /home/msmith/Applications/R/R-devel/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] scRNAseq_2.17.3             SingleCellExperiment_1.25.0 SummarizedExperiment_1.33.3 Biobase_2.63.0             
 [5] GenomicRanges_1.55.3        GenomeInfoDb_1.39.8         IRanges_2.37.1              S4Vectors_0.41.4           
 [9] BiocGenerics_0.49.1         MatrixGenerics_1.15.0       matrixStats_1.2.0           rhdf5_2.47.6               
[13] testthat_3.2.1             

loaded via a namespace (and not attached):
  [1] rstudioapi_0.15.0        jsonlite_1.8.8           magrittr_2.0.3           GenomicFeatures_1.55.3  
  [5] gypsum_0.99.13           fs_1.6.3                 BiocIO_1.13.0            zlibbioc_1.49.0         
  [9] vctrs_0.6.5              memoise_2.0.1            Rsamtools_2.19.3         RCurl_1.98-1.14         
 [13] base64enc_0.1-3          htmltools_0.5.7          S4Arrays_1.3.6           usethis_2.2.3           
 [17] progress_1.2.3           AnnotationHub_3.11.1     curl_5.2.1               Rhdf5lib_1.25.1         
 [21] SparseArray_1.3.4        alabaster.base_1.3.21    alabaster.sce_1.3.3      htmlwidgets_1.6.4       
 [25] desc_1.4.3               httr2_1.0.0              cachem_1.0.8             GenomicAlignments_1.39.4
 [29] mime_0.12                lifecycle_1.0.4          pkgconfig_2.0.3          Matrix_1.6-5            
 [33] R6_2.5.1                 fastmap_1.1.1            GenomeInfoDbData_1.2.11  shiny_1.8.0             
 [37] digest_0.6.34            AnnotationDbi_1.65.2     rprojroot_2.0.4          ExperimentHub_2.11.1    
 [41] pkgload_1.3.4            aws.signature_0.6.0      RSQLite_2.3.5            filelock_1.0.3          
 [45] fansi_1.0.6              httr_1.4.7               abind_1.4-5              compiler_4.4.0          
 [49] remotes_2.4.2.1          bit64_4.0.5              withr_3.0.0              BiocParallel_1.37.1     
 [53] DBI_1.2.2                pkgbuild_1.4.3           alabaster.ranges_1.3.3   HDF5Array_1.31.6        
 [57] alabaster.schemas_1.3.1  mockery_0.4.4            biomaRt_2.59.1           rappdirs_0.3.3          
 [61] DelayedArray_0.29.9      sessioninfo_1.2.2        rjson_0.2.21             tools_4.4.0             
 [65] httpuv_1.6.14            glue_1.7.0               restfulr_0.0.15          rhdf5filters_1.15.2     
 [69] promises_1.2.1           grid_4.4.0               generics_0.1.3           ensembldb_2.27.1        
 [73] hms_1.1.3                xml2_1.3.6               utf8_1.2.4               XVector_0.43.1          
 [77] BiocVersion_3.19.1       pillar_1.9.0             stringr_1.5.1            later_1.3.2             
 [81] dplyr_1.1.4              BiocFileCache_2.11.1     lattice_0.22-5           rtracklayer_1.63.0      
 [85] bit_4.0.5                tidyselect_1.2.0         Biostrings_2.71.2        miniUI_0.1.1.1          
 [89] ProtGenerics_1.35.3      devtools_2.4.5           brio_1.1.4               stringi_1.8.3           
 [93] lazyeval_0.2.2           yaml_2.3.8               codetools_0.2-19         tibble_3.2.1            
 [97] alabaster.matrix_1.3.13  BiocManager_1.30.22      cli_3.6.2                xtable_1.8-4            
[101] Rcpp_1.0.12              dbplyr_2.4.0             png_0.1-8                XML_3.99-0.16.1         
[105] parallel_4.4.0           ellipsis_0.3.2           blob_1.2.4               prettyunits_1.2.0       
[109] aws.s3_0.3.21            profvis_0.3.8            AnnotationFilter_1.27.0  urlchecker_1.0.1        
[113] bitops_1.0-7             alabaster.se_1.3.4       purrr_1.0.2              crayon_1.5.2            
[117] rlang_1.1.3              KEGGREST_1.43.0          waldo_0.5.2             

LTLA commented 8 months ago

Hm. The warnings should have been "fine" as gypsum falls back to file.copy() when file.rename() doesn't work. If file.copy() didn't work either, there should have been an actual error before hitting HDF5.

gypsum 0.99.15 should fix it so that file.rename() succeeds and doesn't emit warnings; but I still don't know why the HDF5 call ultimately fails, unless your file.copy() is just quietly failing.
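
For reference, the rename-then-copy fallback described above looks roughly like the sketch below. This is not gypsum's actual code; `move_with_fallback` is a hypothetical helper written only to illustrate the pattern, and it relies on base R's `file.rename()` returning `FALSE` (with a warning) when the move fails, e.g. across filesystems.

```r
# Hypothetical helper, NOT gypsum's implementation: move a file into place,
# falling back to copy + delete when a plain rename is not possible.
move_with_fallback <- function(src, dest) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)

    # file.rename() returns FALSE and warns on failure; the warning is
    # suppressed here because the fallback below handles that case.
    ok <- suppressWarnings(file.rename(src, dest))

    if (!ok) {
        # Fail loudly if the copy does not work either, so that a broken
        # cache is noticed before anything tries to read from it.
        if (!file.copy(src, dest, overwrite = TRUE)) {
            stop("failed to copy '", src, "' to '", dest, "'")
        }
        unlink(src)
    }

    invisible(dest)
}

# Usage: move a temporary file into a nested directory that does not exist yet.
src <- tempfile()
writeLines("example", src)
dest <- file.path(tempfile(), "nested", "copy.txt")
move_with_fallback(src, dest)
file.exists(dest)
```

With a fallback like this, a cross-device rename should at worst cost an extra copy, which is why the warnings on their own would not be expected to break the HDF5 call.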

Do the files like /home/msmith/.cache/R/gypsum/bucket/scRNAseq/paul-hsc-2015/2023-12-20/row_data/basic_columns.h5 actually exist?
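
Something like the following should be enough to check. The two paths are copied from the warnings above; a missing file would show up as `exists = FALSE` with an `NA` size, and a file of size 0 would suggest a copy that quietly failed partway.

```r
# Paths taken from the warnings quoted in the report; adjust the cache
# location if yours differs.
cache <- "/home/msmith/.cache/R/gypsum/bucket/scRNAseq/paul-hsc-2015/2023-12-20"
files <- file.path(cache, c("..manifest", "row_data/basic_columns.h5"))

# file.exists() and file.size() are vectorised; file.size() gives NA for
# files that are not there.
status <- data.frame(
    path = files,
    exists = file.exists(files),
    size = file.size(files)
)
print(status)
```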

grimbough commented 8 months ago

So I restarted everything and cleared all the caches, and now can't reproduce the error.

My first thought was that perhaps I had some funky version of rhdf5 loaded inadvertently, since I'd been working on that, but that wouldn't explain why it worked when I switched the cache location to the same drive :shrug:

Frustrating that I can't give any more details, but hopefully it was just a one-off, transient set of circumstances. I'll close for now and re-open if I encounter it again.