grimbough / Rhdf5lib

Distribution of the HDF5 library in an R package
https://bioconductor.org/packages/Rhdf5lib/

Cannot open HDF5 files using HDF5 1.10.x on some machines #11

Closed PeteHaitch closed 5 years ago

PeteHaitch commented 6 years ago

Hi Mike,

I'm unable to open (specifically, I can't rhdf5::h5ls()) an existing HDF5 file using the new 1.10.x-packaged version of HDF5 in Rhdf5lib. This is on a Linux HPC with a shared file system (Lustre).

Notably:

This particular file was created via HDF5Array and when it was created I think the system was linked against the C-library HDF5 v1.10.2 (do you know of some way to check this for a particular file?). This particular file is 3.9 GB; I'd be happy to share it with you if it would help in debugging.

Frustratingly, on the same machine, a file created by example(h5ls) using BioC 3.7 can be loaded using BioC 3.8 without problems, which makes debugging rather difficult.

I can re-create this particular file but I'd really like to avoid it if possible (there are tens of other files created around a similar time that I think have the same issue).

I appreciate any suggestions and advice you can offer, Pete

# BioC 3.7

> library(rhdf5)
> file <- "eGTEx/HDF5/extdata/Bulk_GTEx_Brain_hg38/BSseq/eGTEx.Phase1_brain_samples.mCA_pos/collapseBSseq/assays.h5"
> h5ls(file)
# group name       otype  dclass   dim
# 0     /  Cov H5I_DATASET INTEGER  x 32
# 1     /    M H5I_DATASET INTEGER  x 32
> devtools::session_info()
Session info ------------------------------------------------------------------
 setting  value
 version  R version 3.5.0 Patched (2018-04-30 r74679)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       US/Eastern
 date     2018-09-03

Packages ----------------------------------------------------------------------
 package   * version date       source
 base      * 3.5.0   2018-05-02 local
 compiler    3.5.0   2018-05-02 local
 datasets  * 3.5.0   2018-05-02 local
 devtools    1.13.6  2018-06-27 CRAN (R 3.5.0)
 digest      0.6.15  2018-01-28 CRAN (R 3.5.0)
 graphics  * 3.5.0   2018-05-02 local
 grDevices * 3.5.0   2018-05-02 local
 memoise     1.1.0   2017-04-21 CRAN (R 3.5.0)
 methods   * 3.5.0   2018-05-02 local
 rhdf5     * 2.24.0  2018-05-02 Bioconductor
 Rhdf5lib    1.2.1   2018-05-17 Bioconductor
 stats     * 3.5.0   2018-05-02 local
 utils     * 3.5.0   2018-05-02 local
 withr       2.1.2   2018-03-15 CRAN (R 3.5.0)

# BioC 3.8

> library(rhdf5)
> file <- "eGTEx/HDF5/extdata/Bulk_GTEx_Brain_hg38/BSseq/eGTEx.Phase1_brain_samples.mCA_pos/collapseBSseq/assays.h5"
> h5ls(file)
Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) :
  HDF5. File accessibilty. Unable to open file.
> devtools::session_info()
Session info ------------------------------------------------------------------
 setting  value
 version  R version 3.5.1 Patched (2018-07-02 r74950)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       US/Eastern
 date     2018-09-03

Packages ----------------------------------------------------------------------
 package   * version date       source
 base      * 3.5.1   2018-07-09 local
 compiler    3.5.1   2018-07-09 local
 datasets  * 3.5.1   2018-07-09 local
 devtools    1.13.6  2018-06-27 CRAN (R 3.5.1)
 digest      0.6.15  2018-01-28 CRAN (R 3.5.0)
 graphics  * 3.5.1   2018-07-09 local
 grDevices * 3.5.1   2018-07-09 local
 memoise     1.1.0   2017-04-21 CRAN (R 3.5.0)
 methods   * 3.5.1   2018-07-09 local
 rhdf5     * 2.25.8  2018-08-30 Bioconductor
 Rhdf5lib    1.3.2   2018-08-23 Bioconductor
 stats     * 3.5.1   2018-07-09 local
 utils     * 3.5.1   2018-07-09 local
 withr       2.1.2   2018-03-15 CRAN (R 3.5.0)

# BioC 3.8 (+ GitHub)

> library(rhdf5)
> file <- "eGTEx/HDF5/extdata/Bulk_GTEx_Brain_hg38/BSseq/eGTEx.Phase1_brain_samples.mCA_pos/collapseBSseq/assays.h5"
> h5ls(file)
Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) :
  HDF5. File accessibilty. Unable to open file.

> devtools::session_info()
Session info ------------------------------------------------------------------
 setting  value
 version  R version 3.5.1 Patched (2018-07-02 r74950)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       US/Eastern
 date     2018-09-03

Packages ----------------------------------------------------------------------
 package   * version date       source
 base      * 3.5.1   2018-07-09 local
 compiler    3.5.1   2018-07-09 local
 datasets  * 3.5.1   2018-07-09 local
 devtools    1.13.6  2018-06-27 CRAN (R 3.5.1)
 digest      0.6.15  2018-01-28 CRAN (R 3.5.0)
 graphics  * 3.5.1   2018-07-09 local
 grDevices * 3.5.1   2018-07-09 local
 memoise     1.1.0   2017-04-21 CRAN (R 3.5.0)
 methods   * 3.5.1   2018-07-09 local
 rhdf5     * 2.25.8  2018-09-03 Github (grimbough/rhdf5@168e006)
 Rhdf5lib    1.3.3   2018-09-03 Github (grimbough/Rhdf5lib@59c8299)
 stats     * 3.5.1   2018-07-09 local
 utils     * 3.5.1   2018-07-09 local
 withr       2.1.2   2018-03-15 CRAN (R 3.5.0)
PeteHaitch commented 6 years ago

More digging makes me think it's an issue with the shared file system and/or permissions on this Linux machine, in conjunction with some difference between C-library HDF5 1.8.x and 1.10.x.

Using the example file created by example("h5ls") and rhdf5 2.25.8 linking to C-library HDF5 1.10.2, rhdf5::h5ls():

  1. Works when the file lives in my user area (ZFS disk)
  2. Works when the file lives in a temp area (Lustre disk)
  3. Does not work when the file lives in a lab area (Lustre disk)

Using the example file created by example("h5ls") and rhdf5 2.24.0 linking to C-library HDF5 1.8.19:

  1. Works when the file lives in my user area (ZFS disk)
  2. Works when the file lives in a temp area (Lustre disk)
  3. Works when the file lives in a lab area (Lustre disk)

(note to self: info on disks and file systems from https://jhpce.jhu.edu/policies/current-storage-offerings/).

I don't get why (3) is different when linking to C-library HDF5 1.8.x instead of 1.10.x. Any ideas?

I'll also discuss with our sysadmin to try to get (3) working for C-library HDF5 1.10.x and report back so as to close the issue.

grimbough commented 6 years ago

Perhaps there are different file system settings on the two Lustre areas.

Does it work on Lustre if you set the environment variable HDF5_USE_FILE_LOCKING to FALSE?
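
For readers who want to try this from within R, here is a minimal sketch. It assumes the variable is set before HDF5 opens the file (ideally before rhdf5 is loaded), and it uses a temporary file as a stand-in for a file on the affected Lustre area; the rhdf5 calls only run if the package is installed.

```r
# Set the variable for the current R session; it must be in place
# before the HDF5 library tries to open the file.
Sys.setenv(HDF5_USE_FILE_LOCKING = "FALSE")
locking <- Sys.getenv("HDF5_USE_FILE_LOCKING")

if (requireNamespace("rhdf5", quietly = TRUE)) {
  # Stand-in file (assumption); replace with the file on the Lustre area.
  h5 <- tempfile(fileext = ".h5")
  rhdf5::h5createFile(h5)
  print(rhdf5::h5ls(h5))
}
```

Setting the variable in the shell before starting R (for example in a job script) should have the same effect and avoids any question about ordering.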

PeteHaitch commented 6 years ago

It does! Is that the remedy or just an indication of a deeper issue?

PeteHaitch commented 6 years ago

note to self: http://hdf-forum.184993.n3.nabble.com/HDF5-files-on-NFS-td4029577.html discusses the environment variable HDF5_USE_FILE_LOCKING

PeteHaitch commented 6 years ago

Bumping https://github.com/grimbough/Rhdf5lib/issues/11#issuecomment-418314774

Is that the remedy or just an indication of a deeper issue?

grimbough commented 6 years ago

I think that's the (slightly unsatisfactory) remedy. It looks like this only happens on Lustre file systems, and clearly not on all of them since it works in your temp area, so there must be some additional configuration parameters influencing this too.

I don't have access to any Lustre system to try things myself, but I presume this isn't even a potential problem unless you're trying to access a file multiple times. I'll add something to the package vignette, and keep an eye on the HDF5 noticeboard for additional info on this, but I don't have a better solution right now.

PeteHaitch commented 6 years ago

Thanks, Mike! If I get time I can investigate further on the Lustre filesystem. @kasperdanielhansen also may be interested in better understanding this issue and has access to the same system.

kasperdanielhansen commented 6 years ago

Pretty important issue and solution.

Obviously, as HDF5 usage gets more fundamental, we need to have these files work across all filesystems and disks. I would be happy to test packages.

urbach commented 5 years ago

I am having similar problems on a lustre file system:

> devtools::session_info()
─ Session info       ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.5.2 (2018-12-20)
 os       Debian GNU/Linux 9 (stretch)
 system   x86_64, linux-gnu           
 ui       X11                         
 language en_US:en                    
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Europe/Berlin               
 date     2019-05-14                  

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date       lib source                             
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.5.2)                     
 backports     1.1.4   2019-04-10 [1] CRAN (R 3.5.2)                     
 callr         3.2.0   2019-03-15 [1] CRAN (R 3.5.2)                     
 cli           1.1.0   2019-03-19 [1] CRAN (R 3.5.2)                     
 crayon        1.3.4   2017-09-16 [1] CRAN (R 3.5.2)                     
 desc          1.2.0   2018-05-01 [1] CRAN (R 3.5.2)                     
 devtools    * 2.0.2   2019-04-08 [1] CRAN (R 3.5.2)                     
 digest        0.6.18  2018-10-10 [1] CRAN (R 3.5.2)                     
 fs            1.3.1   2019-05-06 [1] CRAN (R 3.5.2)                     
 glue          1.3.1   2019-03-12 [1] CRAN (R 3.5.2)                     
 magrittr      1.5     2014-11-22 [1] CRAN (R 3.5.2)                     
 memoise       1.1.0   2017-04-21 [1] CRAN (R 3.5.2)                     
 pkgbuild      1.0.3   2019-03-20 [1] CRAN (R 3.5.2)                     
 pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.5.2)                     
 prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.5.2)                     
 processx      3.3.1   2019-05-08 [1] CRAN (R 3.5.2)                     
 ps            1.3.0   2018-12-21 [1] CRAN (R 3.5.2)                     
 R6            2.4.0   2019-02-14 [1] CRAN (R 3.5.2)                     
 Rcpp          1.0.1   2019-03-17 [1] CRAN (R 3.5.2)                     
 remotes       2.0.4   2019-04-10 [1] CRAN (R 3.5.2)                     
 rhdf5       * 2.27.19 2019-05-14 [1] Github (grimbough/rhdf5@53f3bec)   
 Rhdf5lib      1.7.1   2019-05-14 [1] Github (grimbough/Rhdf5lib@35d7aef)
 rlang         0.3.4   2019-04-07 [1] CRAN (R 3.5.2)                     
 rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.5.2)                     
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.5.2)                     
 usethis     * 1.5.0   2019-04-07 [1] CRAN (R 3.5.2)                     
 withr         2.1.2   2018-03-15 [1] CRAN (R 3.5.2)                     

[1] /hadron/urbach/lib/R
[2] /usr/lib/R/site-library
[3] /hadron/urbach/R/x86_64-pc-linux-gnu-library/3.5
[4] /usr/local/lib/R/site-library
[5] /usr/lib/R/library

h5ls works

> file <- ("/hiskp4/bartek/flavor_singlet/twopt_funs/cA211a.30.32/0000/corr.0000.t15.h5")
> h5ls(file)
                                              group               name       otype
0                                             /           d+-g-u-g   H5I_GROUP
1                                     /d+-g-u-g                t15   H5I_GROUP
2                                 /d+-g-u-g/t15                gf4   H5I_GROUP
[...]

but, H5Fopen doesn't

> h5file <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.

Any help would be highly welcome! Setting HDF5_USE_FILE_LOCKING to FALSE does not have any effect on this.

Thanks, Carsten

grimbough commented 5 years ago

Weird that h5ls() works, since that's opening the file too. Does running the command rhdf5::h5disableFileLocking() improve anything?

You can also get slightly more information on the error by setting rhdf5::h5errorHandling('verbose') before running the failing code.
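
The two suggestions above can be combined into one short diagnostic run. This is only a sketch: the path is a hypothetical placeholder that deliberately does not exist, so the open is expected to fail and the verbose message is captured rather than stopping the script. The rhdf5 calls are skipped if the package is not installed.

```r
# Hypothetical path (assumption); replace with the file that fails to open.
path <- file.path(tempdir(), "does-not-exist.h5")
msg <- NULL

if (requireNamespace("rhdf5", quietly = TRUE)) {
  rhdf5::h5disableFileLocking()        # same effect as the environment variable
  rhdf5::h5errorHandling("verbose")    # ask for the full HDF5 error stack
  msg <- tryCatch({
    fid <- rhdf5::H5Fopen(path, flags = "H5F_ACC_RDONLY")
    rhdf5::H5Fclose(fid)
    NULL                               # opened fine, nothing to report
  }, error = function(e) conditionMessage(e))
  rhdf5::h5errorHandling("standard")   # restore the default reporting
}

if (!is.null(msg)) cat(msg, "\n")
```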

urbach commented 5 years ago

Hi Mike, thanks for your quick reply. Unfortunately, rhdf5::h5disableFileLocking() does not help:

> tst <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.
> rhdf5::h5disableFileLocking()
> tst <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.
> rhdf5::h5errorHandling('verbose')
> tst <- H5Fopen(file)
Error in H5Fopen(file) : libhdf5
    error #000: ����)V in �"$�)V(): line 509
        class: HDF5
        major: File accessibilty
        minor: Unable to open file
    error #-04:  in (): line 1400
        class: HDF5
        major: File accessibilty
        minor: Unable to open file
    error #-03: @;$�)V in (): line 1546
        class: HDF5
        major: File accessibilty
        minor: Unable to open file
    error #-02: p�ξ)V in ��)V(): line 734
        class: HDF5
        major: Virtual File Layer
        minor: Unable to initialize object
    error #-01: ���)V in (): line 346
        class: HDF5
        major: File accessibilty
        minor: Unable to open file

Does this help (apart from what looks like a locale problem, whose origin I don't know)?

Are the versions of rhdf5 and Rhdf5lib I use compatible?

urbach commented 5 years ago

A little more info: The problem occurs on our lustre network FS (connected via infiniband). When I copy the file to a local disk, I can open the file without problems.

urbach commented 5 years ago

Okay, I solved this. I misunderstood flags=h5default("H5F_ACC_RD") to mean opening the file for reading only. I had no write permissions for the original file. When using

tst <- H5Fopen(file, flags="H5F_ACC_RDONLY")

everything works as expected.
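
A small sketch of how this kind of failure could be caught up front: check the file permissions with base R and only ask for read/write access when the file is actually writable. The helper name choose_h5_flag() is hypothetical (it is not part of rhdf5), and the temporary file merely stands in for a real HDF5 file.

```r
# Hypothetical helper: pick an H5Fopen() flag from the file permissions.
choose_h5_flag <- function(path) {
  if (!file.exists(path)) stop("No such file: ", path)
  # file.access() returns 0 on success and -1 on failure.
  if (file.access(path, mode = 4) != 0) stop("No read permission: ", path)
  if (file.access(path, mode = 2) == 0) "H5F_ACC_RDWR" else "H5F_ACC_RDONLY"
}

# Stand-in file (assumption); a freshly created temp file is writable.
path <- tempfile(fileext = ".h5")
invisible(file.create(path))
flag <- choose_h5_flag(path)
print(flag)
```

With rhdf5 loaded, the result can be passed straight on, as in H5Fopen(file, flags = choose_h5_flag(file)). Note that file.access() may be optimistic on some network file systems, so a failed open still needs to be handled.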

urbach commented 5 years ago

thanks for the help again!

grimbough commented 5 years ago

Thanks for reporting back. HDF5's error messages leave a lot to be desired! It would have been nice for that to return a 'permission denied' error and save you a bunch of time.

urbach commented 5 years ago

yes, this would have helped in my case... ;)