Closed PeteHaitch closed 5 years ago
More digging makes me think it's an issue with the shared file system and/or permissions on this Linux machine, in conjunction with some difference between C-library HDF5 1.8.x and 1.10.x.
Using the example file created by example("h5ls") and rhdf5 2.25.8 linking to C-library HDF5 1.10.2, rhdf5::h5ls():
Using the example file created by example("h5ls") and rhdf5 2.24.0 linking to C-library HDF5 1.8.19:
(note to self: info on disks and file systems from https://jhpce.jhu.edu/policies/current-storage-offerings/).
I don't get why (3) is different when linking to C-library HDF5 1.8.x instead of 1.10.x. Any ideas?
I'll also discuss with our sysadmin to try to get (3) working for C-library HDF5 1.10.x and report back so as to close the issue.
Perhaps there are different file system settings on the two Lustre areas.
Does it work on Lustre if you set the environment variable HDF5_USE_FILE_LOCKING to FALSE?
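For anyone wanting to try this from inside R rather than the shell, a minimal sketch follows. It assumes the variable should be set in the R process before any HDF5 file is opened; setting it before loading rhdf5 is the cautious choice, not something confirmed in this thread.

```r
# Set the variable for the current R process, ideally before loading rhdf5
# or opening any HDF5 file (assumption: earlier is safer).
Sys.setenv(HDF5_USE_FILE_LOCKING = "FALSE")

# Confirm the value that the HDF5 C library will see via the environment.
Sys.getenv("HDF5_USE_FILE_LOCKING")
```

Exporting the variable in the shell before starting R should have the same effect for that session.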
It does! Is that the remedy or just an indication of a deeper issue?
note to self: http://hdf-forum.184993.n3.nabble.com/HDF5-files-on-NFS-td4029577.html discusses the environment variable HDF5_USE_FILE_LOCKING
Bumping https://github.com/grimbough/Rhdf5lib/issues/11#issuecomment-418314774
Is that the remedy or just an indication of a deeper issue?
I think that's the (slightly unsatisfactory) remedy. It looks like this only happens on Lustre file systems, and clearly not on all of them since it works in your temp area, so there must be some additional configuration parameters influencing this too.
I don't have access to any Lustre system to try things myself, but I presume this isn't even a potential problem unless you're trying to access a file multiple times. I'll add something to the package vignette, and keep an eye on the HDF5 noticeboard for additional info on this, but I don't have a better solution right now.
Thanks, Mike! If I get time I can investigate further on the Lustre filesystem. @kasperdanielhansen also may be interested in better understanding this issue and has access to the same system.
Pretty important issue and solution.
Obviously, as HDF5 usage gets more fundamental, we need to have these files work across all filesystems and disks. I would be happy to test packages.
I am having similar problems on a lustre file system:
> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 3.5.2 (2018-12-20)
os Debian GNU/Linux 9 (stretch)
system x86_64, linux-gnu
ui X11
language en_US:en
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Berlin
date 2019-05-14
─ Packages ───────────────────────────────────────────────────────────────────
package * version date lib source
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.5.2)
backports 1.1.4 2019-04-10 [1] CRAN (R 3.5.2)
callr 3.2.0 2019-03-15 [1] CRAN (R 3.5.2)
cli 1.1.0 2019-03-19 [1] CRAN (R 3.5.2)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.2)
desc 1.2.0 2018-05-01 [1] CRAN (R 3.5.2)
devtools * 2.0.2 2019-04-08 [1] CRAN (R 3.5.2)
digest 0.6.18 2018-10-10 [1] CRAN (R 3.5.2)
fs 1.3.1 2019-05-06 [1] CRAN (R 3.5.2)
glue 1.3.1 2019-03-12 [1] CRAN (R 3.5.2)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.2)
memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.2)
pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.5.2)
pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.5.2)
prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.5.2)
processx 3.3.1 2019-05-08 [1] CRAN (R 3.5.2)
ps 1.3.0 2018-12-21 [1] CRAN (R 3.5.2)
R6 2.4.0 2019-02-14 [1] CRAN (R 3.5.2)
Rcpp 1.0.1 2019-03-17 [1] CRAN (R 3.5.2)
remotes 2.0.4 2019-04-10 [1] CRAN (R 3.5.2)
rhdf5 * 2.27.19 2019-05-14 [1] Github (grimbough/rhdf5@53f3bec)
Rhdf5lib 1.7.1 2019-05-14 [1] Github (grimbough/Rhdf5lib@35d7aef)
rlang 0.3.4 2019-04-07 [1] CRAN (R 3.5.2)
rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.5.2)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.2)
usethis * 1.5.0 2019-04-07 [1] CRAN (R 3.5.2)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.2)
[1] /hadron/urbach/lib/R
[2] /usr/lib/R/site-library
[3] /hadron/urbach/R/x86_64-pc-linux-gnu-library/3.5
[4] /usr/local/lib/R/site-library
[5] /usr/lib/R/library
h5ls works
> file <- ("/hiskp4/bartek/flavor_singlet/twopt_funs/cA211a.30.32/0000/corr.0000.t15.h5")
> h5ls(file)
group name otype
0 / d+-g-u-g H5I_GROUP
1 /d+-g-u-g t15 H5I_GROUP
2 /d+-g-u-g/t15 gf4 H5I_GROUP
[...]
but H5Fopen doesn't
> h5file <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.
Any help would be highly welcome! Setting HDF5_USE_FILE_LOCKING to FALSE does not have any effect on this.
Thanks, Carsten
Weird that h5ls() works, since that's opening the file too. Does running the command rhdf5::h5disableFileLocking() improve anything?
You can also get slightly more information on the error by setting rhdf5::h5errorHandling('verbose') before running the failing code.
Hi Mike, thanks for your quick reply. Unfortunately, rhdf5::h5disableFileLocking() does not help:
> tst <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.
> rhdf5::h5disableFileLocking()
> tst <- H5Fopen(file)
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.
> rhdf5::h5errorHandling('verbose')
> tst <- H5Fopen(file)
Error in H5Fopen(file) : libhdf5
error #000: ����)V in �"$�)V(): line 509
class: HDF5
major: File accessibilty
minor: Unable to open file
error #-04: in (): line 1400
class: HDF5
major: File accessibilty
minor: Unable to open file
error #-03: @;$�)V in (): line 1546
class: HDF5
major: File accessibilty
minor: Unable to open file
error #-02: p�ξ)V in ��)V(): line 734
class: HDF5
major: Virtual File Layer
minor: Unable to initialize object
error #-01: ���)V in (): line 346
class: HDF5
major: File accessibilty
minor: Unable to open file
Does this help (apart from some locale problem, where I don't know where it comes from)?
Are the versions of rhdf5 and Rhdf5lib I use compatible?
A little more info: The problem occurs on our lustre network FS (connected via infiniband). When I copy the file to a local disk, I can open the file without problems.
Okay, I solved this. I misunderstood the default flags=h5default("H5F_ACC_RD") to mean "open for reading". I had no write permissions for the original file. When using
tst <- H5Fopen(file, flags="H5F_ACC_RDONLY")
everything works as expected.
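A small sketch of how one might guard against this in a script: check write permission first and pick the access flag accordingly. The helper name h5_access_flag and the temporary file are hypothetical and used only for illustration; the rhdf5 calls are skipped if the package is not installed.

```r
# Hypothetical helper (not part of rhdf5): choose an H5Fopen flag that matches
# what the current user is allowed to do with the file.
h5_access_flag <- function(path) {
  # file.access() returns 0 on success and -1 on failure; mode = 2 tests write permission.
  if (file.access(path, mode = 2) == 0) "H5F_ACC_RDWR" else "H5F_ACC_RDONLY"
}

# Stand-in path for illustration only, not a file from this thread.
path <- tempfile(fileext = ".h5")

if (requireNamespace("rhdf5", quietly = TRUE)) {
  rhdf5::h5createFile(path)                                   # create an empty HDF5 file
  fid <- rhdf5::H5Fopen(path, flags = h5_access_flag(path))   # open with the chosen flag
  rhdf5::H5Fclose(fid)                                        # always release the handle
}
```

If the file is only ever read, passing flags="H5F_ACC_RDONLY" directly, as above, is simpler and avoids the permission check altogether.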
thanks for the help again!
Thanks for reporting back. HDF5's error messages leave a lot to be desired! It would have been nice for that to return a 'permission denied' error and save you a bunch of time.
Yes, this would have helped in my case... ;)
Hi Mike,
I'm unable to open (specifically, I can't rhdf5::h5ls()) an existing HDF5 file using the new 1.10.x-packaged version of HDF5 in Rhdf5lib. This is on a Linux HPC with a shared file system (Lustre). Notably:
This particular file was created via HDF5Array and when it was created I think the system was linked against the C-library HDF5 v1.10.2 (do you know of some way to check this for a particular file?). This particular file is 3.9 GB; I'd be happy to share it with you if it would help in debugging.
Frustratingly, on the same machine, running the file created by example(h5ls) using BioC 3.7 and then loading the file using BioC 3.8 does work, which is making debugging rather difficult.
I can re-create this particular file but I'd really like to avoid it if possible (there are tens of other files created around a similar time that I think have the same issue).
I appreciate any suggestions and advice you can offer, Pete