openPMD / openPMD-api

:floppy_disk: C++ & Python API for Scientific I/O
https://openpmd-api.readthedocs.io
GNU Lesser General Public License v3.0

Initializing a Series fails. File can't be opened. #390

Closed pordyna closed 5 years ago

pordyna commented 5 years ago

Initializing a Series fails and an exception occurs: RuntimeError: Failed to open HDF5 [correct path to the file]. Somehow it works with the example openPMD data set from here [https://github.com/openPMD/openPMD-example-datasets]. It fails when I try to load the whole series with '%T', and also when I pass a path to a single file with a specific iteration.

To Reproduce Python:

import openPMD
path = '/bigdata/hplsim/scratch/garten70/PIConGPU/simOutput/h5/simData_%T.h5'
 # or any other  explicit iteration. 
series = openPMD.Series(path, openPMD.Access_Type.read_only)

Expected behavior The Series object should be correctly initialized.

Software Environment:

@ax3l

ax3l commented 5 years ago

Hi @pordyna, thank you for the report!

I saw the same problem recently in #382 and just fixed it in #388. Update: fixed in openPMD-api 0.6.3-alpha

This is not yet in a release, but if you want to work around this until it is, you can do either:

n01r commented 5 years ago

Hey @pordyna, can you document again what doesn't work?

The following works for me from a notebook started with the HZDR service


import openPMD
path = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%04T.h5'
series = openPMD.Series(path, openPMD.Access_Type.read_only)

for k_i, i in series.iterations.items():
    print("Iteration: {0}".format(k_i))
# output 
Iteration: 1000
Iteration: 1250
Iteration: 1500
Iteration: 1750
Iteration: 2000
Iteration: 2250
Iteration: 2500
Iteration: 2750
Iteration: 3000
Iteration: 3250
Iteration: 3500
Iteration: 3750
Iteration: 4000
Iteration: 4250
Iteration: 4500
Iteration: 4750
Iteration: 5000

pordyna commented 5 years ago

While using the HZDR service, I couldn't import the API. It does not appear to be installed properly.

import openPMD

results in: ImportError: libhdf5.so.10: cannot open shared object file: No such file or directory

@n01r, have you installed openpmd-api locally on the cluster? How did you install it, and which version are you running?

The exact same thing that works for you won't work with Jupyter running on my laptop. I access the files through sshfs.

import openPMD
path = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%04T.h5'
series = openPMD.Series(path, openPMD.Access_Type.read_only)

results in:

RuntimeError: Failed to open HDF5 file /home/pawel/Work/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5

ax3l commented 5 years ago

@pordyna you did everything correctly, please install via conda.

There is also one outdated version swirling around on our cluster (in the sys dirs), which should never have been there outside of a module or a user's home...

@n01r your example works because it skips all output < 1000. See this comment.
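One way to picture why a zero-padded pattern such as %04T could skip iterations below 1000 is sketched here. It rests on two assumptions that are not confirmed in this thread: that the files carry unpadded iteration numbers, and that %04T is read as "exactly four digits". The helper names and the directory listing are hypothetical, and this is not openPMD-api's actual matching logic.

```python
import re

def pattern_to_regex(pattern):
    """Translate a file name pattern containing %T or %0NT into a regex.

    Assumption for this sketch: %T matches any run of digits and
    %0NT matches exactly N digits. The library's real rules may differ.
    """
    match = re.search(r"%(?:0(\d+))?T", pattern)
    if match is None:
        raise ValueError("pattern contains no %T placeholder")
    width = match.group(1)
    digits = r"(\d{%s})" % width if width else r"(\d+)"
    prefix = re.escape(pattern[:match.start()])
    suffix = re.escape(pattern[match.end():])
    return re.compile("^" + prefix + digits + suffix + "$")

def matching_iterations(pattern, file_names):
    """Return the sorted iteration numbers whose file names match."""
    regex = pattern_to_regex(pattern)
    found = []
    for name in file_names:
        m = regex.match(name)
        if m:
            found.append(int(m.group(1)))
    return sorted(found)

# Hypothetical directory listing with unpadded iteration numbers.
names = ["simData_0.h5", "simData_250.h5", "simData_1000.h5", "simData_1250.h5"]
print(matching_iterations("simData_%04T.h5", names))  # [1000, 1250]
print(matching_iterations("simData_%T.h5", names))    # [0, 250, 1000, 1250]
```

Under these assumptions the padded pattern never touches the files below iteration 1000, while the unpadded pattern visits all of them.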

n01r commented 5 years ago

Yes, I saw this - but my example works also because I have a jupyter.profile lying around from the PIConGPU workshop.

In there we have

module purge
module load python/3.6.2
module load cmake/3.10.1
module load gcc/4.9.2
module load zlib/1.2.8
module load hdf5/1.8.14

# PIConGPU modules
export PICSRC=/home/$(whoami)/src/picongpu
export PYTHONPATH=$PICSRC/lib/python:$PYTHONPATH

With the newer version of HDF5 on the cluster it doesn't work. I was just about to open a ticket.

ax3l commented 5 years ago

I was just about to open a ticket.

Please don't, the problem is on the user profile side.

Please both keep the following in your jupyter.profile:

module purge
module load python/3.6.2
module load cmake/3.10.1
module load gcc/4.9.2
module load zlib/1.2.8
module load hdf5/1.8.14

# PIConGPU modules
export PICSRC=/home/$(whoami)/src/picongpu
export PYTHONPATH=$PICSRC/lib/python:$PYTHONPATH

Then run the install once:

source ~/jupyter.profile
# builds from source
pip install -U --user openpmd-api

If that works for you, you not only get a working version, but the error reported in the PR description will also be fixed: I just released openPMD-api 0.6.3-alpha with the fix for you.

pordyna commented 5 years ago

Thank you @ax3l. I don't have the jupyter.profile file. There is own.module in laser018:/data/home/<user_name>, which looks like this:

#%Module1.0#####################################################################
##
## own modulefile
##

setenv MODULES_NO_OUTPUT "1"

# modules to be loaded
module purge
#module load intel/17.2
#module load gcc/6.2.0
#module load openmpi/1.10.2

# global variables for internal script use
set     version       1.0
set     modulename    [ module-info name ]

# what-is
module-whatis   "This loads own modules configurations."

unsetenv MODULES_NO_OUTPUT 

What should jupyter.profile look like?

ax3l commented 5 years ago

That own.module looks good - keep it as empty as it currently is. Just create a text file $HOME/jupyter.profile with the following content:

module purge
module load python/3.6.2
module load cmake/3.10.1
module load gcc/4.9.2
module load zlib/1.2.8
module load hdf5/1.8.14

# PIConGPU modules (please change if different)
export PICSRC=$HOME/src/picongpu
export PYTHONPATH=$PICSRC/lib/python:$PYTHONPATH

That's all, then you can follow the instructions above to install openPMD-api from source via pip :)

The $HOME/jupyter.profile is sourced ("activated") on hypnos when you start the jupyter notebook service.
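To see whether a notebook kernel and a terminal session really share the environment set up by the profile, the same small report can be run in both places and compared. This is only a sketch using the standard library; the function name `environment_report` is made up for illustration.

```python
import os
import sys

def environment_report():
    """Collect the interpreter path and library search paths of this process."""
    def split_paths(variable):
        # Drop empty entries so an unset variable yields an empty list.
        return [p for p in os.environ.get(variable, "").split(os.pathsep) if p]

    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "LD_LIBRARY_PATH": split_paths("LD_LIBRARY_PATH"),
        "PYTHONPATH": split_paths("PYTHONPATH"),
    }

for key, value in environment_report().items():
    print(key, "=", value)
```

If the two reports differ (for example, no HDF5 directory in LD_LIBRARY_PATH inside the notebook), the profile was probably not sourced for that kernel.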

ax3l commented 5 years ago

@pordyna does it work for you? :)

pordyna commented 5 years ago

@ax3l, well, I managed to install it, but somehow it still does not work. First issue:

import openPMD

works just fine when I run it through python3 in a terminal. In a Jupyter notebook, I still get this:

ImportError                               Traceback (most recent call last)
<ipython-input-1-129ddf26a4fb> in <module>()
----> 1 import openPMD

ImportError: libhdf5.so.10: cannot open shared object file: No such file or directory

Second issue: when I run

import openPMD
path = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5'
series = openPMD.Series(path, openPMD.Access_Type.read_only)

through python3 interpreter in the terminal, I get this:

HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
  #000: H5F.c line 604 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 990 in H5F_open(): unable to open file: time = Tue Nov 13 11:16:42 2018
, name = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5', tent_flags = 1
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 992 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: H5FDsec2.c line 343 in H5FD_sec2_open(): unable to open file: name = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5', errno = 13, error message = 'Permission denied', flags = 1, o_flags = 2
    major: File accessibilty
    minor: Unable to open file
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Failed to open HDF5 file /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5

Same thing occurs for other iterations.
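As a reading aid for the last frame of the trace (errno = 13, o_flags = 2), the numbers can be decoded with the Python standard library. The sketch assumes that o_flags is the value handed to the POSIX open call; on Linux the value 2 is O_RDWR. The helper `try_open` and the throwaway file are illustrative only. When run as root, permission checks are bypassed and both attempts succeed.

```python
import errno
import os
import stat
import tempfile

print(errno.errorcode[13])   # symbolic name of errno 13
print(os.strerror(13))       # human-readable message
print(os.O_RDWR)             # value of the read-write open flag on this platform

def try_open(path, flags):
    """Return None if os.open succeeds, otherwise the errno of the failure."""
    try:
        fd = os.open(path, flags)
    except OSError as err:
        return err.errno
    os.close(fd)
    return None

# Throwaway file with read permission only (no write bit for anyone).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.h5")
    with open(sample, "wb") as handle:
        handle.write(b"placeholder")
    os.chmod(sample, stat.S_IRUSR | stat.S_IRGRP)
    read_only_result = try_open(sample, os.O_RDONLY)
    read_write_result = try_open(sample, os.O_RDWR)

print("O_RDONLY:", read_only_result)
print("O_RDWR:", read_write_result)
```

For an unprivileged user the read-only attempt succeeds and the read-write attempt is expected to report errno 13, the same pair of values seen in the trace.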

ax3l commented 5 years ago

Did you restart the jupyter notebook after you re-installed openPMD-api? Is it a Python3 notebook?

Please show me what

# please log in with a new session into hypnos first
source ~/jupyter.profile
ls $HOME/.local/lib/python3.6/site-packages/openPMD*
ldd $HOME/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so

python --version
which python
python -c "import openPMD"

python -c 'import openPMD; openPMD.Series("/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%T.h5", openPMD.Access_Type.read_only)'

shows when executed in the terminal.

With a newly started Jupyter Python3 kernel (close all previous instances and stop & restart the service first), what do two cells with the input

!ldd /home/ordyna35/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so
import openPMD

return us?

Second issue: are you sure a

ls -hal /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5
cd /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/
cd -

works under your permissions already?
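The same permission questions can be asked from Python, which helps when the check has to run inside the notebook. The following is a minimal sketch with a hypothetical helper `check_access`: it tests search (x) permission on every parent directory and read and write permission on the file itself. The demo uses a temporary file as a stand-in for the real data path.

```python
import os
import tempfile

def check_access(path):
    """Report x permission on each parent directory and r/w on the file."""
    target = os.path.abspath(path)
    parents = []
    current = target
    while True:
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        parents.append(parent)
        current = parent
    report = [(d, "x", os.access(d, os.X_OK)) for d in reversed(parents)]
    report.append((target, "r", os.access(target, os.R_OK)))
    report.append((target, "w", os.access(target, os.W_OK)))
    return report

# Stand-in for the real file; replace handle.name with the failing path.
with tempfile.NamedTemporaryFile(suffix=".h5") as handle:
    demo_report = check_access(handle.name)

for path, permission, granted in demo_report:
    print(permission, granted, path)
```

A False in the w row with True everywhere else would describe a file that can be listed and read but not opened for writing.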

n01r commented 5 years ago

I set the permissions of the whole simulation directory to g=u-w before. Should work ...

ax3l commented 5 years ago

But maybe it was not

n01r commented 5 years ago

It was definitely recursive but I'll go and double-check.

It is weird, though, that his import openPMD in the notebook is now asking for libhdf5.so.10, while for me it was asking for libhdf5.so.9 back when it didn't work.
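Which of the two library versions the dynamic loader can actually find in a given session can be probed without importing openPMD at all. This is a sketch based on ctypes from the standard library; `can_load` is a made-up helper name, and the result depends on the environment (for example LD_LIBRARY_PATH) the Python process was started with.

```python
import ctypes

def can_load(soname):
    """Return True if the dynamic loader can resolve and load soname."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True

for soname in ("libhdf5.so.9", "libhdf5.so.10"):
    print(soname, "loadable:", can_load(soname))
```

Running it once in the terminal and once in a notebook cell shows whether both see the same HDF5 installation.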

pordyna commented 5 years ago

It seems to work?

ordyna35@laser039:/data/home/ordyna35$ ls -hal /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5
-rw-r----- 1 garten70 fwt 442M Jul 27 14:01 /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5

ax3l commented 5 years ago

Does less /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_3500.h5 work? (:q to exit)

And

cd /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/
cd -

ax3l commented 5 years ago

@pordyna please log out and in to hypnos again for a fresh session, then can you please provide the output of https://github.com/openPMD/openPMD-api/issues/390#issuecomment-438221270 ?

pordyna commented 5 years ago

@ax3l So I logged out and back in, and this is what I get:

ordyna35@laser041:/data/home/ordyna35$ source ~/jupyter.profile

        This module will set up environment variables for python/3.6.2.

        This module will set up environment variables for cmake/3.10.1.

        This module will set up environment variables for gcc/4.9.2.

        This module will set up environment variables for zlib/1.2.8.

        This module will set up environment variables for hdf5/1.8.14.
ordyna35@laser041:/data/home/ordyna35$ ls $HOME/.local/lib/python3.6/site-packages/openPMD*
ls: cannot access /home/ordyna35/.local/lib/python3.6/site-packages/openPMD*: No such file or directory
ordyna35@laser041:/data/home/ordyna35$ ldd $HOME/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so
ldd: /home/ordyna35/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so: No such file or directory
ordyna35@laser041:/data/home/ordyna35$ python --version
Python 3.6.2
ordyna35@laser041:/data/home/ordyna35$ which python
/opt/pkg/devel/python/3.6.2/bin/python
ordyna35@laser041:/data/home/ordyna35$ python -c "import openPMD"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: libhdf5.so.10: cannot open shared object file: No such file or directory

So this doesn't work any more.

ordyna35@laser041:/data/home/ordyna35$ python -c 'import openPMD; openPMD.Series("/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%T.h5", openPMD.Acces_Type.read_only)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: libhdf5.so.10: cannot open shared object file: No such file or directory

I'm using only python3 notebooks.

!ldd /home/ordyna35/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so

results in

ldd: /home/ordyna35/.local/lib/python3.6/site-packages/openPMD.cpython-36m-x86_64-linux-gnu.so: No such file or directory

import openPMD results in

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-129ddf26a4fb> in <module>()
----> 1 import openPMD

ImportError: libhdf5.so.10: cannot open shared object file: No such file or directory

ax3l commented 5 years ago

@pordyna ok, so it is the system-wide openPMD-api install that someone placed there that is nagging us. Tomorrow, just do the following to install your own, updated openPMD-api, which will mask that old install.

# builds from source
pip install -U --user openpmd-api

I forgot to recommend -U initially, which will force an upgrade.

Do the same commands (https://github.com/openPMD/openPMD-api/issues/390#issuecomment-438221270) work now?

If so, your Jupyter will as well. If not, I am in the lab tomorrow and can assist you.
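To confirm which install Python would pick up (the system-wide one or the one in the user's home), the module location can be looked up without importing it, so a missing libhdf5 does not get in the way. This is a sketch using importlib from the standard library; `locate_module` is a hypothetical helper name.

```python
import importlib.util

def locate_module(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

print(locate_module("openPMD"))
```

A path below $HOME/.local would point to the user install, anything else to a copy found earlier on the search path.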
pordyna commented 5 years ago

@ax3l, so I installed it once again and it imports correctly now. However, I still can't access the simulation data through the API. h5py works just fine. This is how it looks in a notebook:

import openPMD
path = "/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%T.h5"
series = openPMD.Series(path, openPMD.Access_Type.read_only)
---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-3-a3400043c2db> in <module>()
----> 1 series = openPMD.Series(path, openPMD.Access_Type.read_only)

RuntimeError: Failed to open HDF5 file /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5

path2 = "/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5"
series2 = openPMD.Series(path2, openPMD.Access_Type.read_only)
---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-6-45f0b5479647> in <module>()
----> 1 series2 = openPMD.Series(path2, openPMD.Access_Type.read_only)

RuntimeError: Failed to open HDF5 file /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5

import h5py
/opt/pkg/devel/python/3.6.2/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

file = h5py.File(path2)
list(file['/'])
['data', 'header']

list(file['/data/5000/'])
['fields', 'particles', 'picongpu']

Executing it in a terminal gives some more info about the bug:

 python -c 'import openPMD; openPMD.Series("/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_%T.h5", openPMD.Access_Type.read_only)'

HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
  #000: H5F.c line 604 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 990 in H5F_open(): unable to open file: time = Wed Nov 14 12:14:43 2018
, name = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5', tent_flags = 1
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 992 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: H5FDsec2.c line 343 in H5FD_sec2_open(): unable to open file: name = '/bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5', errno = 13, error message = 'Permission denied', flags = 1, o_flags = 2
    major: File accessibilty
    minor: Unable to open file
Traceback (most recent call last):
  File "<string>", line 1, in <module>
RuntimeError: Failed to open HDF5 file /bigdata/hplsim/scratch/garten70/PIConGPU/159_FixPhaseSpaceMomentumMetaData/simOutput/h5/simData_5000.h5

pordyna commented 5 years ago

I noticed that h5py is using a different version of hdf5: 1.8.18.

import h5py
print(h5py.version.hdf5_version)
1.8.18

So I tried loading the 1.8.18 module instead of 1.8.14 and installing the API once again, but it doesn't seem to be available on hypnos.

module load hdf5/1.8.18
hdf5(9):ERROR:105: Unable to locate a modulefile for 'hdf5/1.8.18'

ax3l commented 5 years ago

Oh, glad the install worked! I am scratching my head about the "permission denied" and will investigate with @n01r and come back to you.

ax3l commented 5 years ago

Lol, the problem is that, for some weird reason, you need write access to the files, even in read-only mode. Can you just copy the dir for now? I'll open a bug report.
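Copying the directory could be scripted roughly as follows. This is a sketch with the standard library only; the function name `copy_for_reading` and the demo paths are hypothetical, and for the real data the source would be the simulation's h5 directory and the target a location in your own scratch or home space. Since the copy keeps the original permission bits, the sketch also sets the owner write bit on every copied file.

```python
import os
import shutil
import tempfile

def copy_for_reading(source_dir, target_dir):
    """Copy source_dir to target_dir and make every copied file user-writable."""
    shutil.copytree(source_dir, target_dir)
    for root, _dirs, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            # Add the owner write bit while keeping all other bits.
            os.chmod(path, os.stat(path).st_mode | 0o200)
    return target_dir

# Demo with throwaway directories standing in for the real paths.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "h5")
    os.mkdir(source)
    original = os.path.join(source, "simData_5000.h5")
    with open(original, "wb") as handle:
        handle.write(b"placeholder")
    os.chmod(original, 0o440)  # read-only, like the shared data

    target = copy_for_reading(source, os.path.join(tmp, "h5_copy"))
    copied = os.path.join(target, "simData_5000.h5")
    copied_exists = os.path.exists(copied)
    copied_writable = bool(os.stat(copied).st_mode & 0o200)

print(copied_exists, copied_writable)
```

Note that the files in this thread are several hundred megabytes each, so copying only the iterations that are needed may be preferable to copying the whole directory.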

pordyna commented 5 years ago

I will just use h5py for now. Thanks for your help.