issue when reading beam from file - Githubissues

Hi-PACE / hipace

Highly efficient Plasma Accelerator Emulation, quasistatic particle-in-cell code

https://hipace.readthedocs.io

issue when reading beam from file #472

Open MaxThevenet opened 3 years ago

MaxThevenet commented 3 years ago

Encountered two issues:

When using `beam.input_file = beam_input/openpmd_000000.h5`

I get the following error:

HDF5-DIAG: Error detected in HDF5 (1.10.7) MPI-process 0:
  #000: H5F.c line 429 in H5Fopen(): unable to open file
    major: File accessibility
    minor: Unable to open file
  #001: H5Fint.c line 1644 in H5F_open(): unable to open file: time = Sat May  1 08:25:19 2021
, name = 'beam_input/0.h5', tent_flags = 0
    major: File accessibility
    minor: Unable to open file
  #002: H5FD.c line 741 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: H5FDsec2.c line 360 in H5FD__sec2_open(): unable to open file: name = 'beam_input/0.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
    major: File accessibility
    minor: Unable to open file
libc++abi.dylib: terminating with uncaught exception of type openPMD::no_such_file_error: [HDF5] Failed to open HDF5 file beam_input/0.h5
SIGABRT
See Backtrace.0 file for details

When using `beam.input_file = openpmd_000000.h5`

The code prints millions of `0` and I had to kill the run. I used this input file to generate the beam, and then tried to re-run the same simulation loading the beam. The beam file is ~2 MB.

SeverinDiederichs commented 3 years ago

The quick solution to this problem is to read it with `openpmd_%T.h5`. But for user-friendliness, this should be addressed.

MaxThevenet commented 3 years ago

Thanks @SeverinDiederichs. While I agree this can be a workaround, I think passing an arbitrary file name as an input is a natural option for the user. In most cases, this file will be generated from a Python script, or even another code, so it may very well contain no digit, for instance.

@AlexanderSinn could you fix this issue, and make sure that the user can use

beam.input_file = path/to/file/filename.h5

smoothly?

AlexanderSinn commented 3 years ago

So in this case, would it be `beam.input_file = openpmd.h5` or `beam.input_file = openpmd_000000.h5`? The second option would be redundant with the iteration, as other Iterations are named e.g. openpmd_000001.h5. `openpmd_%T.h5` specifies all Iterations at once. I think this is a feature of openPMD-API.
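To illustrate how a single `%T` name can stand for several iteration files, here is a minimal Python sketch. It is not openPMD-API code: `pattern_to_regex` and `match_iterations` are hypothetical helpers, and the assumption that `%T` matches any run of digits while `%0NT` matches exactly N digits is made for illustration only.

```python
import re

def pattern_to_regex(pattern):
    # Hypothetical helper: turn "name_%T.h5" or "name_%05T.h5" into a regex.
    m = re.fullmatch(r'(.*)%(?:0(\d+))?T(.*)', pattern)
    if m is None:
        raise ValueError("pattern contains no %T placeholder: " + pattern)
    prefix, width, suffix = m.groups()
    # Assumption: a padded placeholder matches exactly that many digits.
    digits = r'\d{%s}' % width if width else r'\d+'
    return re.compile(re.escape(prefix) + '(' + digits + ')' + re.escape(suffix))

def match_iterations(pattern, filenames):
    # Map iteration number -> file name for every file the pattern covers.
    rx = pattern_to_regex(pattern)
    found = {}
    for name in filenames:
        m = rx.fullmatch(name)
        if m:
            found[int(m.group(1))] = name
    return found

files = ["openpmd_000000.h5", "openpmd_000001.h5", "notes.txt"]
print(match_iterations("openpmd_%T.h5", files))
```

With the file list above, both `openpmd_*.h5` files are picked up as iterations 0 and 1, while `notes.txt` is ignored.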

MaxThevenet commented 3 years ago

I think it is good to keep the option to specify the iterations as you do; this is useful when e.g. restarting a simulation. But in many cases the user doesn't even have iterations, so we shouldn't rely on that. In this case it would be

beam.input_file = any/path/any_file_name.h5

It is better not to assume a given name for the input file. Could this be another option?

AlexanderSinn commented 3 years ago

I found this in openPMD-Viewer. I think I could put it in Hipace:

# match last occurrence of integers and replace with %T wildcards
# examples: data00000100.h5 diag4_00000500.h5 io12.0.bp
#           te42st.1234.yolo.json scan7_run14_data123.h5
file_path = re.sub(r'(\d+)(\.(?!\d).+$)', r'%T\2', first_file_name)
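For reference, the substitution above can be tried on its own. The sketch below wraps the same regular expression in a hypothetical function, `to_wildcard` (the name is not from openPMD-viewer), and runs it on the example names from the comment plus a name without any integer before the extension.

```python
import re

def to_wildcard(first_file_name):
    # Replace the first run of digits that is immediately followed by a dot
    # and a non-digit character with the %T wildcard.
    return re.sub(r'(\d+)(\.(?!\d).+$)', r'%T\2', first_file_name)

examples = [
    "data00000100.h5",
    "diag4_00000500.h5",
    "io12.0.bp",
    "te42st.1234.yolo.json",
    "scan7_run14_data123.h5",
    "any_file_name.h5",  # no integer before the extension
]
for name in examples:
    print(name, "->", to_wildcard(name))
```

A name such as `any_file_name.h5` comes back unchanged, because no run of digits sits directly before a dot. A name such as `run2.h5` is rewritten to `run%T.h5` whether or not the file is actually part of a file-based series.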

MaxThevenet commented 3 years ago

If that works even if the input file name has no integer, then sure. But let's not make it over-complicated; the option to just specify a file should be quite basic.

AlexanderSinn commented 3 years ago

In the examples, %T is always used: https://openpmd-api.readthedocs.io/en/0.13.3/usage/firstread.html. I don't think it's possible to just put in the file name, unless the file doesn't have numbers in it.

AlexanderSinn commented 3 years ago

Now that I have looked into it a bit more, the underlying misconception is the following: in openPMD-API, Iterations can be either file-based or group-based. When group-based, all Iterations are in the same file, which gets written and read by its actual name. When file-based, all Iterations are in different files in the same folder. In the writing script, they are referred to as e.g. "beam_%05T.h5" and have file names like beam_00000.h5, beam_00001.h5, …. These have to be read in as beam_%T.h5 for openPMD to know to look for file-based data. Currently the output of Hipace++ is file-based. What you are trying to do, reading in file-based data with the file name (suggesting group-based data), actually works with a warning message in the Python version of openPMD-API; however, that seems to be broken in C++. In Hipace we could just detect a number at the end of a file input string and change it to %T pretty easily. However, if the file is actually group-based and just happens to have a number at the end of it, the name would still get changed, resulting in an error when reading the file. I think this issue is either a "won't fix" or a solution would need to use try and catch.
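The try-and-catch idea could look roughly like the Python sketch below. It is only an illustration of the control flow, not Hipace or openPMD-API code: `open_beam` and `to_file_based_pattern` are hypothetical names, and `open_series` is a stand-in for whatever call opens the series, assumed to raise an exception when the name cannot be read.

```python
import re

def to_file_based_pattern(path):
    # Same substitution as in openPMD-viewer: digits before the extension -> %T.
    return re.sub(r'(\d+)(\.(?!\d).+$)', r'%T\2', path)

def open_beam(path, open_series):
    # open_series is a hypothetical callable that raises on failure.
    candidate = to_file_based_pattern(path)
    if candidate != path:
        try:
            # First assume file-based data and try the %T form of the name.
            return open_series(candidate)
        except Exception:
            # Broad catch on purpose in this sketch: fall back to the
            # name exactly as the user wrote it (group-based data).
            pass
    return open_series(path)

def fake_opener(readable_names):
    # Test double: "opens" only the names it was told are readable.
    def _open(name):
        if name not in readable_names:
            raise FileNotFoundError(name)
        return name
    return _open

print(open_beam("beam_000000.h5", fake_opener({"beam_%T.h5"})))
print(open_beam("run2.h5", fake_opener({"run2.h5"})))
```

The fallback only helps if the first attempt fails with an exception. A reader that silently returns wrong data instead of raising would not be caught this way.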

AlexanderSinn commented 3 years ago

In case it wasn't clear: currently the beam path/name is passed directly to openPMD-API and is not changed at all in Hipace. Furthermore, Hipace does not know whether the beam is group-based or file-based.

MaxThevenet commented 3 years ago

Update: this should be fixed in the next openPMD release; it was fixed in https://github.com/openPMD/openPMD-api/pull/938. Once it is out, we'll just change the tag of the openPMD release we're using, and we should be good to go. I would suggest adding a CI test similar to https://github.com/ECP-WarpX/WarpX/pull/1937, and just updating the names when the openPMD release is out.