PlasmaPy / PlasmaPy-PLEPs

A repository for PlasmaPy Enhancement Proposals
http://docs.plasmapy.org/

Create PLEP on adopting OpenPMD standard? #13

Open namurphy opened 6 years ago

namurphy commented 6 years ago

A PlasmaPy issue was raised recently about adopting the OpenPMD standard for particle-mesh data. I personally think that this would be a great topic for a PLEP, since it is an important design decision and it would also be really helpful to have a design document to refer to during development that clarifies what adopting this standard would entail and how we would go about implementing it. I would be happy to help with drafting this, though at this point I have just a basic knowledge of what this standard is.

Original topic: https://github.com/PlasmaPy/PlasmaPy/issues/167
openPMD repositories: https://github.com/openPMD
openPMD standard: https://github.com/openPMD/openPMD-standard

tulasinandan commented 6 years ago

I like the standard defined by these people. Looks like we do not have a PLEP yet. Maybe we should create one. I'll look into it within the next couple of working days.

ax3l commented 6 years ago

Hi there, we are excited that you consider openPMD!

A lot has been added since we last wrote and I just want to give you a short heads-up.

openPMD 2.0 has incorporated a lot of features, and if there weren't a ton of conferences ahead we would finish it very soon. But we will get there. There is also a new module/tool called openPMD-updater that can forward-update existing openPMD 1.X files with lightweight metadata updates as soon as 2.0.0 is finalized.

You come at a great time for adopting it, since we just started to implement a high-level API! openPMD-api has reached alpha state for C++ and has first Python bindings as well. (Manual, Install) For HDF5, we are h5py-compatible in output and add a nice high-level description that actually understands the self-describing openPMD physics and objects, without fiddling with low-level file APIs. Speaking of low-level: we already support HDF5 and ADIOS as file backends and plan for NetCDF as well (and even more ;-) ). Also, the same API works if you need scalable I/O for large parallel applications.

That's it so far, feel free to ping us anytime :)

StanczakDominik commented 6 years ago

Heyo! We had a talk about this topic today during our bi-weekly telecon (by the way, would you be interested in participating some time in the future?) as this is relevant to @ritiek's work over in https://github.com/PlasmaPy/PlasmaPy/pull/500.

At the moment, we're kind of wrangling with a few design decisions:

At the same time, I personally think that:

But of course there may be other approaches; for example, we could work with you upstream and ensure the CMake compilation does what it's supposed to with pip. I haven't tried, but it could theoretically be feasible.

This discussion seems important, so I'll tag everyone from today's telecon: @namurphy @SolarDrew @tulasinandan @ritiek @samurai688, and I guess @lemmatum might be likely to care.

ax3l commented 6 years ago

Hi, thank you for your thoughts and explanations!

Regarding the API: using it is, all in all, just a convenient offer; we ourselves have lived well with h5py and native bindings over the last years to write openPMD markup :-) We have a validator script that can simply check whether what one wrote is correct.
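To make concrete what that markup looks like at the root level, here is a minimal, illustrative sketch (plain Python, no HDF5 needed) of the kind of check the validator performs. The attribute names follow openPMD 1.x, but the set shown is only a small subset of what the standard and the real validator script cover:

```python
# Illustrative subset of the root-level attributes required by openPMD 1.x.
# The real validator checks much more (per-iteration and per-record
# attributes, datatypes, unit dimensions, ...).
REQUIRED_ROOT_ATTRS = {
    "openPMD",            # standard version string, e.g. "1.1.0"
    "openPMDextension",   # bitmask of enabled extensions
    "basePath",           # fixed to "/data/%T/" in openPMD 1.x
    "iterationEncoding",  # "fileBased" or "groupBased"
    "iterationFormat",    # how iterations are named, e.g. "data%T.h5"
}

def check_root_attrs(attrs: dict) -> list:
    """Return a list of problems found in a file's root attributes."""
    problems = [f"missing attribute: {name}"
                for name in sorted(REQUIRED_ROOT_ATTRS - attrs.keys())]
    if attrs.get("iterationEncoding") not in (None, "fileBased", "groupBased"):
        problems.append("iterationEncoding must be 'fileBased' or 'groupBased'")
    return problems

# Example: a minimal, valid-looking set of root attributes.
root = {
    "openPMD": "1.1.0",
    "openPMDextension": 0,
    "basePath": "/data/%T/",
    "iterationEncoding": "groupBased",
    "iterationFormat": "/data/%T/",
}
print(check_root_attrs(root))                  # -> []
print(check_root_attrs({"openPMD": "1.1.0"}))  # lists the missing names
```

With h5py, the same checks would run against `f.attrs` of the file's root group instead of a plain dict.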

@ritiek I would love your feedback as an issue over in openPMD-api on whatever might have caused you pain - I want to hear all of it and see how we can improve it! Install from source must be as easy as `cmake .. && make && make install`, otherwise we failed. But let us discuss the API install in a separate issue; we are still in alpha and need feedback on things that are unexpectedly troublesome. In contrast to the C++ API (alpha), the Python API is just getting started. Even if you use h5py, which is totally reasonable for PlasmaPy as well, we would love your feedback on the Python API if you have the time. Thoughts like "ugh, returning X is quite unpythonic; why not interface as Y instead?" are very welcome.

Generally, we can totally publish binary and source packages to PyPI (draft of a setup.py) as well if need be, e.g. if someone wants to assist us with setting up a multibuild so we can automate the process as we automate our conda releases.

That said, I personally think you will generally reach a more modern and more stable experience for users with conda, since it is a better-controlled system for the various libraries you want a controlled install and performance with. Also, the currently standardized Python wheels (e.g. manylinux1/manylinux2010) are extremely ancient in their compiler requirements, which is caused by some distributions and trickles down to poor PyPI. For openPMD we are agnostic about the distribution channel, publish most things to PyPI and conda, and even go the extra mile to support the old compilers on that list. If we can automate the shipping in a controlled, open build environment (for osx, win and linux) and the user experience is not horrible with it, we will ship ;-)

That said, don't wait for us if you like h5py and the HDF5 backend is sufficient for your use case! h5py works great for us as well and we are compatible (e.g. in weird things like representation of bools and string attributes - just make sure to run our validator on your data).

> by the way, would you be interested in participating some time in the future?

certainly! :)

ritiek commented 6 years ago

@ax3l I am really glad you're putting so much effort into having people adopt these standards, which I hope people really do!

> Install from source must be as easy as `cmake .. && make && make install`, otherwise we failed.

Yep, it did build smoothly with no problems when I passed no arguments, but there was some trouble when passing `-DopenPMD_USE_PYTHON=ON` and `-DopenPMD_USE_HDF5` to cmake.

It's been a while since I tried to build, but AFAIK there was some error regarding `MPI` and `HDF5`, which I think [building HDF5 and h5py with MPI support fixed](https://gist.github.com/kentwait/280aa20e9d8f2c8737b2bec4a49b7c92). It would be very nice if it pointed to the relevant docs for building with MPI support. (Also, it isn't immediately clear what MPI support does; it might be good to have some docs on what additional features it brings to openPMD-api. :)

That error regarding `HDF5` and `MPI` went away, but now I face this one:

```
-- HDF5: Using hdf5 compiler wrapper to determine C configuration
-- Can NOT find 'adios_config' - set ADIOS_ROOT, ADIOS_DIR or INSTALL_PREFIX, or check your PATH
-- Could NOT find ADIOS (missing: ADIOS_LIBRARIES ADIOS_INCLUDE_DIRS) (Required is at least version "1.13.1")
CMake Error at CMakeLists.txt:176 (find_package):
  Could not find a package configuration file provided by "pybind11"
  (requested version 2.2.1) with any of the following names:

    pybind11Config.cmake
    pybind11-config.cmake

  Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
  "pybind11_DIR" to a directory containing one of the above files.  If
  "pybind11" provides a separate development package or SDK, be sure it has
  been installed.

-- Configuring incomplete, errors occurred!
See also "/home/ritiek/Downloads/openPMD-api-build/CMakeFiles/CMakeOutput.log".
```

I tried looking it up, and I think installing [ADIOS](https://github.com/ornladios/ADIOS) might fix it, but it was mentioned under the optional I/O dependencies in the README (and/or it could be that I have pybind11 installed incorrectly), so I haven't really tried installing it yet. I guess I'll tinker a bit more and might raise an issue in the openPMD-api repo with feedback on what could have been better from my perspective. :)

---

Also, I tried installing it with spack:

```
spack install openpmd-api +python
spack load --dependencies openpmd-api
```

but for some reason spack always ended up installing Python 2.7.15, even when I created a symbolic link at `/usr/bin/python` pointing to my Python 3.6 binary (I wanted it to build openPMD-api for Python 3.6). Initially `spack python` did say it was running Python 3.6, but that changed when I ran `spack install openpmd-api +python`. I am not really sure whether this is something to do with openPMD-api or spack itself; I haven't used spack before this, so I can't say how it works. Do you guys know of a way to specify the Python version?

And the conda installation really did go smoothly, no errors in da way. :D

---

That said, I am myself very new to HDF5 and openPMD in general, so it may just be something to do with my lack of clear understanding at the moment. :/
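One thing I might try next for the pybind11 part, which seems to just be CMake not finding the package: pointing `CMAKE_PREFIX_PATH` at a pip-installed pybind11 and switching the optional ADIOS backend off. A sketch only; the `python -m pybind11 --cmakedir` helper and the exact `openPMD_USE_ADIOS1` option name are guesses on my part, to be checked against the README:

```shell
# Sketch: point CMake at a pip-installed pybind11 and skip the optional
# ADIOS backend, so only the HDF5 + Python parts get configured.
pip install --user pybind11

cmake .. \
  -DopenPMD_USE_PYTHON=ON \
  -DopenPMD_USE_HDF5=ON \
  -DopenPMD_USE_ADIOS1=OFF \
  -DCMAKE_PREFIX_PATH="$(python -m pybind11 --cmakedir)"
make && make install
```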
ax3l commented 6 years ago

@ritiek excellent, thank you for the details, wonderful!

I would like to respond to you, but am afraid of spamming your PLEP proposal. Can we move it to a bug report in https://github.com/openPMD/openPMD-api/issues/new/choose ? (Just copy & paste your last comment and link here.)

ax3l commented 4 years ago

Little update: openPMD 1.x is implemented with https://github.com/PlasmaPy/PlasmaPy/pull/500 (closing https://github.com/PlasmaPy/PlasmaPy/issues/167 ) via h5py :rocket: :sparkles:

Adding to what I said earlier: our low-level reference implementation openPMD-api might be useful for any kind of data pre-/post-processing in the future as well (or you could use it for I/O); it is also currently being advanced for staged (in-transit) workflows. It now provides C++11 and Python 3 bindings and is shipped via:

Current file backends include: JSON, HDF5, ADIOS1, and ADIOS2; the latter three support serial as well as MPI-parallel I/O.
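To give a feel for the hierarchy all those backends share, here is a rough, stdlib-only sketch of a group-based openPMD series rendered as JSON. The layout (a `basePath` of `/data/%T/`, a meshes group per iteration, per-iteration time metadata) follows openPMD 1.x, but the real JSON backend stores extra bookkeeping (datatypes etc.), and names like the `rho` mesh here are purely illustrative:

```python
import json

# Hypothetical sketch of a group-based openPMD hierarchy as JSON.
# Only the layout is meant to be instructive, not the exact encoding.
series = {
    "attributes": {
        "openPMD": "1.1.0",
        "iterationEncoding": "groupBased",
        "basePath": "/data/%T/",
        "meshesPath": "meshes/",
    },
    "data": {
        "0": {  # iteration 0, i.e. the group /data/0/
            "attributes": {"time": 0.0, "dt": 1.0, "timeUnitSI": 1.0},
            "meshes": {
                "rho": {  # an illustrative scalar mesh record
                    "attributes": {"unitSI": 1.0, "gridSpacing": [1.0, 1.0]},
                    "values": [[0.0, 0.1], [0.2, 0.3]],
                }
            },
        }
    },
}

text = json.dumps(series, indent=2)
# Round-trips cleanly, and iterations/records are addressable by name:
assert json.loads(text)["data"]["0"]["meshes"]["rho"]["values"][1][1] == 0.3
```

The same nesting maps one-to-one onto HDF5 groups/attributes/datasets, which is why h5py output can stay compatible.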