parthenon-hpc-lab / parthenon

Parthenon AMR infrastructure
https://parthenon-hpc-lab.github.io/parthenon/

ADIOS I/O #946

Open BenWibking opened 11 months ago

BenWibking commented 11 months ago

ORNL has optimized ADIOS2 for its systems and vice versa. In particular, Frontier is optimized for file-per-process I/O and ADIOS2 can do this with a single configuration variable change (with its BP5 backend: https://adios2.readthedocs.io/en/v2.9.1/engines/engines.html#bp5).

See also: https://www.exascaleproject.org/wp-content/uploads/2021/02/ADIOS_tutorial_ECP_AHM_Apr2021_full.pdf

With the BP5 backend, it is also significantly faster than either HDF5 or binary MPI I/O (e.g., https://www.scd.stfc.ac.uk/SiteAssets/Pages/CIUK-2021-Poster-Competition/POSTER_Shrey_Bhardwaj.pdf).

ADIOS also supports directly passing GPU buffers to its APIs, while HDF5 does not: https://adios2.readthedocs.io/en/v2.9.1/advanced/gpu_aware.html

Yurlungur commented 11 months ago

How portable is ADIOS as far as data analysis tooling? Is it easy to install, e.g., a Python reader?

BenWibking commented 11 months ago

I don't think it's as easy to install as pip install h5py, unfortunately. It does have a Spack package, and it seems to usually work.

On the other hand, it does come with its own Python API included: https://adios2.readthedocs.io/en/v2.9.1/api_high/api_high.html#python-high-level-api

Yurlungur commented 11 months ago

Hmm my experience with spack for python packages is very poor. If you use spack for python, you better throw away your entire other python stack and use spack for everything. If you let things mix, it's painful.

Though if the setup.py build/install process works OK, it might be fine.

BenWibking commented 11 months ago

ParaView and VisIt are also supported via an XML metadata file (not sure if it's identical to the XDMF metadata format, but it looks very similar): https://adios2.readthedocs.io/en/v2.9.1/ecosystem/visualization.html

BenWibking commented 11 months ago

> Hmm my experience with spack for python packages is very poor. If you use spack for python, you better throw away your entire other python stack and use spack for everything. If you let things mix, it's painful.
>
> Though if the setup.py build/install process works OK, it might be fine.

Yeah, I've experienced this...

For a different can of worms, there's a Conda package: https://adios2.readthedocs.io/en/v2.9.1/setting_up/setting_up.html#conda

BenWibking commented 11 months ago

The main motivation for this is the extremely poor parallel HDF5 performance we've experienced on Frontier (it just can't write compressed/chunked datasets at scale -- at least, without a ridiculous amount of tuning).

BenWibking commented 11 months ago

This is a nice wrapper around the ADIOS API: https://github.com/openPMD/openPMD-api See examples: https://openpmd-api.readthedocs.io/en/latest/usage/firstwrite.html

BenWibking commented 11 months ago

All of the existing codes that use openPMD use 1 var per level and sparsely fill it (natively supported by ADIOS).

Will that work for Parthenon? We have to convert to a global index space, but I don't immediately see any reasons why we couldn't do that. Then we would have to add custom attributes to store the block metadata.
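
The global-index conversion could be sketched roughly as follows. This is only an illustration, not Parthenon or openPMD code: it assumes a 3D mesh, a refinement factor of 2 per level, equal-sized blocks, and that each block knows its integer block coordinates at its own level. The names `block_offset_extent`, `level_shape`, `lloc`, and `nx` are hypothetical.

```python
from typing import Tuple

Index3 = Tuple[int, int, int]


def block_offset_extent(lloc: Index3, nx: Index3) -> Tuple[Index3, Index3]:
    """Return (offset, extent) of one block in its level's global index space.

    lloc: hypothetical logical block coordinates at the block's own level
    nx:   interior (ghost-free) cell counts per block in each direction
    """
    offset = tuple(l * n for l, n in zip(lloc, nx))
    extent = tuple(nx)
    return offset, extent


def level_shape(root_cells: Index3, level: int) -> Index3:
    """Global cell counts of a level, assuming refinement by 2 per level."""
    return tuple(n * 2**level for n in root_cells)


# Example: 16^3 blocks on a 64^3 root grid, one block at level 1.
offset, extent = block_offset_extent((3, 0, 5), (16, 16, 16))
shape = level_shape((64, 64, 64), 1)
```

Each block would then be written as one chunk at `offset` with size `extent` into the per-level variable of size `shape`, leaving the regions not covered by that level unfilled.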

BenWibking commented 9 months ago

@pgrete It turns out that openPMD does not really support ghost cells (you can write them, but the analysis tools that read openPMD won't mask them out -- see https://github.com/openPMD/openPMD-api/issues/1044).

Is there a way to allocate new MeshBlocks without ghost cells just for the I/O?

forrestglines commented 9 months ago

> Is there a way to allocate new MeshBlocks without ghost cells just for the I/O?

I think creating a variable with metadata flags Independent and OneCopy is what you want. However, I'm not sure that keeping that data allocated all the time is what we need. I think your suggestion in another chat to just allocate another view would work.

Ideally, I think we'd like to pass in a Kokkos subview to openPMD that excludes ghost zones. I have no idea how openPMD would treat a subview though.
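
For reference, the effect of excluding ghost zones can be sketched as a copy of the interior into a contiguous buffer. This is a plain-Python illustration under stated assumptions, not Kokkos or Parthenon code: the block is assumed to be stored as a flat, row-major array with the same ghost width on every face, and `strip_ghosts` is a hypothetical helper.

```python
def strip_ghosts(data, n_interior, n_ghost):
    """Copy the interior of a flat, row-major 3D block into a new list.

    data:       flat list of length (nk+2g)*(nj+2g)*(ni+2g)
    n_interior: (nk, nj, ni) interior cell counts
    n_ghost:    ghost-zone width g, assumed equal on every face
    """
    nk, nj, ni = n_interior
    g = n_ghost
    tj, ti = nj + 2 * g, ni + 2 * g  # padded sizes in j and i
    out = []
    for k in range(g, g + nk):
        for j in range(g, g + nj):
            # First interior cell of this (k, j) row in the padded array.
            start = (k * tj + j) * ti + g
            out.extend(data[start:start + ni])
    return out


# A 2x2x2 interior with one ghost layer: the padded block is 4x4x4.
padded = list(range(4 * 4 * 4))
interior = strip_ghosts(padded, (2, 2, 2), 1)
```

A subview would describe the same interior region without copying, but its memory is strided rather than contiguous, which is why the question of how the I/O layer treats it matters.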

BenWibking commented 9 months ago

> Ideally, I think we'd like to pass in a Kokkos subview to openPMD that excludes ghost zones. I have no idea how openPMD would treat a subview though.

It can handle raw device pointers just like Ascent. The issue is that the buffer needs to stay alive until the write finishes, which is not guaranteed to happen until flush() is called. It's designed to write async, so it will work correctly but kill performance if we flush after each MeshBlock is passed.
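
The lifetime constraint can be illustrated with a toy model. `MockDeferredWriter` below is a made-up stand-in, not the openPMD or ADIOS2 API; it only mimics the behavior described above, where storing a buffer registers a reference and the data is not written until `flush()` is called.

```python
class MockDeferredWriter:
    """Toy stand-in for an async writer; NOT the openPMD or ADIOS2 API."""

    def __init__(self):
        self.pending = []     # buffers registered but not yet written
        self.written = []     # data that has reached "storage"
        self.flush_calls = 0

    def store(self, name, buffer):
        # Only a reference is kept: the caller must not free or reuse
        # `buffer` until flush() returns.
        self.pending.append((name, buffer))

    def flush(self):
        self.flush_calls += 1
        for name, buffer in self.pending:
            self.written.append((name, list(buffer)))
        self.pending.clear()


def write_blocks(writer, blocks):
    """Register every block, then flush once for the whole batch."""
    keep_alive = []  # owns the staging buffers until the flush completes
    for name, data in blocks:
        staging = list(data)  # e.g. a ghost-free copy of the block
        keep_alive.append(staging)
        writer.store(name, staging)
    writer.flush()       # single flush instead of one per block
    keep_alive.clear()   # safe to release only after flush()


writer = MockDeferredWriter()
write_blocks(writer, [("block0", [1.0, 2.0]), ("block1", [3.0, 4.0])])
```

The trade-off is memory: batching keeps every staging buffer alive until the single flush, so in practice one might flush per level or per fixed number of blocks rather than once per output.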

BenWibking commented 9 months ago

To illustrate how it works, here's a version that works with my AMReX-based code: https://github.com/quokka-astro/quokka/blob/86e7da7cf6868a41a6f5bd2e8d072e1a2c45963f/src/openPMD.cpp

pgrete commented 9 months ago

We'll likely need some additional logic (e.g., to handle sparse variables), but that logic is already in place: we fill a contiguous host I/O buffer for the HDF5 output, so it can be reused. @BenWibking are you actively working on an OpenPMD frontend for Parthenon? Just checking so that we don't simultaneously work on the same thing.

BenWibking commented 9 months ago

No, not actively working on it at the moment. If you're coding it up, I'll let you have it :)

BenWibking commented 9 months ago

@pgrete FYI the yt frontend does NOT work for the ADIOS2 backend (https://github.com/yt-project/yt/issues/4757). If you are working on the Parthenon integration, then I will take on fixing the yt frontend so it works for openPMD/ADIOS2.

Yurlungur commented 9 months ago

I am about to pull through #949. Should we coordinate that refactor with the ADIOS work as well?

pgrete commented 9 months ago

ADIOS2/OpenPMD will be a completely new output object, so no need to coordinate/wait.

pgrete commented 9 months ago

> @pgrete FYI the yt frontend does NOT work for the ADIOS2 backend (yt-project/yt#4757). If you are working on the Parthenon integration, then I will take on fixing the yt frontend so it works for openPMD/ADIOS2.

Sounds like a plan. I already read the specs and think I should be able to pull this through in a straightforward manner (once my schedule clears up).