openPMD / openPMD-api

:floppy_disk: C++ & Python API for Scientific I/O
https://openpmd-api.readthedocs.io
GNU Lesser General Public License v3.0

How do I specify whether a chunk has ghost zones or not? #1044

Open eschnett opened 3 years ago

eschnett commented 3 years ago

In a simulation, one often stores blocks of data surrounded by a number of ghost zones to allow stencil operations. When writing data to a file, one can either output these ghost zones as well (e.g. for debugging) or not.

Does openPMD make any assumptions about whether chunks have ghost zones? How can I describe whether ghost zones are present?

ax3l commented 3 years ago

We don't make assumptions about this in openPMD or the API. I know that ADIOS2 generally supports writing overlapping boxes (selections).

We haven't needed that feature yet since our codes crop the guards away before writing. But if that's of interest to you, then adding (ADIOS2) tests in here that write in such a manner and check that the read looks legit would definitely be ok :+1:

eschnett commented 3 years ago

I am looking for a way to tell the reader that such ghost zones, if they exist, need to be ignored. When chunks on the same mesh overlap, it's straightforward to find out which parts to ignore. However, if there are meshes with multiple resolutions, then it's not clear which ones to ignore: I want the ghost zones of the finer grids to be ignored, and otherwise want the coarse grids to be ignored where they are overlapped by fine grids.
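A minimal sketch of the second half of that rule (coarse cells hidden by fine grids), in plain Python. The `Box` layout, the function names, and the integer refinement `ratio` are assumptions for illustration and are not part of openPMD or openPMD-api. The fine boxes passed in are interiors only, i.e. already without ghost zones.

```python
from typing import List, Tuple

# A box is (lower corner, exclusive upper corner) in cell indices.
Box = Tuple[Tuple[int, ...], Tuple[int, ...]]


def coarsen(box: Box, ratio: int) -> Box:
    """Map a fine-level box to the coarse cells it touches."""
    lo, hi = box
    coarse_lo = tuple(l // ratio for l in lo)
    # Ceiling division, so partially covered coarse cells are included.
    coarse_hi = tuple(-(-h // ratio) for h in hi)
    return coarse_lo, coarse_hi


def contains(box: Box, cell: Tuple[int, ...]) -> bool:
    """True if the cell index lies inside the box."""
    lo, hi = box
    return all(l <= c < h for l, c, h in zip(lo, cell, hi))


def coarse_cell_is_valid(
    cell: Tuple[int, ...], fine_interiors: List[Box], ratio: int
) -> bool:
    """A coarse cell is valid unless some fine interior box covers it."""
    return not any(contains(coarsen(b, ratio), cell) for b in fine_interiors)


# One fine box with interior cells [8, 16) x [8, 16), refinement ratio 2.
fine_interiors = [((8, 8), (16, 16))]
print(coarse_cell_is_valid((3, 3), fine_interiors, 2))  # True
print(coarse_cell_is_valid((4, 4), fine_interiors, 2))  # False
```

The reader can only apply this if it knows the fine interiors, which is exactly the information that is lost when chunks are written with ghost zones and nothing marks them.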

ax3l commented 3 years ago

> I am looking for a way to tell the reader that such ghost zones, if they exist, need to be ignored.

You mean on a per-chunk (block) level? I think that's information that the writer would need to specify. From the ADIOS viewpoint at read time, you access an ADIOS variable via a selection (box) and ADIOS will read one of the duplicate values to fill your requested selection (box).

That way you would not notice if some data was written in a duplicate way, you just select what you need. But I guess your workflow has one more step to it? E.g., you query the available chunks and their dimensions first or so and would like to limit their presented extent? I think that's extra information we would need to pass then at write time in some way.

I am sure they had some use-cases that wrote additional data before, should we sync up with them to think this through from start to end? Do the ghost zones have the same size for a mesh at a given resolution, or does the size of the ghost per direction vary for each chunk (block)?

A naïve question: wouldn't it be better not to write duplicate data (output size-wise) and rebuild ghosts after read?

eschnett commented 3 years ago

When the ghost zones live on the same level, then it does not matter which ones are used. However, if there are different levels, then each level consists of a set of boxes (independent of how they are laid out as chunks). Ghost zones might extend outside these boxes, and those need to be ignored.

Maybe it would be best to store an attribute that defines a list of boxes (a "region"...) for each refinement level. If there are no ghost zones, then this region can be reconstructed from the list of all chunks. With ghost zones, such an attribute is necessary.
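As a rough sketch of how a reader could use such a region, the snippet below intersects a chunk that was written with ghost zones against the list of boxes for its level. The `regions` dictionary stands in for the proposed attribute and is hypothetical, as are the function names. It assumes the boxes of one level do not overlap each other.

```python
from typing import Dict, List, Optional, Tuple

# A box is (lower corner, exclusive upper corner) in cell indices.
Box = Tuple[Tuple[int, ...], Tuple[int, ...]]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    lo = tuple(max(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(min(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi) if all(l < h for l, h in zip(lo, hi)) else None


def valid_parts(chunk: Box, region: List[Box]) -> List[Box]:
    """Parts of a chunk that lie inside the level's region."""
    parts = (intersect(chunk, box) for box in region)
    return [p for p in parts if p is not None]


# Hypothetical region attribute: one list of boxes per refinement level.
regions: Dict[int, List[Box]] = {
    0: [((0, 0), (16, 16))],
    1: [((8, 8), (16, 16)), ((16, 8), (24, 16))],
}

# A level-1 chunk: the box [8, 16) x [8, 16) plus 2 ghost cells per side.
chunk = ((6, 6), (18, 18))
print(valid_parts(chunk, regions[1]))
# [((8, 8), (16, 16)), ((16, 8), (18, 16))]
```

Ghost cells that fall inside a neighbouring box of the same level survive the clipping, which matches the observation that same-level duplicates are harmless. Ghost cells outside the region are dropped.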

ax3l commented 3 years ago

That sounds good, I would use a variable though over an attribute since the lengths of boxes can be pretty long, I guess. Or are the ghosts on a level for all boxes the same (per direction)? In that case we can do a simple attribute as well, similar to how we indicate grid spacing and grid offsets :)

BenWibking commented 10 months ago

This is also something that would be of interest for our code.

It looks like negative indices are not supported. For internal ghost zones, this is not an issue, but for physical boundaries, it is often of interest to look at their values for debugging. Those need to be marked as outside the normal region to be used for visualization.

BenWibking commented 10 months ago

> That sounds good, I would use a variable though over an attribute since the lengths of boxes can be pretty long, I guess. Or are the ghosts on a level for all boxes the same (per direction)? In that case we can do a simple attribute as well, similar to how we indicate grid spacing and grid offsets :)

In our code, the number of ghosts on a given level is always the same (for all boxes, and for all directions).

eschnett commented 10 months ago

In my case the number of ghosts can be described via 3 numbers (they can be different in different directions).
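To illustrate what a reader would do with those 3 numbers, here is a small sketch that shrinks a chunk's offset and extent to its interior. The name `strip_ghosts` and the `n_ghost` tuple are hypothetical. The sketch assumes the same ghost width on the lower and upper side of each direction, and that the stored offsets live in an index space that already includes the ghost cells.

```python
from typing import Sequence, Tuple


def strip_ghosts(
    offset: Sequence[int], extent: Sequence[int], n_ghost: Sequence[int]
) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Return (offset, extent) of the chunk interior, without ghost cells."""
    if any(e <= 2 * g for e, g in zip(extent, n_ghost)):
        raise ValueError("chunk is too small to contain the stated ghost zones")
    # Skip the lower ghost layer in each direction.
    inner_offset = tuple(o + g for o, g in zip(offset, n_ghost))
    # Remove both the lower and the upper ghost layer.
    inner_extent = tuple(e - 2 * g for e, g in zip(extent, n_ghost))
    return inner_offset, inner_extent


# Hypothetical per-direction ghost widths, as they might be read from an attribute.
n_ghost = (2, 2, 1)
print(strip_ghosts((0, 30, 62), (36, 36, 34), n_ghost))
# ((2, 32, 63), (32, 32, 32))
```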

BenWibking commented 10 months ago

> In my case the number of ghosts can be described via 3 numbers (they can be different in different directions).

That seems simple enough. It would be good to have a standardized attribute in openPMD-standard for this. Then I'd be happy to help add support to the API.

The main use case for me (other than debugging) is that some analysis tools seem to have an implicit assumption that ghosts are present, either due to historical reasons (yt) or because they take advantage of domain decomposition to avoid communication when doing some filter operations (VisIt).

BenWibking commented 10 months ago

@ax3l: should either @eschnett or I open an issue in openPMD-standard to propose a 3-vector attribute for this?

(I don't see anything currently here about ghost zones: https://github.com/openPMD/openPMD-standard/projects/3#card-90757340)