openPMD / openPMD-api

C++ & Python API for Scientific I/O
https://openpmd-api.readthedocs.io
GNU Lesser General Public License v3.0

Pickle w/o Statics #1458

Closed: ax3l closed this issue 2 months ago

ax3l commented 1 year ago

For the first implementation of multi-process (multi-node) Dask support, we pickle objects like Record and RecordComponent, plus their Series.

The series is unpickled into a function-local static variable, to avoid:

This works as a hack until you need to work with two series at a time.

https://github.com/openPMD/openPMD-api/blob/0.15.1/include/openPMD/binding/python/Pickle.hpp#L73-L77

pordyna commented 1 year ago

@ax3l I tried using dask.delayed with openpmd-api to parallelize over iterations. I couldn't get it to work, I think because Dask wasn't able to pickle the Series object. Would it make sense to add add_pickle to Series and Iteration?

pordyna commented 3 months ago

@ax3l @franzpoeschel This also doesn't seem to work when using only one series at a time, as soon as more than one series is opened over the lifetime of a kernel instance. I don't understand why, but I'm trying to use Dask to iterate over multiple simulations, and the data keeps being loaded from the first series! Deleting the series in between or restarting the Dask workers doesn't help. The only thing that worked for me in Jupyter was restarting the kernel between running the code for different series.

Are you aware of any workaround?

franzpoeschel commented 3 months ago

That's precisely what Axel means above by "This works as a hack until you need to work with two series at a time."

Unpickling, e.g., a single RecordComponent does not work trivially with our memory model in openPMD: a RecordComponent becomes invalid once its Series is deleted, but the Pickle API gives us no way to store the Series anywhere. Ideally, we should change our C++ API to a model where any handle keeps the entire thing alive, which would also solve this issue. This should be possible, but would be a slightly larger change (I actually have a PR open with an internal remodeling that might help here). For now, this is what we do:

            // Create a new openPMD Series and keep it alive.
            // This is a big hack for now, but it works for our use
            // case, which is spinning up remote serial read series
            // for DASK.
            static auto series = openPMD::Series(filename, Access::READ_ONLY);

... which leads exactly to the behavior that you see.
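The effect of that function-local static can be reproduced without openPMD at all. What follows is a minimal Python sketch, not the library's code: `FakeSeries`, `open_series_static` and `open_series_cached` are hypothetical names made up for illustration, and the dictionary keyed by filename is only one conceivable way around the problem, not something openPMD-api is confirmed to do.

```python
class FakeSeries:
    """Hypothetical stand-in for an opened series; it only records its filename."""

    def __init__(self, filename):
        self.filename = filename


_static_series = None  # plays the role of the C++ function-local `static`


def open_series_static(filename):
    """Mimic the hack: the first call opens a series, later calls reuse it."""
    global _static_series
    if _static_series is None:
        _static_series = FakeSeries(filename)
    return _static_series  # `filename` is ignored after the first call


_series_by_file = {}  # assumption: one kept-alive series per distinct filename


def open_series_cached(filename):
    """Keep one series alive per filename instead of one per process."""
    if filename not in _series_by_file:
        _series_by_file[filename] = FakeSeries(filename)
    return _series_by_file[filename]


first = open_series_static("sim_a.h5")
second = open_series_static("sim_b.h5")
print(second.filename)  # sim_a.h5: data keeps coming from the first series

print(open_series_cached("sim_a.h5").filename)  # sim_a.h5
print(open_series_cached("sim_b.h5").filename)  # sim_b.h5
```

In the static variant the initializer runs once per process, which would explain why deleting the series object on the Python side makes no difference. The keyed variant never closes anything either, so it trades the wrong-data problem for unbounded growth if many files are opened.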

I do have an idea though how we could fix this short-term, lemme see
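The longer-term model mentioned above, in which any handle keeps the whole series alive, can be illustrated with a small Python analogy. This is a sketch under stated assumptions rather than a description of the C++ API: `FakeSeries`, `WeakComponent` and `OwningComponent` are hypothetical classes, and a weak reference is used only to imitate a handle that does not own its series (in C++ such a handle would dangle rather than report itself as invalid).

```python
import gc
import weakref


class FakeSeries:
    """Hypothetical stand-in for a series that owns the underlying file."""

    def __init__(self, filename):
        self.filename = filename


class WeakComponent:
    """Handle that does not own its series (analogy for the current model)."""

    def __init__(self, series):
        self._series = weakref.ref(series)

    def is_valid(self):
        return self._series() is not None


class OwningComponent:
    """Handle that holds a strong reference and so keeps its series alive."""

    def __init__(self, series):
        self._series = series

    def is_valid(self):
        return self._series is not None


def make_weak(filename):
    series = FakeSeries(filename)
    return WeakComponent(series)  # the only strong reference is dropped on return


def make_owning(filename):
    series = FakeSeries(filename)
    return OwningComponent(series)  # the handle itself carries the series along


weak = make_weak("sim_a.h5")
owning = make_owning("sim_a.h5")
gc.collect()  # not needed on CPython, added for other interpreters

print(weak.is_valid())    # False
print(owning.is_valid())  # True
```

With owning handles, an unpickling function could simply return the component and let it carry its series along, so no static storage would be needed to keep the series alive.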