E3SM-Project / scorpio

A high-level Parallel I/O Library for structured grid applications

Add direct reading capability for ADIOS IO type #544

Closed dqwu closed 8 months ago

dqwu commented 10 months ago

This enhancement, authored primarily by @dmitry-ganyushin (Dmitry Ganyushin, ADIOS developer), enables E3SM runs to read restart files in ADIOS BP5 format directly, eliminating the need to convert BP files to NC files before restart runs.

On top of Dmitry's initial patch, we have refined the code, improved error handling, and fixed minor bugs.

dqwu commented 10 months ago

[Two Known Limitations of ADIOS Read Support]

1) To read a distributed array, the decomposition map must exactly match the map used to write that array. This also requires the restart run to use the same PE layout as the initial run.

In E3SM ne4 F case restart runs, reading the restart file mpassi.rst.0001-01-02_00000.nc in ADIOS BP format fails with errors, because some distributed arrays are written with one decomposition map and read back with a different one.

A workaround is to use the NetCDF or PnetCDF type to write and read this kind of restart file.

2) Open-to-append mode is currently not supported for the ADIOS type. If an E3SM restart run is configured to open an output history file in ADIOS BP format and append additional time steps to it, SCORPIO reports an error.

Workaround 1: Adjust the configuration of restart runs to avoid open-to-append mode for specific history files.

Workaround 2: Use the NetCDF or PnetCDF type for files that will be appended to.
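The two limitations above can be illustrated with a minimal C++ sketch. Everything in it is an assumption made for illustration: `DecompMap`, `block_map`, `maps_match`, `IoType`, and `select_io_type` are hypothetical names, not SCORPIO's actual types, constants, or functions, and the contiguous block decomposition is only one possible layout. The first part shows why changing the PE layout changes the decomposition map; the second part expresses Workaround 2 as a fallback rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representation: the 1-based global offsets owned by one rank,
// in local storage order.
using DecompMap = std::vector<std::int64_t>;

// Hypothetical stand-in for an I/O type selector (not SCORPIO's constants).
enum class IoType { NetCDF, PnetCDF, ADIOS };

// Contiguous block decomposition of `global_size` elements over `nranks` ranks.
DecompMap block_map(std::int64_t global_size, int nranks, int rank) {
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    const std::int64_t base = global_size / nranks;
    const std::int64_t extra = global_size % nranks;
    // Ranks below `extra` own one additional element.
    const std::int64_t count = base + (rank < extra ? 1 : 0);
    const std::int64_t start = rank * base + (rank < extra ? rank : extra);
    DecompMap map;
    map.reserve(static_cast<std::size_t>(count));
    for (std::int64_t i = 0; i < count; ++i) {
        map.push_back(start + i + 1);  // 1-based global offset
    }
    return map;
}

// Limitation 1: the read-side map must equal the write-side map
// (same length, same offsets, same order).
bool maps_match(const DecompMap& write_map, const DecompMap& read_map) {
    return write_map == read_map;
}

// Limitation 2 / Workaround 2: fall back to a NetCDF-family type when a file
// requested as ADIOS would be opened in append mode.
IoType select_io_type(IoType requested, bool open_to_append,
                      IoType fallback = IoType::PnetCDF) {
    assert(fallback != IoType::ADIOS);
    if (requested == IoType::ADIOS && open_to_append) {
        return fallback;
    }
    return requested;
}
```

With 10 elements, rank 0 owns offsets 1 to 3 under a 4-rank layout but 1 to 5 under a 2-rank layout, so `maps_match` returns false for that pair, which is the situation limitation 1 rules out.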

dqwu commented 10 months ago

[Known Issues on E3SM/CIME Side]

1) In E3SM restart runs, dummy XXX.nc files must be created even when the XXX.nc.bp files are already available. This workaround is needed to pass certain file existence checks on the E3SM/CIME side. Ideally, E3SM/CIME would recognize these files in ADIOS BP format.

2) Some MPAS files are written and read with IO types specified in stream files rather than in env_run.xml. We are still encountering problems when setting ADIOS IO types in these stream files, so the default PnetCDF type is always used for them.
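The dummy-file workaround in issue 1 could be scripted along the following lines. This is a sketch under stated assumptions, not part of SCORPIO or CIME: the function name `create_nc_placeholders` is hypothetical, it assumes that an empty file is enough to satisfy the existence checks, and it requires C++17 for `std::filesystem`.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Creates an empty XXX.nc next to every XXX.nc.bp entry in `run_dir`.
// Returns the number of placeholder files created.
std::size_t create_nc_placeholders(const fs::path& run_dir) {
    assert(fs::is_directory(run_dir));
    const std::string suffix = ".nc.bp";

    // Collect the targets first so the directory is not modified while iterating.
    std::vector<fs::path> placeholders;
    for (const auto& entry : fs::directory_iterator(run_dir)) {
        const std::string name = entry.path().filename().string();
        const bool has_suffix =
            name.size() > suffix.size() &&
            name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
        if (has_suffix) {
            // Drop the trailing ".bp" (3 characters) to get XXX.nc.
            placeholders.push_back(run_dir / name.substr(0, name.size() - 3));
        }
    }

    std::size_t created = 0;
    for (const auto& path : placeholders) {
        if (fs::exists(path)) {
            continue;  // never overwrite an existing file
        }
        std::ofstream touch(path);  // opening for write creates an empty file
        if (touch) {
            ++created;
        }
    }
    return created;
}
```

Calling the function a second time on the same directory creates nothing, because existing XXX.nc files are left untouched.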

dqwu commented 9 months ago

Do we need qhashtbl* files? What is the copyright for this code (does it work with SCORPIO copyright)? Can we replace it with C++ maps?

Feedback from Dmitry:

The website of this software, qdecoder.org, states that it is BSD-like software. To my understanding, we can use it freely. It is just a hash map that stands in for the caching feature missing in ADIOS. Once caching is implemented in a new version of ADIOS (there is some activity in this direction), we can remove those files.

We can also switch to unordered_map from the C++ STL, as you suggested.
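As a rough idea of what the unordered_map option might look like, here is a minimal sketch of a string-keyed cache. The class name `VarCache`, its methods, and the choice of `std::vector<double>` as the cached value are all assumptions made for illustration; the thread does not say what the qhashtbl-based cache actually stores.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical cache of per-variable data keyed by variable name.
class VarCache {
public:
    // Inserts the entry for `name`, replacing any previous value.
    void put(const std::string& name, std::vector<double> data) {
        assert(!name.empty());  // keys are assumed to be non-empty names
        table_[name] = std::move(data);
    }

    // Returns a pointer to the cached data, or nullptr if `name` is absent.
    const std::vector<double>* get(const std::string& name) const {
        const auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

    // Removes the entry for `name`; returns true if something was removed.
    bool remove(const std::string& name) { return table_.erase(name) > 0; }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<std::string, std::vector<double>> table_;
};
```

Because the map owns its values, entries are released automatically when they are removed or when the cache goes out of scope, so no manual free calls are needed.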