esmf-org / esmf

The Earth System Modeling Framework (ESMF) is a suite of software tools for developing high-performance, multi-component Earth science modeling applications.
https://earthsystemmodeling.org/
Other
156 stars 75 forks source link

Allow setting ESMF_Finalize `endflag` in ESMPy and ESMC/I interfaces #234

Closed program-- closed 6 months ago

program-- commented 6 months ago

This PR adds the capability to set the endflag parameter in ESMP_Finalize to prevent MPI from being finalized at ESMF finalization.

Support issue for this PR is here: esmf-support#427

The code path followed to support this:

ESMP_Finalize
-> ESMC_Finalize
-> ESMCI_Finalize
-> f_esmf_frameworkfinalize_
-> ESMF_Finalize

Additions

TODO

The only todo is to validate that unit tests pass, and that this parameter works as intended.

billsacks commented 6 months ago

The only todo is to validate that unit tests pass, and that this parameter works as intended.

Please let us know when you have validated that this parameter works as intended - especially that it indeed solves your issue. Thanks again!

program-- commented 6 months ago

@billsacks @oehmke I've refactored this design slightly to instead propagate ESMC_End_Flag at the Python/C/C++ layers (as suggested by @PhilMiller), instead of a boolean, so this should preserve the details found at the Fortran layer. Still haven't gotten a chance to fully test this out, but ESMF compiles correctly on my end with these changes, at least.

program-- commented 6 months ago

@oehmke Awesome, glad to hear! I'll finish up those small changes today. I've also tested with a small toy program so far that ensures the C API works as expected (not exactly a unit test, but close?)

C Example Program ```c #include #include #include int state_ = 0; extern int PMPI_Finalize(); // intercept MPI calls to output int MPI_Finalize() { printf("\nMPI_Finalize called\n"); PMPI_Finalize(); state_ = 1; return 0; } int MPI_Abort(int comm, int errorcode) { printf("\nMPI_Abort called\n"); state_ = 10; return 0; } #include #include "ESMC.h" int main(int argc, const char* argv[]) { if (argc < 2) { printf("Expected one of: keep_mpi, abort, normal\n"); return 0; } MPI_Init(NULL, NULL); ESMC_End_Flag end = ESMC_END_NORMAL; if (strcmp(argv[1], "keep_mpi") == 0) { end = ESMC_END_KEEPMPI; } else if (strcmp(argv[1], "abort") == 0) { end = ESMC_END_ABORT; } // doesn't actually check for normal int rc; ESMC_Initialize(&rc, ESMC_ArgLast); ESMC_FinalizeWithFlag(end); switch(end) { case ESMC_END_KEEPMPI: printf("Called with ESMC_END_KEEPMPI\n"); assert(state_ == 0); MPI_Finalize(); break; case ESMC_END_NORMAL: printf("Called with ESMC_END_NORMAL\n"); assert(state_ == 1); break; case ESMC_END_ABORT: printf("Called with ESMC_END_ABORT\n"); assert(state_ == 10); break; } return 0; } ```

In this case, I built with clang, gfortran, and (intel) mpich.

Compiling this to test_esmf_endflag provides the following outputs:

# normal
./test_esmf_endflag normal
#> MPI_Finalize called
#> Called with ESMC_END_NORMAL

# keep_mpi
./test_esmf_endflag keep_mpi
#> Called with ESMC_END_KEEPMPI
#> MPI_Finalized called

# abort
./test_esmf_endflag abort
#> MPI_Abort called
#> Called with ESMC_END_ABORT

I will verify with Python as well, but I expect I'll see the same results 🙂 Thanks again for the review and suggestions!

EDIT: still trying to get Python to work, having some issues on my end that I think are related to mpi4py or how I'm compiling... going to set this PR as ready for review in the meantime

PhilMiller commented 6 months ago

Thank you both for working with us on this. It will help us a lot in integrating ESMF functionality in the NWM NextGen system.

Can I ask when you think this will make it into a release, once it's merged?

oehmke commented 6 months ago

Thanks to both of you for your careful work on this!

We are planning on having our next ESMF release (8.7) in early summer and this should be included in that. However, once this is merged and has gone through our nightly tests, then I can make you a beta snapshot tag to use, if that would be helpful.

On Mar 25, 2024, at 12:49 AM, Phil Miller - NOAA @.***> wrote:

Thank you both for working with us on this. It will help us a lot in integrating ESMF functionality in the NWM NextGen system.

Can I ask when you think this will make it into a release, once it's merged?

— Reply to this email directly, view it on GitHub https://github.com/esmf-org/esmf/pull/234#issuecomment-2017334890, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE6A7U4ZOXFOJ745RND7ML3YZ7CIHAVCNFSM6AAAAABFAGWNXOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJXGMZTIOBZGA. You are receiving this because you were mentioned.

PhilMiller commented 6 months ago

Ok, that sounds good, thank you. I think the beta tag would potentially be useful.

jduckerOWP commented 3 months ago

@oehmke can you confirm to the community here the earliest version of ESMF that this PR has been merged to? Thank you all very much for your efforts on this front, it's been a great help to the NOAA-OWP community.

oehmke commented 3 months ago

It’s great to hear that it’s been helpful! The earliest tag that contains that PR is v8.7.0b06. It’ll also be out in ESMF 8.7 which is expected late summer.

On Jun 12, 2024, at 1:48 PM, Jason Ducker @.***> wrote:

@oehmke https://github.com/oehmke can you confirm to the community here the earliest version of ESMF that this PR has been merged to? Thank you all very much for your efforts on this front, it's been a great help to the NOAA-OWP community.

— Reply to this email directly, view it on GitHub https://github.com/esmf-org/esmf/pull/234#issuecomment-2163783422, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE6A7UYPOT3PV6GEGSE7JVLZHCQ2FAVCNFSM6AAAAABFAGWNXOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRTG44DGNBSGI. You are receiving this because you were mentioned.