openshmem-org / specification

OpenSHMEM Application Programming Interface
http://www.openshmem.org
49 stars 32 forks source link

Add API for querying if OpenSHMEM is currently initialized #457

Closed agrippa closed 1 month ago

agrippa commented 3 years ago

Unless I'm missing it, there is currently no API for querying if an OpenSHMEM implementation has been initialized yet (i.e. whether we are in the part of a program between shmem_init and shmem_finalize). Theoretically, this could be something like:

bool shmem_is_initialized();

In general this seems like a useful API, and I've just run in to a concrete case where I "need" it. I'm writing a library in C++ that uses OpenSHMEM. Stack-allocated objects can collectively allocate and deallocate symmetric objects. If a stack-allocated object is created in main(), C++ will not call its destructor until we exit main() which is likely after the programmer has already called shmem_finalize() and cleaned up the symmetric heap. My library calling shmem_free in the destructor of stack-allocated objects after shmem_finalize then causes the app to crash. I can workaround this in a couple of ways: (1) asking users to artificially introduce a new scope inside of main using {} so that C++ objects go out of scope before shmem_finalize, or (2) adding initialize/finalize routines to my library that in turn call shmem_init/finalize and then track that state myself. I'm currently doing (1) and I'd rather avoid (2) as the user may want to use SHMEM before/after my library is in use.

Question to implementors: is this something difficult to add? Is there any reason why this API doesn't already exist? Is there an equivalent in the spec already that I'm missing?

nspark commented 3 years ago

This seems reasonable and straightforward, and not incompatible with #263 (multiple init/finalize). It seems like having the shmem_is_finalized would be worth adding for completeness. (IIRC, @jeffhammond suggested this a long time ago.)

naveen-rn commented 3 years ago

I suppose, these routines are usable outside the shmem-parallel region (i.e before shmem_init and after shmem_finalize).

nspark commented 3 years ago

Yes, and I think they should be thread safe, independent of whether SHMEM is initialized at any particular thread safety level.

naveen-rn commented 3 years ago

In general, as an independent programming model usage - I don't see any issues with this routine.

But, we might have issues in hybrid programming model usage - for example CAF or UPC using SHMEM as communication layer. These programming models, does an shmem_init() internally during the program startup and any further user created explicit shmem_init() call will be essentially a no-op. But, calling shmem_is_initialized() in this case would result in TRUE even before an explicit shmem_init() call.

nspark commented 3 years ago

Subsequent calls to shmem_init result in undefined behavior, so the case of hybrid SHMEM + CAF/UPC-over-SHMEM could be problematic without such an interface.

Yes, it could be confusing, but I'd argue such a hybrid program would be problematic because an explicit call to shmem_finalize would invalidate UPC operations after finalize but before the program exits.

jeffhammond commented 3 years ago

MPI did a bunch of work on this exact topic a while back. @jdinan is well-qualified to summarize.

jdinan commented 3 years ago

There is no harm in replicating the MPI query functions; I think any design effort in this space will arrive at the same API. That said, there are still issues with only having is_initialized and is_finalized because the SHMEM library doesn't know that it has multiple users of the API. For example, the first user to initialize the library can finalize it while another library is still using it. Reference counted init/finalize deals with this case and I'm not sure we would need is_initialized and is_finalized if we add reference counted init/finalize.

There is another case that reference counted init/finalize doesn't handle, which is two pieces of code that use SHMEM one after the other, resulting in init...finalize, , init...finalize. Historically, MPI does not support reinitialization after the library is finalized (a motivation to have query functions in addition to ref counting if we go this route). We could require SHMEM libraries to support reinitialization, but this is nontrivial to implement/test.

jeffhammond commented 3 years ago

We could require SHMEM libraries to support reinitialization, but this is nontrivial to implement/test.

Is it not the case that the reentrant initialization starts with a fetch-and-increment and the initialization is only performance if the fetched value is zero? The reentrant finalization does the same, for a fetched value of one.

I can't remember what I've written that does this but pthread_once may also be useful here.

I acknowledge that rigorously verifying correctness is, like essentially everything in software, very difficult, but I don't think we need to formally verify OpenSHMEM implementations the way we do computers associated with locomotion and robotics.

jdinan commented 3 years ago

The easy thing to do is to not finalize the OpenSHMEM library in the call to finalize and clean up only atexit. One of the issues with reinitialization is statically initialized variables in the library. These are often not reset in the call to finalize, which can lead to problems.

jdinan commented 1 month ago

Resolved by #512