Sandia-OpenSHMEM / SOS

Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple networking APIs, including Portals 4, the OpenFabrics Interfaces (OFI), and UCX. Please click on the Wiki tab for help with building and using SOS.

Teams: add SHMEMX_TEAM_NODE #1102

Closed (davidozog closed this 11 months ago)

davidozog commented 11 months ago

This should be handy for supporting multi-NIC and/or topology-aware collectives.

@wrrobin - perhaps this could be named SHMEMX_TEAM_NODE because it's relying on PMI's notion of a "node" (via shmem_runtime_get_node_rank()). Here is a short summary of where this info comes from across the SOS "runtime-pmi" options:

But SHMEMX_TEAM_HOST is fine with me if that's preferred/simpler.
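
To make the idea of a node-level team concrete, here is a minimal, self-contained C sketch of grouping PEs by the node they run on. It does not call SOS or PMI: `node_of_pe` and `node_team_members` are hypothetical names introduced only for illustration, standing in for whatever per-PE node information the runtime actually reports.

```c
#include <assert.h>

/* ASSUMPTION: node_of_pe[i] is an integer label for the node hosting PE i.
 * This table is a stand-in for launcher-provided data, not an SOS structure. */

/* Collects the world ranks of every PE that shares a node with `my_pe`.
 * The ranks are written to `members` in ascending order, the position of
 * `my_pe` within that list is stored in `*my_index`, and the number of
 * members is returned. `members` must have room for `npes` entries. */
static int node_team_members(const int *node_of_pe, int npes, int my_pe,
                             int *members, int *my_index)
{
    assert(my_pe >= 0 && my_pe < npes);

    int count = 0;
    for (int pe = 0; pe < npes; pe++) {
        if (node_of_pe[pe] == node_of_pe[my_pe]) {
            if (pe == my_pe)
                *my_index = count;   /* my rank inside the node-level group */
            members[count++] = pe;
        }
    }
    return count;
}
```

With a placement such as `{0, 0, 1, 1, 0, 1}` for six PEs, PE 4 would land in a three-member group alongside PEs 0 and 1. A topology-aware collective could then run one phase inside each such group and another across groups.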

wrrobin commented 11 months ago

Thanks @davidozog. SHMEMX_TEAM_NODE is better.

davidozog commented 11 months ago

> LGTM! Should things like SHMEM_TEAM_HOST_INDEX and shmem_internal_team_host be SHMEMX_TEAM_HOST_INDEX/shmemx_internal_team_host though? From a readability perspective I definitely prefer things as they are, just not sure about the technicalities of using shmem vs shmemx. Are we good to use shmem internally as the variable name prefix for shmemx-related objects, so long as what is exposed to the user abides by the shmem/shmemx rules?

Thanks @philipmarshall21 - only the user-facing API needs the "X" to indicate it's an extension. The runtime can name internal symbols however it likes, and as you say, it kinda makes sense to omit the "X/x" from symbols that aren't visible to users anyway.
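
As a rough illustration of that naming split, the sketch below pairs an internal object (no "x") with a user-facing handle (with the "X"). The struct layout, the initial values, and the `demo_team_size` helper are assumptions made up for this example, not SOS's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Internal to the runtime: users never see this type or object, so both
 * keep the plain "shmem_internal_" prefix. Layout is HYPOTHETICAL. */
typedef struct {
    int start;   /* first world PE in the team */
    int stride;  /* distance between consecutive member PEs */
    int size;    /* number of PEs in the team */
} shmem_internal_team_t;

static shmem_internal_team_t shmem_internal_team_node = { 0, 1, 1 };

/* User-facing extension: the name that appears in a public header carries
 * the "X" and simply refers to the internal object. */
#define SHMEMX_TEAM_NODE (&shmem_internal_team_node)

/* Hypothetical helper, only here to show the public name being used. */
static int demo_team_size(const shmem_internal_team_t *team)
{
    assert(team != NULL);
    return team->size;
}
```

Calling `demo_team_size(SHMEMX_TEAM_NODE)` reads the internal object through the public name, so only the exposed identifier has to follow the shmem/shmemx rule.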

davidozog commented 11 months ago

> Thanks @davidozog. SHMEMX_TEAM_NODE is better.

I think so too - it's renamed now.