gmegan / specification

OpenSHMEM Application Programming Interface
http://www.openshmem.org

PE and ADDR accessible #36

Open naveen-rn opened 6 years ago

naveen-rn commented 6 years ago

@jdinan asked a question related to Teams during today's threads WG.

Do we have any changes to the shmem_pe_accessible() or shmem_addr_accessible() routines? Do we need separate team-specific variants of these routines?

anshumang commented 6 years ago

It is possible for a given PE to be accessible for RMA but not for AMOs from some other PE. Would these APIs (or a variant) help there?

nspark commented 6 years ago

> Do we have any changes to the shmem_pe_accessible() or shmem_addr_accessible routines? Do we need a separate team specific variants of these routines?

I don't think we need this yet. Presuming we nail down teams + contexts, a follow-on feature for 1.6 will be team-based heaps (e.g., symmetric, persistent). Then, we'll need a team-based shmem_addr_accessible. I don't personally use MPMD SHMEM programs, so I don't have a use for shmem_pe_accessible.

> It could be possible that a given PE is accessible for RMA but not for AMO from a PE.

Can you explain more, @anshumang?

anshumang commented 6 years ago

@nspark If there were a way for any PE to query which operations targeting another PE are supported, it could be useful. Is there already a way to do that?

jdinan commented 6 years ago

As defined, shmem_pe_accessible sounds like a team membership query routine, which is probably good to have. I suspect shmem_pe_accessible may have been intended to answer a question a bit more like "Is MPI rank X accessible via SHMEM and does it have SHMEM PE id X?"

nspark commented 6 years ago

As Jim said, I think shmem_pe_accessible and shmem_addr_accessible are primarily there for hybrid MPI + SHMEM programs or MPMD SHMEM programs. Neither is a common use-pattern for me, especially not MPMD programs.

@anshumang I think my concern about some PEs being RMA-accessible, but not AMO-accessible is that I don't really know how to write (portable) programs in that model. RMA and AMOs are fundamental to SHMEM programming. I don't think we (or anyone) could really write meaningful SHMEM programs with only one or the other.

anshumang commented 6 years ago

@naveen-rn I agree with your concern that having only RMA support between two PEs, with no AMO support, is a problem for code portability. My assumption was that RMA support is always available even if AMO support is not, so that a version of the program that does not use AMOs would still be supported. As an example, peer-to-peer accessible GPUs connected by PCI-E do not support atomic operations on memory allocated on the peer GPU. I think there are some important programs that would still work and benefit from the ability to do RMA to peer GPUs. This is somewhat related to #231 if an OpenSHMEM implementation can officially support only a subset of APIs.

gmegan commented 6 years ago

For a team membership and communication ability query, one could use translate plus get_config. Translate will return -1 if the queried PE is in the source team but not the dest team. The configuration spec will be changing over time, so users can write query functions to find out what they need based on their config usage.

// Return PE number to use in barrier on team T or -1 if this isn't possible
int team_barrier_access_id(int global_pe, shmem_team_t T) {
   int team_id = shmem_team_translate_pe(SHMEM_TEAM_WORLD, global_pe, T);
   if (team_id < 0) return -1;
   shmem_team_config_t conf;
   shmem_team_get_config(T, &conf);
   if (conf.disable_collectives != 0) return -1;
   return team_id;
}

// Return PE number to use in rma access using context ctx or -1 if this isn't possible
int team_rma_access_id(int global_pe, shmem_ctx_t ctx) {
   shmem_team_t T;
   shmem_ctx_get_team(ctx, &T);
   return shmem_team_translate_pe(SHMEM_TEAM_WORLD, global_pe, T);
}
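
To make the -1 convention concrete, here is a small self-contained model of the translate step. It is only a sketch and does not call OpenSHMEM: `model_team_t`, `model_team_index`, and `model_translate_pe` are hypothetical stand-ins, and it assumes each team is described by a (start, stride, size) triple over world PE numbers.

```c
/* Hypothetical stand-in for a team (not an OpenSHMEM type): the members are
 * world PEs start, start + stride, ..., for a total of `size` PEs. */
typedef struct {
    int start;
    int stride;
    int size;
} model_team_t;

/* Return the index of world_pe within team t, or -1 if it is not a member. */
int model_team_index(model_team_t t, int world_pe) {
    int offset = world_pe - t.start;
    if (t.stride <= 0 || offset < 0 || offset % t.stride != 0) return -1;
    int idx = offset / t.stride;
    return (idx < t.size) ? idx : -1;
}

/* Model of the translate semantics described above: src_pe is a PE number
 * relative to src; the result is the matching PE number in dest, or -1 when
 * src_pe is invalid for src or the PE is not a member of dest. */
int model_translate_pe(model_team_t src, int src_pe, model_team_t dest) {
    if (src_pe < 0 || src_pe >= src.size) return -1;
    int world_pe = src.start + src_pe * src.stride;
    return model_team_index(dest, world_pe);
}
```

With a world team of `{0, 1, 8}` and an even-PE team of `{0, 2, 4}`, world PE 4 translates to team PE 2, while world PE 3 translates to -1 because it is not a member of the even-PE team.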