GlobalArrays / ga

Partitioned Global Address Space (PGAS) library for distributed arrays
http://hpc.pnl.gov/globalarrays/
Other
96 stars 38 forks source link

GA_Sync should call ARMCI_Barrier when possible #325

Open jeffhammond opened 1 month ago

jeffhammond commented 1 month ago

GA_Sync is equivalent to ARMCI_Barrier on the world group. One can optimize ARMCI_Barrier better than ARMCI_AllFence + pnga_msg_sync.

We should also add ARMCI_GroupBarrier that is equivalent to ARMCI_GroupFence + pnga_msg_pgroup_sync.

void pnga_sync()
{
  GA_Internal_Threadsafe_Lock();
#ifdef CHECK_MA
  Integer status;
#endif

#ifdef USE_ARMCI_GROUP_FENCE
  if (GA_Default_Proc_Group == -1) {
    int grp_id = (int)GA_Default_Proc_Group;
    ARMCI_GroupFence(&grp_id);
    pnga_msg_sync();
  } else {
    int grp_id = (Integer)GA_Default_Proc_Group;
    pnga_pgroup_sync(grp_id);
  }
#else
  if (GA_Default_Proc_Group == -1) {
    ARMCI_AllFence(); // HERE
    pnga_msg_sync();
    if(GA_fence_set)bzero(fence_array,(int)GAnproc);
    GA_fence_set=0;
  } else {
    Integer grp_id = (Integer)GA_Default_Proc_Group;
    pnga_pgroup_sync(grp_id);
  }
#endif
#ifdef CHECK_MA
  status = MA_verify_allocator_stuff();
#endif
  GA_Internal_Threadsafe_Unlock();
}