Closed naveen-rn closed 5 years ago
@manjugv ?
Are we looking to create this team inside a thread parallel region?
I think it should be allowed to create a team from the non-main thread.
Is this feature currently supported in the draft write-up?
I don't think there's anything in the current draft that precludes this.
Here's a detailed version of the example I raised on today's call discussing this issue: Each PE creates two threads. On all PEs congruent to 0 mod 2, Thread 0 creates a team. Similarly, on all PEs congruent to 0 mod 3, Thread 1 creates a team. Each PE then has 0, 1, or 2 teams.
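To make the counting concrete, here is a minimal sketch in plain C with no OpenSHMEM calls. The helper names `teams_on_pe` and `print_team_counts` are hypothetical and exist only for this illustration; they just apply the mod 2 / mod 3 rule from the example to a PE number.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of any OpenSHMEM API): number of teams
 * that PE `pe` ends up with in the example above. */
static int teams_on_pe(int pe)
{
    assert(pe >= 0);
    int count = 0;
    if (pe % 2 == 0)
        count++;    /* Thread 0 creates the mod-2 team on this PE */
    if (pe % 3 == 0)
        count++;    /* Thread 1 creates the mod-3 team on this PE */
    return count;
}

/* Hypothetical helper: print the team count for PEs 0 .. npes-1. */
static void print_team_counts(int npes)
{
    for (int pe = 0; pe < npes; pe++)
        printf("PE %d: %d team(s)\n", pe, teams_on_pe(pe));
}
```

PEs that are multiples of 6 are the only ones where both threads create a team, which is where the ordering question comes up.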
One example of multithreaded team creation that I think should be erroneous:
#pragma omp parallel num_threads(2)
{
    /* One handle per team; each thread fills in at most one of them. */
    shmem_team_t team_mod2 = SHMEM_TEAM_NULL;
    shmem_team_t team_mod3 = SHMEM_TEAM_NULL;
    switch (omp_get_thread_num()) {
    case 0:
        if (0 == (shmem_my_pe() % 2))
            shmem_team_create_strided(SHMEM_TEAM_WORLD, 0, 2, 2, /* size */, &team_mod2);
        break;
    case 1:
        if (0 == (shmem_my_pe() % 3))
            shmem_team_create_strided(SHMEM_TEAM_WORLD, 0, 3, 3, /* size */, &team_mod3);
        break;
    }
    /* ...use the teams... */
}
The reason this is erroneous is that for PEs in which 0 == (shmem_my_pe() % 6), there is no ordering between Threads 0 and 1 as to which one calls shmem_team_create_strided first. Thus, the internal pSync-like structure may not be allocated symmetrically across the whole of team_mod2 and team_mod3.
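A toy model of the race, in plain C, may help show what "not allocated symmetrically" could look like. This rests on an assumption that is not in the draft: that each PE hands out slots of an internal pSync-like pool strictly in call order. The names `pool_t`, `pool_alloc`, `slots_t`, and `create_both` are hypothetical, and a real implementation may allocate quite differently.

```c
#include <assert.h>

/* Toy per-PE pool (assumption): slots are handed out in call order. */
typedef struct { int next_slot; } pool_t;

static int pool_alloc(pool_t *p) { return p->next_slot++; }

/* Slots obtained by the two team creations on one PE. */
typedef struct { int mod2_slot; int mod3_slot; } slots_t;

/* Models a PE with 0 == pe % 6, where both threads create a team.
 * thread1_first selects which thread wins the unordered race. */
static slots_t create_both(int thread1_first)
{
    assert(thread1_first == 0 || thread1_first == 1);
    pool_t pool = { 0 };
    slots_t s;
    if (thread1_first) {
        s.mod3_slot = pool_alloc(&pool);  /* Thread 1 gets there first */
        s.mod2_slot = pool_alloc(&pool);
    } else {
        s.mod2_slot = pool_alloc(&pool);  /* Thread 0 gets there first */
        s.mod3_slot = pool_alloc(&pool);
    }
    return s;
}
```

Under this model, if PE 0 runs Thread 0 first and PE 6 runs Thread 1 first, the two PEs hold team_mod2's structure in different slots even though both are members of the same team. The model only covers PEs that belong to both teams.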
One example of a correct implementation would be:
#pragma omp parallel num_threads(2)
{
    /* One handle per team; each thread fills in at most one of them. */
    shmem_team_t team_mod2 = SHMEM_TEAM_NULL;
    shmem_team_t team_mod3 = SHMEM_TEAM_NULL;
    if ((omp_get_thread_num() == 0) && (0 == (shmem_my_pe() % 2)))
        shmem_team_create_strided(SHMEM_TEAM_WORLD, 0, 2, 2, /* size */, &team_mod2);
    /* The barrier orders Thread 0's creation before Thread 1's. */
    #pragma omp barrier
    if ((omp_get_thread_num() == 1) && (0 == (shmem_my_pe() % 3)))
        shmem_team_create_strided(SHMEM_TEAM_WORLD, 0, 3, 3, /* size */, &team_mod3);
    /* ...use the teams... */
}
In this case, all the threads on all PEs such that 0 != (shmem_my_pe() % 6) will call shmem_team_create_strided at most once, and on the PEs such that 0 == (shmem_my_pe() % 6), Thread 0 will always create team_mod2 before Thread 1 creates team_mod3.
Added PR #41 for this issue, which also resolves Issue #33. See Issue #33 for the PDF attachment.
Merged PR, closing this issue.
Let's assume we create a team that will be used for collectives.
Are we looking to create this team inside a thread parallel region? Is this feature currently supported in the draft write-up?