Open naveen-rn opened 6 years ago
In an implementation's internal team hierarchy, there is a fundamental difference between a duplicated team and a team produced by a strided split whose start, stride, and size match the parent team and whose config is the parent team's config.
The current situation without dup is:
#define my_shmem_team_dup(parent, child) \
shmem_team_split_strided(parent, 0, 1, shmem_team_n_pes(parent), NULL, 0, child)
shmem_team_t team[3];
shmem_team_split_strided(SHMEM_TEAM_WORLD, 0, 2, shmem_n_pes() / 2, NULL, 0, team);
my_shmem_team_dup(team[0], team + 1);
my_shmem_team_dup(team[0], team + 2);
Resulting hierarchy is:
SHMEM_TEAM_WORLD
|-- team[0]
    |-- team[1]
    |-- team[2]
Adding a dup function gives us:
shmem_team_t team[3];
shmem_team_split_strided(SHMEM_TEAM_WORLD, 0, 2, shmem_n_pes() / 2, NULL, 0, team);
shmem_team_dup(team[0], team + 1);
shmem_team_dup(team[0], team + 2);
Resulting hierarchy is:
SHMEM_TEAM_WORLD
|-- team[0]
|-- team[1]
|-- team[2]
The second, flatter hierarchy might make it easier for implementations to manage resources.
This is not a suggestion for the initial Teams proposal. We can add it later as well.
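To make the difference between the two hierarchies concrete, here is a minimal toy model in plain C. It does not call OpenSHMEM at all; the `toy_team` struct and the `toy_split_strided`, `toy_dup`, and `toy_depth` helpers are hypothetical names invented for illustration, and the model assumes an implementation tracks each team's parent and its PE triplet relative to the world team. It is a sketch of the bookkeeping idea, not how any real implementation stores teams.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping node; not part of any OpenSHMEM API. */
typedef struct toy_team {
    struct toy_team *parent;   /* NULL for the world team */
    int start, stride, size;   /* PE triplet relative to the world team */
} toy_team;

/* Split: the child is attached below the team it was split from. */
static void toy_split_strided(toy_team *parent, int start, int stride,
                              int size, toy_team *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->start  = parent->start + start * parent->stride;
    child->stride = parent->stride * stride;
    child->size   = size;
}

/* Dup: same PE set as the original, attached as a sibling of the
 * original (it inherits the original's parent). */
static void toy_dup(const toy_team *orig, toy_team *copy)
{
    assert(orig != NULL && copy != NULL);
    *copy = *orig;
}

/* Number of ancestors between a team and the world team. */
static int toy_depth(const toy_team *t)
{
    int depth = 0;
    while (t->parent != NULL) {
        t = t->parent;
        depth++;
    }
    return depth;
}
```

In this model, a team created by a full-range split from `team[0]` and a team created by a dup of `team[0]` cover the same PEs, but the split result sits one level deeper in the tree, which is the distinction the two hierarchies above illustrate.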
To me, having something like
shmem_team_duplicate(parent_team, new_team, config, ...)
seems useful for implementations as well as users. If I understand correctly, we expected users to create multiple teams over the same PE subset so that each thread in a multithreaded program can use its own unique team handle. With a duplicate-style interface, implementations could, where possible, share resources and perform some internal optimizations.
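As a rough illustration of the kind of sharing a duplicate-style call could allow, the sketch below reference-counts the immutable PE-set description while keeping per-handle state private. Everything here is hypothetical: `toy_pe_set`, `toy_handle`, and the `toy_handle_*` functions are invented names, and which resources a real implementation could actually share is an assumption, not something the proposal specifies.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: immutable description of a PE set that duplicates share. */
typedef struct {
    int refcount;
    int start, stride, size;
} toy_pe_set;

/* Hypothetical team handle: shared PE set plus private per-handle state. */
typedef struct {
    toy_pe_set *pes;   /* shared among all duplicates */
    int op_count;      /* private, e.g. a per-handle collective counter */
} toy_handle;

/* Create a fresh handle that owns a new PE-set description. */
static int toy_handle_create(toy_handle *h, int start, int stride, int size)
{
    toy_pe_set *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->refcount = 1;
    p->start = start;
    p->stride = stride;
    p->size = size;
    h->pes = p;
    h->op_count = 0;
    return 0;
}

/* Duplicate: share the PE set, but start with fresh private state.
 * Not thread-safe as written; a real implementation would need an
 * atomic increment or a lock around the reference count. */
static void toy_handle_dup(const toy_handle *orig, toy_handle *copy)
{
    assert(orig != NULL && orig->pes != NULL && copy != NULL);
    orig->pes->refcount++;
    copy->pes = orig->pes;
    copy->op_count = 0;
}

/* Destroy: release the shared PE set when the last handle goes away. */
static void toy_handle_destroy(toy_handle *h)
{
    assert(h != NULL && h->pes != NULL);
    if (--h->pes->refcount == 0)
        free(h->pes);
    h->pes = NULL;
}
```

With repeated full-range splits, an implementation would have to detect on its own that the new team matches an existing one before it could share anything; an explicit duplicate call states that intent up front.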