Team creation, context creation, and NO COLLECTIVES teams

gmegan commented 5 years ago

The section at the end of the teams intro about synchronization between team creation events is confusing and convoluted.

The requirements to avoid synchronization between team create events are too esoteric. Right now a team created with NO_COLLECTIVE and 0 contexts does not require sync between team create. But when we add heaps, that will change this requirement. I propose adding a config option NO_RESOURCE that makes a team a numeric mapping only, with any config options for contexts, collectives, heaps, etc. ignored.

The requirement to sync on ancestor team when teams are created from a NO_COLLECTIVE team parent is problematic. It means there is an induced graph of team inheritance that the user has to care about and that the implementation has to enforce. Overall, the creation of new teams from arbitrary sets of PEs needs as much automatic work from the library as is feasible, otherwise removing pSync and pWrk will just punt the problem into these convoluted synchronization requirements.

The issue is when this sequence of operations is executed:

split team A from team world, where team A = PEs {0-6}, and does not support collectives
split team Z from team world, where team Z = PEs {0-4}, and does not support collectives
split team B from team A, where team B = PEs {0, 2, 4}, and supports collectives
split team C from team A, where team C = PEs {0, 3, 6}, and supports collectives
split team X from team Z, where team X = PEs {2, 3}, and supports collectives

So teams X, B and C all need to do some operations to set up symmetric resources for collectives. This requires using some existing symmetric resources to bootstrap the new symmetric resources. Neither team A nor team Z has any symmetric resources to bootstrap team creation, so the only resources available to bootstrap new teams are the global symmetric resources.

PE 0 needs the split to team B to finish before the split to team C starts or the symmetric team create resource on PE 0 could get corrupted by PEs 3 or 6 (team C) while the split to team B is still ongoing.
PE 2 needs the split to team B to finish before the split to team X starts or the symmetric team create resource on PE 2 could get corrupted by PE 3 (team X) while the split to team B is still ongoing.
PE 3 needs the split to team C to finish before the split to team X starts or the symmetric team create resource on PE 3 could get corrupted by PE 2 (team X) while the split to team C is still ongoing.

So, as is, the user is going to end up putting sync_all between every split from a NO_COLLECTIVE team to make sure all global symmetric resources are protected.
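
The conflicts listed above can be checked mechanically. The following is a minimal sketch, not OpenSHMEM code: it assumes teams are modeled as bitmasks of world PE numbers, and the names `pe_mask_t`, `mask_from_pes`, `splits_need_ordering`, and `first_shared_pe` are hypothetical helpers introduced only for illustration. Two collective team creations that bootstrap from the same global symmetric resources need ordering whenever at least one PE is a member of both new teams.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: a team is a bitmask of world PE numbers
 * (bit i set means PE i is a member). Not an OpenSHMEM type. */
typedef uint32_t pe_mask_t;

/* Build a mask from an explicit list of PE numbers. */
static pe_mask_t mask_from_pes(const int *pes, int n)
{
    pe_mask_t m = 0;
    for (int i = 0; i < n; i++) {
        assert(pes[i] >= 0 && pes[i] < 32);
        m |= (pe_mask_t)1 << pes[i];
    }
    return m;
}

/* Two collective team creations that share global bootstrap
 * resources must be ordered if some PE takes part in both. */
static int splits_need_ordering(pe_mask_t a, pe_mask_t b)
{
    return (a & b) != 0;
}

/* Lowest-numbered PE that is a member of both teams, or -1 if disjoint. */
static int first_shared_pe(pe_mask_t a, pe_mask_t b)
{
    pe_mask_t common = a & b;
    for (int pe = 0; pe < 32; pe++) {
        if (common & ((pe_mask_t)1 << pe))
            return pe;
    }
    return -1;
}
```

For the teams in this example, B = {0, 2, 4}, C = {0, 3, 6}, and X = {2, 3} overlap pairwise on PEs 0, 2, and 3 respectively, so every pair of these splits needs ordering.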

naveen-rn commented 5 years ago

Can we say that the root cause of this issue is mixing NO_COLLECTIVE and normal teams in the hierarchy of team creation?

gmegan commented 5 years ago

I think yes, mixing these team types is the root of the problem. But we have cases where we want this, e.g. using multiple 2D splits to make a 3D split.
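
As a rough illustration of the 3D case, here is a sketch of the index arithmetic only, with no OpenSHMEM calls. It assumes a row-major PE layout and that a 2D split with a given xrange places a PE at index `pe % xrange` in its x-axis team and `pe / xrange` in its y-axis team; the type and function names are hypothetical. The y-axis team from the first split serves only as the parent of the second split, which is presumably the kind of intermediate team one would want to create without collectives support.

```c
#include <assert.h>

/* Illustration only: position of a PE after one 2D split. */
typedef struct { int x_idx, y_idx; } split2d_t;

typedef struct { int x, y, z; } coord3_t;

/* Assumed 2D split indexing: index within the x-axis team and
 * index within the y-axis team for a given xrange. */
static split2d_t split_2d_index(int pe_in_parent, int xrange)
{
    assert(pe_in_parent >= 0 && xrange > 0);
    split2d_t s = { pe_in_parent % xrange, pe_in_parent / xrange };
    return s;
}

/* Two chained 2D splits: the first with xrange = nx on the world team,
 * the second with xrange = ny on the y-axis team of the first split. */
static coord3_t coords_from_two_splits(int world_pe, int nx, int ny)
{
    split2d_t first  = split_2d_index(world_pe, nx);
    split2d_t second = split_2d_index(first.y_idx, ny);
    coord3_t c = { first.x_idx, second.x_idx, second.y_idx };
    return c;
}
```

With nx = 2 and ny = 3, world PE 23 maps to (x, y, z) = (1, 2, 3), since 23 = 1 + 2 * (2 + 3 * 3).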

nspark commented 5 years ago

Right now a team created with NO_COLLECTIVE and 0 contexts does not require sync between team create. But when we add heaps, that will change this requirement.

This is maybe a small point, but even when we add heaps to teams, the heap operations—creation, allocation, deallocation, etc.—will be collective across the team. In that way, I think a team with NO_COLLECTIVE would not be suitable for use with team-based heaps.

nspark commented 5 years ago

This is a good breakdown of the problem, @gmegan. I wonder now whether we should consider a "hybrid" approach with the alternative Sets and Groups proposal from ORNL and @manjugv. My main concern with Sets and Groups was the generality of set creation, and the storage/indexing overheads that might induce (along the lines of color/key-split). I did have a secondary concern of API verbosity, but this issue makes it seem like it might be necessary (or preferred to the complexities here).

As a "hybrid", I'd suggest only providing:

shmem_set_split_{strided, 2D}
shmem_team_create(shmem_set_t set, ...);
predefined objects SHMEM_{SET, TEAM}_{WORLD, SHARED}

Admittedly, there will be cases in which it will feel somewhat verbose to, for example, extract the set from a team, split the set, then create a child team, but I wonder whether this complexity is worth resolving this issue.
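
To make the "split the set, then create a team" flow concrete, here is a minimal sketch of what a strided set could look like as a purely local object. Everything in it is hypothetical: `demo_set_t`, `demo_set_split_strided`, and `demo_set_world_pe` are illustration names, not the proposed API, and the (start, stride, size) representation is an assumption. The point is only that intermediate subsets such as A and Z would be plain arithmetic with no symmetric resources, leaving team creation from a set as the only collective step.

```c
#include <assert.h>

/* Hypothetical local-only description of a PE subset: world PEs
 * start, start + stride, ..., start + (size - 1) * stride. */
typedef struct { int start, stride, size; } demo_set_t;

/* Split a strided subset out of a parent set. The start and stride
 * arguments are expressed in the parent's index space. This is pure
 * local arithmetic; no communication or symmetric memory is involved. */
static demo_set_t demo_set_split_strided(demo_set_t parent,
                                         int start, int stride, int size)
{
    assert(start >= 0 && stride >= 1 && size >= 1);
    assert(start + (size - 1) * stride < parent.size);
    demo_set_t child = {
        parent.start + start * parent.stride,
        parent.stride * stride,
        size
    };
    return child;
}

/* Translate an index within a set to a world PE number. */
static int demo_set_world_pe(demo_set_t s, int idx)
{
    assert(idx >= 0 && idx < s.size);
    return s.start + idx * s.stride;
}
```

Under these assumptions, the earlier example becomes: set A is (0, 1, 7), set B is A split with start 0, stride 2, size 3 (PEs 0, 2, 4), and set C is A split with start 0, stride 3, size 3 (PEs 0, 3, 6).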

Yes, this would be quite a bit of churn in terms of LaTeX changes, but would it solve the issues raised here?

naveen-rn commented 5 years ago

@nspark @gmegan It looks like there are two separate problems being discussed in this issue.

1. Mixing NO_COLLECTIVE and COLLECTIVE teams in the hierarchy of team creation; and
2. Expecting users to order team creation using some application-level synchronization.

Problem(1) can be resolved by restricting the mixing of these two types of teams. To me, using multiple 2D splits to make a 3D split seems to be a one-off use case. If required, we should create a separate new API for 3D split.

Problem(2) is an implementation-based requirement, since we moved away from user-passed pSync arrays and depend entirely on internal pSync arrays. If implementations support 2-sided/active-set based operations internally in their comms layer, there is no need for this synchronization expectation. We might(!) have difficulties when we introduce team-based SHEAP, but still it looks solvable. If implementations use RMA to implement team-creation operations, then it looks like there are no workarounds. I don't see how sets/groups based PE-subset formation can resolve this problem.

gmegan commented 5 years ago

As per ongoing discussion, team creation sync requirements currently look like this:

If all child teams are NO COLLECTIVE, the split operation is local only and no sync is required
If any child team supports collectives, the operation is collective across the child team
If a team is the parent argument to a split operation multiple times and the child teams overlap and support collectives, sync on the parent team between split operations is required
Teams supporting collectives cannot be created from NO COLLECTIVE teams
NO COLLECTIVE teams cannot create contexts (num_contexts = 0 is only valid config)
A parent team can be destroyed before its child team, i.e. the team creation hierarchy does not impact team destruction ordering
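
The rules above can be read as a small decision procedure. The sketch below encodes them for a single child team; it is an illustration under stated assumptions, not specification text. The config struct, enum, and function names are hypothetical, and a `collectives` flag set to false stands in for a NO COLLECTIVE team.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-team config for illustration. */
typedef struct {
    bool collectives;   /* false models a NO COLLECTIVE team */
    int  num_contexts;  /* contexts requested for the team */
} demo_team_config_t;

typedef enum { SPLIT_INVALID, SPLIT_LOCAL, SPLIT_COLLECTIVE } split_kind_t;

/* Classify one split according to the rules listed above. */
static split_kind_t classify_split(bool parent_collectives,
                                   const demo_team_config_t *child)
{
    assert(child != NULL && child->num_contexts >= 0);
    if (!child->collectives) {
        /* NO COLLECTIVE teams cannot create contexts. */
        if (child->num_contexts != 0)
            return SPLIT_INVALID;
        /* A NO COLLECTIVE child makes the split local only. */
        return SPLIT_LOCAL;
    }
    /* Collective teams cannot be created from NO COLLECTIVE parents. */
    if (!parent_collectives)
        return SPLIT_INVALID;
    return SPLIT_COLLECTIVE;
}

/* Sync on the parent is required between two splits from the same
 * parent only if both children support collectives and overlap. */
static bool need_parent_sync_between(split_kind_t first,
                                     split_kind_t second,
                                     bool children_overlap)
{
    return first == SPLIT_COLLECTIVE &&
           second == SPLIT_COLLECTIVE &&
           children_overlap;
}
```

Under these rules, the earlier sequence that splits collective teams B, C, and X from the NO COLLECTIVE teams A and Z would be classified as invalid, which removes the global-resource conflict by construction.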

naveen-rn commented 5 years ago

NO COLLECTIVE teams cannot create contexts (num_contexts = 0 is only valid config)

This statement seems a bit misleading. Based on my understanding from an IB perspective,

QP creation needs a one-to-one mapping.
To create a QP asynchronously without participation from the remote process, we could create such an async QP, but we can't use it until we get the remote QP number from the target process.
In that case, if the remote PE supports some sort of progress thread to manage this, then we could implement it without the team creation operation being collective.

Correct me if my understanding is incorrect.

naveen-rn commented 5 years ago

This statement seems a bit misleading. Based on my understanding from an IB perspective,

Follow-up from the OpenSHMEM WG discussion: though this seems implementable, based on Manju's statement, it looks like there is a performance impact in maintaining the progress thread, either in the implementation or in some way through hardware.

gmegan commented 5 years ago

NO COLLECTIVES teams are removed from the current draft for now. They can be revived later once we can address the issues around how they can be used for p2p under the current teams/contexts model. As the draft stands, it will be straightforward for implementations to add NO COLLECTIVES or other team types as experimental features by expanding the team creation config.

gmegan / specification

Team creation, context creation, and NO COLLECTIVES teams #70