CP013: Differences in Affinity and Context papers

Differences in the intersection of papers Affinity - D0796r2 and Context - P0737r0.

[x] std::thread specific resource: Affinity defines execution_resource which is implied to be an execution resource that executes std::thread. Context defines thread_execution_resource_t which is explicit about executing std::thread. Question: Should we lay the foundation for non-std::thread execution resources (e.g., GPU) by embedding the name thread in the initially proposed execution resource?
[x] Affinity execution_resource is moveable and copyable; Context thread_execution_resource_t is neither moveable nor copyable. Question: Should an execution resource be a PIMPL (pointer to implementation) value type or an immutable reference? Creating and managing vectors of execution resources requires PIMPL.
[x] Affinity has std::vector<resource> execution_resource::resources() const noexcept; and Context has const thread_execution_resource_t & thread_execution_resource_t::partition( size_t i ) const noexcept;. Related to the PIMPL question.
[x] Affinity has std::vector<execution_resource> this_system::resources() noexcept; and Context has thread_execution_resource_t program_thread_execution_resource;. Question: Should there be a root execution resource that is a handle to the union of all individual execution resources? A root execution resource, by definition, has a vector of nested resources which is equivalent to the Affinity proposal's root vector of resources.
[x] Affinity execution_context has name(). Is this meant to give the type of the resource, to denote the topological identity of the resource, or both? Context avoided the design point for the initial proposal.
[x] Affinity can_place_* methods open the design point of placement without addressing how to query if placement has occurred or how placement can be performed. Context avoided the design point in the initial proposal.
[ ] Affinity execution_context is constructed from an execution_resource without additional properties, implying 1-to-1 correspondence. Context execution context (concept) construction is undefined.

The idea in P0796 was to have just a single execution resource type which could be used for std::thread as well as non-std::threads, to keep things simple, with the idea that further resource types may be needed in the future.
I believe we have resolved points 2 & 3 in that we opted for an approach where we make execution resources and memory resources iterable (see https://github.com/codeplaysoftware/standards-proposals/issues/50), and have each execution or memory resource be a PIMPL. We decided to have the resources be copyable and moveable due to a desire to use them in high-level algorithms to sort of search for resources.
Yes, since we opted for the approach of making execution resources and memory resources iterable, I think we do need to have a root resource which represents the entire system, which can be iterated over for the first level of resources.
The intention of name was simply to provide some information about the resource, however, it's not intended to be used for anything other than logging information.
In earlier versions of the paper we just covered the capabilities without diving into the specifics of how to perform execution or memory binding; the current paper now has more details of how this is done. With the move to separate topologies for execution resources and memory resources, though, we will likely drop can_place_agents and can_place_memory, as they will be implied.
The idea in P0796 was to allow users to create an execution context from any point in the topology, to give control over the granularity available to a context.
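The combination described in these replies (a copyable PIMPL handle, iteration over member resources, and a single root for the whole system) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `sketch` namespace, the constructor, `make_example_root`, `count_leaves` and the hard-coded topology are hypothetical and are not the interface proposed in P0796.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the proposed execution_resource: a cheap,
// copyable handle whose state lives behind a shared pointer (PIMPL).
class execution_resource {
public:
    execution_resource(std::string name, std::vector<execution_resource> members);

    const std::string& name() const noexcept;

    // Range interface: iterating a resource visits its member resources.
    const execution_resource* begin() const noexcept;
    const execution_resource* end() const noexcept;

    // Two handles are equal when they refer to the same underlying resource.
    friend bool operator==(const execution_resource& a,
                           const execution_resource& b) noexcept {
        return a.impl_ == b.impl_;
    }

private:
    struct impl;
    std::shared_ptr<const impl> impl_;
};

struct execution_resource::impl {
    std::string name;
    std::vector<execution_resource> members;
};

inline execution_resource::execution_resource(
    std::string name, std::vector<execution_resource> members)
    : impl_(std::make_shared<impl>(impl{std::move(name), std::move(members)})) {}

inline const std::string& execution_resource::name() const noexcept {
    return impl_->name;
}
inline const execution_resource* execution_resource::begin() const noexcept {
    return impl_->members.data();
}
inline const execution_resource* execution_resource::end() const noexcept {
    return impl_->members.data() + impl_->members.size();
}

// Made-up fixed topology standing in for a real system query: one root
// "system" resource holding two NUMA nodes with two cores each.
inline execution_resource make_example_root() {
    auto core = [](std::string n) { return execution_resource(std::move(n), {}); };
    execution_resource numa0("numa0", {core("core0"), core("core1")});
    execution_resource numa1("numa1", {core("core2"), core("core3")});
    return execution_resource("system", {numa0, numa1});
}

// Walks the topology with ordinary range-for loops, starting at any level.
inline std::size_t count_leaves(const execution_resource& r) {
    if (r.begin() == r.end()) return 1;
    std::size_t total = 0;
    for (const execution_resource& member : r) total += count_leaves(member);
    return total;
}

}  // namespace sketch
```

Because copies share the same underlying state, a vector of handles or a recursive walk like `count_leaves` never duplicates the topology itself.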

@hcedwar wrote:

Creating and managing vectors of execution resources requires PIMPL.

Creating and managing simple data structures of execution resources seems like a thing people might like to do. For example, I might like to build a "distributed" data structure, keyed on the NUMA domain equivalent of "MPI process rank." I need some way to know what resources I'm using for different parts of the data structure.

@hcedwar wrote:

Affinity has std::vector<resource> execution_resource::resources() const noexcept; and Context has const thread_execution_resource_t & thread_execution_resource_t::partition( size_t i ) const noexcept;. Related to the PIMPL question.

I would prefer an array / span / range of resources, for the following reasons:

Users can use standard C++ iteration idioms.
If a resource is a lightweight handle ("PIMPL"), then it serves as its own identifier.
Referring to a resource by an index suggests to users that the index won't change. This makes it hard ever to accommodate dynamic hardware resources. (This is a problem with MPI fault-tolerance proposals as well; process ranks are integer indices, and many MPI codes bake in the assumption that those indices never change.)
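A rough sketch of the "handle as its own identifier" idea, applied to the distributed data structure use case: the container is keyed on the resource handle directly, so no integer index has to stay stable. The names `resource_handle`, `partitioned_vector`, `make_partitions` and `partition_for` are hypothetical, and ordering handles by the address of their shared state is just one assumed way to make them usable as map keys.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical lightweight handle: all copies share one identity token.
class resource_handle {
public:
    explicit resource_handle(std::string name)
        : id_(std::make_shared<std::string>(std::move(name))) {}

    const std::string& name() const noexcept { return *id_; }

    // Identity-based ordering, so a handle can be used directly as a map key.
    friend bool operator<(const resource_handle& a,
                          const resource_handle& b) noexcept {
        return std::less<const std::string*>()(a.id_.get(), b.id_.get());
    }

private:
    std::shared_ptr<const std::string> id_;
};

// One partition of a "distributed" container per resource (for example,
// per NUMA domain), keyed on the handle rather than on an integer rank.
using partitioned_vector = std::map<resource_handle, std::vector<double>>;

inline partitioned_vector make_partitions(
    const std::vector<resource_handle>& domains,
    std::size_t elements_per_domain) {
    partitioned_vector parts;
    for (const resource_handle& domain : domains) {
        parts[domain].assign(elements_per_domain, 0.0);
    }
    return parts;
}

// Looks up the partition that belongs to a given resource.
inline std::vector<double>& partition_for(partitioned_vector& parts,
                                          const resource_handle& domain) {
    auto it = parts.find(domain);
    assert(it != parts.end() && "resource has no partition");
    return it->second;
}

}  // namespace sketch
```

Any copy of a handle finds the same partition, which is the property an index-based interface cannot promise once the set of resources is allowed to change.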

Question: Should we lay the foundation for non-std::thread execution resources (e.g., GPU) by embedding the name thread in the initially proposed execution resource?

I'm trying to imagine what it would be like for new C++ programmers in 2020 who start using executors. std::thread itself may confuse them. "Aren't I doing thread parallelism?", they ask. If we start putting the name "thread" everywhere, it may confuse users more.

Could we just add a static resource method to std::thread that returns the corresponding execution resource? It seems like it would help reduce confusion, to tie the resource syntactically to the class.

@hcedwar wrote:

Affinity execution_context has name(). Is this meant to give the type of the resource, to denote the topological identity of the resource, or both? Context avoided the design point for the initial proposal.

To me, the main point of name() is to reproduce HWLOC output. Names shouldn't have to be unique or meaningful.

@hcedwar wrote:

Affinity execution_context is constructed from an execution_resource without additional properties, implying 1-to-1 correspondence.

It seems like execution_context is an example of an Execution Context. Thus, there could be other things that are Execution Contexts, but have a different interface, e.g., take parameters governing thread pool behavior. Is that right?
It's possible to create multiple contexts from the same resource, at least syntactically.

@mhoemmen wrote:

Creating and managing simple data structures of execution resources seems like a thing people might like to do. For example, I might like to build a "distributed" data structure, keyed on the NUMA domain equivalent of "MPI process rank." I need some way to know what resources I'm using for different parts of the data structure.

For this use case, do you think it would be necessary for the execution_resource to expose a unique rank which reflects its position within the parent execution_resource? For example, if you were to have a package execution_resource with 4 NUMA execution_resources within it, each of those resources would have a rank in [0, 4).

@mhoemmen wrote:

I would prefer an array / span / range of resources, for the following reasons:

Users can use standard C++ iteration idioms.

If a resource is a lightweight handle ("PIMPL"), then it serves as its own identifier.

Referring to a resource by an index suggests to users that the index won't change. This makes it hard ever to accommodate dynamic hardware resources. (This is a problem with MPI fault-tolerance proposals as well; process ranks are integer indices, and many MPI codes bake in the assumption that those indices never change.)

I agree. By having the execution_resource follow standard C++ iterator idioms, we have a great deal of flexibility when it comes to how users can use them. In an earlier call we decided that the best approach would be to represent a collection of execution_resources by making the parent execution_resource itself iterable, so an execution_resource is itself a range type over its member execution_resources. You also make a good point about supporting dynamic hardware resources: by making the execution_resource an opaque type, the implementation can handle dynamic changes to the hardware resources internally, without the user having to be directly concerned with them.

@mhoemmen wrote:

I'm trying to imagine what it would be like for new C++ programmers in 2020 who start using executors. std::thread itself may confuse them. "Aren't I doing thread parallelism?", they ask. If we start putting the name "thread" everywhere, it may confuse users more.

The way I see the execution_resource is that it's effectively a polymorphic execution resource type which can represent many different types of execution resources, including threads, cores, sockets, GPU threads, fibres, etc. I feel like there should be a way to identify an execution_resource as a thread resource, as it will be a very common type of execution resource, but we need to make this extensible to other execution resource types as well.

@mhoemmen wrote:

Could we just add a static resource method to std::thread that returns the corresponding execution resource? It seems like it would help reduce confusion, to tie the resource syntactically to the class.

In the current revision we have a this_thread::resource free function which returns the execution_resource of the current thread of execution. Is this what you had in mind?

@mhoemmen wrote:

To me, the main point of name() is to reproduce HWLOC output. Names shouldn't have to be unique or meaningful.

That's right, atm, name is effectively just to represent the HWLOC type, though I expect that in a later design name will simply be for display purposes and we will have a more consistent way of identifying the type of an execution_resource.

@mhoemmen wrote:

It seems like execution_context is an example of an Execution Context. Thus, there could be other things that are Execution Contexts, but have a different interface, e.g., take parameters governing thread pool behavior. Is that right?

We had a discussion in a separate issue about what we want execution_context to be (https://github.com/codeplaysoftware/standards-proposals/issues/56). To summarise, it could be the execution context, serving as a polymorphic type over any kind of execution context, or it could simply be a concept for an execution context type, which gives that type the ability to be constructed from the execution_resources of the system topology. We decided that we should aim for the latter, as the former requires us to define the global requirements for all potential execution context types and also doesn't support interoperating with existing execution context types such as a static_thread_pool. From your comment it sounds like the latter option is also what you had in mind, is that right?

@mhoemmen wrote:

It's possible to create multiple contexts from the same resource, at least syntactically.

Yes, that's right. You could potentially create any number of execution_contexts from the same execution_resources; however, you would have to be aware that those execution_contexts would be contending for the same resources.
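The concept option, and the point about several contexts sharing one resource, can be sketched as follows. This is only an illustration under stated assumptions: the simplified `execution_resource` struct, the `is_resource_constructible` trait, and the `resource_pool` and `counted_pool` classes are all hypothetical, and a C++17 type trait is used here in place of a language-level concept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical, simplified resource: just a label and a concurrency hint.
struct execution_resource {
    std::string name;
    std::size_t concurrency;
};

// Trait standing in for the "concept" option: a type qualifies if it can be
// constructed from an execution_resource. Nothing else about its interface
// is constrained, so different context types remain free to differ.
template <class Context>
struct is_resource_constructible
    : std::is_constructible<Context, const execution_resource&> {};

// A context written with the topology in mind. The extra parameter shows
// that qualifying types may still accept their own tuning options.
class resource_pool {
public:
    explicit resource_pool(const execution_resource& r,
                           std::size_t queue_depth = 64)
        : threads_(r.concurrency), queue_depth_(queue_depth) {}

    std::size_t threads() const noexcept { return threads_; }
    std::size_t queue_depth() const noexcept { return queue_depth_; }

private:
    std::size_t threads_;
    std::size_t queue_depth_;
};

// A pre-existing style of context that only knows about a thread count.
class counted_pool {
public:
    explicit counted_pool(std::size_t threads) : threads_(threads) {}
    std::size_t threads() const noexcept { return threads_; }

private:
    std::size_t threads_;
};

static_assert(is_resource_constructible<resource_pool>::value,
              "resource_pool can be built from a resource");
static_assert(!is_resource_constructible<counted_pool>::value,
              "counted_pool cannot be built from a resource");

}  // namespace sketch
```

Constructing two `resource_pool` objects from the same `execution_resource` compiles without complaint, and together they request twice the resource's concurrency, which is the contention the reply warns about.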

On yesterday's heterogeneous C++ telecon we decided:

Answering the first point, the execution_resource should be a generic execution resource type that isn't associated with any particular type of resource; however, we should introduce some way of identifying what kind of resource a particular execution_resource is. A runtime approach would be favourable over a compile-time approach, firstly because many low-level APIs which provide access to a system's topology, such as Hwloc, HSA and OpenCL, are runtime discoverable, so a compile-time interface would not be suitable for expressing this, and secondly because a compile-time interface would mean introducing a large number of types, which would reduce or complicate the ability to store resources generically. I have created a separate issue for continuing the discussion of this - #66.
Answering the second point, the execution_resource should remain copyable and moveable so that it can be used within std algorithms, but it should be an opaque type with reference counting semantics. I have opened an issue for making this change - #67.
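To make these two decisions concrete, here is a minimal sketch of one generic resource type with a runtime kind query and reference-counted copies. The design in #66 and #67 was still open at this point, so every name below (`resource_kind`, `kind()`, `handle_count()`, `count_of_kind`, `sort_by_name`) is an assumption made for illustration, not the agreed interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical runtime tag; a real list would be driven by whatever the
// underlying topology API reports at run time.
enum class resource_kind { system, numa_node, core, hardware_thread, gpu };

// One generic, opaque resource type: copies are cheap and share state.
class execution_resource {
    struct state {
        resource_kind kind;
        std::string name;
    };
    std::shared_ptr<const state> state_;

public:
    execution_resource(resource_kind kind, std::string name)
        : state_(std::make_shared<state>(state{kind, std::move(name)})) {}

    resource_kind kind() const noexcept { return state_->kind; }
    const std::string& name() const noexcept { return state_->name; }

    // Number of handles currently sharing this resource's state.
    long handle_count() const noexcept { return state_.use_count(); }
};

// Resources of different kinds live in one container and are told apart
// with a runtime query instead of distinct static types.
inline std::size_t count_of_kind(const std::vector<execution_resource>& rs,
                                 resource_kind kind) {
    return static_cast<std::size_t>(std::count_if(
        rs.begin(), rs.end(),
        [kind](const execution_resource& r) { return r.kind() == kind; }));
}

// Copyable and moveable handles work with standard algorithms as-is.
inline void sort_by_name(std::vector<execution_resource>& rs) {
    std::sort(rs.begin(), rs.end(),
              [](const execution_resource& a, const execution_resource& b) {
                  return a.name() < b.name();
              });
}

}  // namespace sketch
```

A compile-time alternative would need a separate type per kind, and a single `std::vector` could then no longer hold a GPU and a core side by side without some form of type erasure.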

In relation to the other points that were not discussed:

Answering the third point, we have previously decided to make execution_resource iterable, we have an issue open for this - #50.
Answering the fourth point, due to the decision to make the execution_resource iterable, it is now necessary to have a single root execution_resource that would be returned from this_system::get_resource. This would be included in the change for #50.
Answering the fifth point, originally name was just introduced to give the ability to print the topology, though I would see this likely being replaced or extended by more concrete queries in the future. I think this will also be affected by whatever we decide in #66.
Answering the sixth point, the issue of whether to have the can_place_* members goes away with the change to separate out the execution_resources and the memory_resources. We have an issue open for this - #41.
Answering the seventh point, I would agree that this interface needs to be extended. At the moment it does not provide much flexibility in the way execution_resources are mapped to an execution_context; introducing properties would perhaps be a good way to do this. We also have an issue open for allowing an execution_context to map to a number of execution_resources - #51.

codeplaysoftware / standards-proposals

CP013: Differences in Affinity and Context papers #40