codeplaysoftware / standards-proposals

Repository for publicly sharing proposals in various standards groups
Apache License 2.0

CP013: Differences in Affinity and Context papers #40

Open hcedwar opened 6 years ago

hcedwar commented 6 years ago

Differences in the intersection of papers Affinity - D0796r2 and Context - P0737r0.

AerialMantis commented 6 years ago
mhoemmen commented 6 years ago

@hcedwar wrote:

Creating and managing vectors of execution resources requires PIMPL.

Creating and managing simple data structures of execution resources seems like a thing people might like to do. For example, I might like to build a "distributed" data structure, keyed on the NUMA domain equivalent of "MPI process rank." I need some way to know what resources I'm using for different parts of the data structure.
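A minimal sketch of that use case, under loud assumptions: `resource_handle` and `partition_by_resource` are hypothetical names invented for illustration, and neither paper defines them. The handle is only a wrapper around an opaque id, and the resources passed in are assumed to be distinct.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical lightweight handle standing in for an execution resource.
// Neither D0796r2 nor P0737r0 defines this type; it exists only for the sketch.
struct resource_handle {
    int id;  // opaque identity, assumed stable for the handle's lifetime
    friend bool operator<(resource_handle a, resource_handle b) { return a.id < b.id; }
    friend bool operator==(resource_handle a, resource_handle b) { return a.id == b.id; }
};

// Split `data` into contiguous blocks, one per resource, and key each block
// on the resource that is meant to own it. Assumes the handles are distinct.
inline std::map<resource_handle, std::vector<double>>
partition_by_resource(const std::vector<double>& data,
                      const std::vector<resource_handle>& resources) {
    std::map<resource_handle, std::vector<double>> parts;
    if (resources.empty()) return parts;
    const std::size_t n = resources.size();
    const std::size_t base = data.size() / n;
    const std::size_t extra = data.size() % n;  // the first `extra` blocks get one more element
    std::size_t begin = 0;
    for (std::size_t r = 0; r < n; ++r) {
        const std::size_t len = base + (r < extra ? 1 : 0);
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
        parts[resources[r]].assign(first, first + static_cast<std::ptrdiff_t>(len));
        begin += len;
    }
    return parts;
}
```

The map answers "which resource holds this part of the data" without any integer rank, which is the property the paragraph above asks for.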

mhoemmen commented 6 years ago

@hcedwar wrote:

Affinity has std::vector<resource> execution_resource::resources() const noexcept; and Context has const thread_execution_resource_t & thread_execution_resource_t::partition( size_t i ) const noexcept;. Related to the PIMPL question.

I would prefer an array / span / range of resources, for the following reasons:

  1. Users can use standard C++ iteration idioms.

  2. If a resource is a lightweight handle ("PIMPL"), then it serves as its own identifier.

  3. Referring to a resource by an index suggests to users that the index won't change. This makes it hard ever to accommodate dynamic hardware resources. (This is a problem with MPI fault-tolerance proposals as well; process ranks are integer indices, and many MPI codes bake in the assumption that those indices never change.)
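To illustrate point 3, here is a small sketch of how an index can silently change meaning when a resource disappears, while a saved handle keeps its identity. `handle_sketch`, `remove_resource`, and `is_available` are hypothetical names, not part of either paper.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical lightweight handle; the id stands in for whatever opaque
// identity an implementation would keep behind the handle.
struct handle_sketch {
    int id;
    friend bool operator==(handle_sketch a, handle_sketch b) { return a.id == b.id; }
};

// Simulate a resource going offline: drop it from the visible list.
// Every resource after it shifts down by one index.
inline void remove_resource(std::vector<handle_sketch>& resources, handle_sketch gone) {
    resources.erase(std::remove(resources.begin(), resources.end(), gone),
                    resources.end());
}

// A saved handle can still be looked up after the list changes.
inline bool is_available(const std::vector<handle_sketch>& resources, handle_sketch h) {
    return std::find(resources.begin(), resources.end(), h) != resources.end();
}
```

After removing the middle element of a three-element list, index 1 refers to what used to be index 2, whereas a handle saved beforehand still names the same resource.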

mhoemmen commented 6 years ago

Question: Should we lay the foundation for non-std::thread execution resources (e.g., GPU) by embedding the name thread in the initially proposed execution resource?

I'm trying to imagine what it would be like for new C++ programmers in 2020 who start using executors. std::thread itself may confuse them. "Aren't I doing thread parallelism?", they ask. If we start putting the name "thread" everywhere, it may confuse users more.

Could we just add a static resource method to std::thread that returns the corresponding execution resource? It seems like it would help reduce confusion, to tie the resource syntactically to the class.

mhoemmen commented 6 years ago

@hcedwar wrote:

Affinity execution_context has name(). Is this for the type of resource, to denote the topological identity of the resource, or both? Context avoided this design point for the initial proposal.

To me, the main point of name() is to reproduce HWLOC output. Names shouldn't have to be unique or meaningful.

mhoemmen commented 6 years ago

@hcedwar wrote:

Affinity execution_context is constructed from an execution_resource without additional properties, implying 1-to-1 correspondence.

  1. It seems like execution_context is an example of an Execution Context. Thus, there could be other things that are Execution Contexts, but have a different interface, e.g., take parameters governing thread pool behavior. Is that right?

  2. It's possible to create multiple contexts from the same resource, at least syntactically.

AerialMantis commented 6 years ago

@mhoemmen wrote:

Creating and managing simple data structures of execution resources seems like a thing people might like to do. For example, I might like to build a "distributed" data structure, keyed on the NUMA domain equivalent of "MPI process rank." I need some way to know what resources I'm using for different parts of the data structure.

For this use case, do you think it would be necessary for the execution_resource to expose a unique rank that reflects its position within the parent execution_resource? For example, if you had a package execution_resource with 4 NUMA execution_resources within it, each of those resources would have a rank in [0, 4).

@mhoemmen wrote:

I would prefer an array / span / range of resources, for the following reasons:

  1. Users can use standard C++ iteration idioms.
  2. If a resource is a lightweight handle ("PIMPL"), then it serves as its own identifier.
  3. Referring to a resource by an index suggests to users that the index won't change. This makes it hard ever to accommodate dynamic hardware resources. (This is a problem with MPI fault-tolerance proposals as well; process ranks are integer indices, and many MPI codes bake in the assumption that those indices never change.)

I agree. By having the execution_resource follow standard C++ iterator idioms, we gain a great deal of flexibility in how users can use them. In an earlier call we decided that the best approach would be to represent a collection of execution_resources by making the parent execution_resource itself iterable, so an execution_resource is itself a range type over its member execution_resources. You also make a good point about supporting dynamic hardware resources: by making the execution_resource an opaque type, the implementation can handle dynamic changes to the hardware resources internally, without the user having to be directly concerned with them.
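A rough sketch of what "the resource is a range over its members" could look like. `resource_sketch` and `count_leaves` are hypothetical; the proposal keeps the real type opaque, whereas this stand-in simply stores its children in a vector (which relies on C++17 allowing `std::vector` of an incomplete type as a member).

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an iterable execution resource. Not the
// proposal's execution_resource; children are stored directly for brevity.
class resource_sketch {
public:
    explicit resource_sketch(std::string name, std::vector<resource_sketch> members = {})
        : name_(std::move(name)), members_(std::move(members)) {}

    const std::string& name() const noexcept { return name_; }

    // Range interface: iterating a resource visits its member resources.
    auto begin() const noexcept { return members_.begin(); }
    auto end() const noexcept { return members_.end(); }

private:
    std::string name_;
    std::vector<resource_sketch> members_;
};

// Count leaf resources (those with no members) using only the range interface.
inline std::size_t count_leaves(const resource_sketch& r) {
    if (r.begin() == r.end()) return 1;
    std::size_t total = 0;
    for (const resource_sketch& member : r) total += count_leaves(member);
    return total;
}
```

Because the traversal uses only `begin()` and `end()`, user code of this shape would not depend on how members are indexed or stored.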

@mhoemmen wrote:

I'm trying to imagine what it would be like for new C++ programmers in 2020 who start using executors. std::thread itself may confuse them. "Aren't I doing thread parallelism?", they ask. If we start putting the name "thread" everywhere, it may confuse users more.

The way I see the execution_resource is that it's effectively a polymorphic execution resource type which can represent many different types of execution resources, including threads, cores, sockets, GPU threads, fibres, etc. I feel there should be a way to identify an execution_resource as a thread resource, as it will be a very common type of execution resource, but we need to make this extensible to other execution resource types as well.

@mhoemmen wrote:

Could we just add a static resource method to std::thread that returns the corresponding execution resource? It seems like it would help reduce confusion, to tie the resource syntactically to the class.

In the current revision we have a this_thread::resource free function which returns the execution_resource of the current thread of execution. Is this what you had in mind?

@mhoemmen wrote:

To me, the main point of name() is to reproduce HWLOC output. Names shouldn't have to be unique or meaningful.

That's right. At the moment, name is effectively just there to represent the HWLOC type, though I expect that in a later design name will simply be for display purposes and we will have a more consistent way of identifying the type of an execution_resource.

@mhoemmen wrote:

It seems like execution_context is an example of an Execution Context. Thus, there could be other things that are Execution Contexts, but have a different interface, e.g., take parameters governing thread pool behavior. Is that right?

We had a discussion in a separate issue about what we want execution_context to be (https://github.com/codeplaysoftware/standards-proposals/issues/56). To summarise, it could be the execution context type, serving as a polymorphic wrapper over any kind of execution context, or it could simply be a concept for execution context types, one that gives a conforming type the ability to be constructed from the execution_resources of the system topology. We decided that we should aim for the latter, as the former requires us to define the global requirements for all potential execution context types and also doesn't support interoperating with existing execution context types such as a static_thread_pool. From your comment it sounds like the latter option is also what you had in mind, is that right?
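As a hedged sketch of the "concept" reading, the trait below approximates the single requirement named above: a context type qualifies if it can be constructed from an execution resource. All names here (`execution_resource_sketch`, `is_resource_constructible_context_v`, `pool_context_sketch`, `plain_context_sketch`) are hypothetical, and a real concept would presumably carry further requirements that are not modelled.

```cpp
#include <type_traits>

// Hypothetical placeholder for the proposal's execution_resource.
struct execution_resource_sketch {};

// Approximates the "concept" reading with a plain type trait: a type
// qualifies if it can be constructed from an execution resource.
template <class Context>
constexpr bool is_resource_constructible_context_v =
    std::is_constructible<Context, const execution_resource_sketch&>::value;

// A context with an extra tuning parameter, defaulted so that
// single-argument construction from a resource still works.
class pool_context_sketch {
public:
    explicit pool_context_sketch(const execution_resource_sketch&, int threads = 1)
        : threads_(threads) {}
    int threads() const noexcept { return threads_; }

private:
    int threads_;
};

// A context type that knows nothing about resources.
class plain_context_sketch {};

static_assert(is_resource_constructible_context_v<pool_context_sketch>,
              "pool_context_sketch can be built from a resource");
static_assert(!is_resource_constructible_context_v<plain_context_sketch>,
              "plain_context_sketch cannot be built from a resource");
```

Under this reading, different context types can satisfy the same requirement while exposing different interfaces, such as the thread-count parameter here.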

@mhoemmen wrote:

It's possible to create multiple contexts from the same resource, at least syntactically.

Yes, that's right. You could potentially create any number of execution_contexts from the same execution_resources; however, you would have to be aware that those execution_contexts would be contending for the same resources.
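One way a user or implementation might make that contention visible is sketched below. Nothing in either paper proposes this: `shared_resource_sketch`, `counting_context_sketch`, and `attached_contexts()` are hypothetical names, and the counter is purely illustrative.

```cpp
#include <atomic>
#include <memory>

// Hypothetical resource handle: copies share one underlying state.
class shared_resource_sketch {
public:
    shared_resource_sketch() : state_(std::make_shared<state>()) {}

    // Number of live contexts currently built on this resource.
    int attached_contexts() const noexcept { return state_->contexts.load(); }

private:
    struct state {
        std::atomic<int> contexts{0};
    };
    std::shared_ptr<state> state_;
    friend class counting_context_sketch;
};

// Hypothetical context that registers itself with its resource on
// construction and unregisters on destruction.
class counting_context_sketch {
public:
    explicit counting_context_sketch(const shared_resource_sketch& r)
        : state_(r.state_) {
        ++state_->contexts;
    }
    ~counting_context_sketch() { --state_->contexts; }

    // Non-copyable so each live object counts exactly once.
    counting_context_sketch(const counting_context_sketch&) = delete;
    counting_context_sketch& operator=(const counting_context_sketch&) = delete;

private:
    std::shared_ptr<shared_resource_sketch::state> state_;
};
```

A count greater than one signals that several contexts are sharing, and therefore contending for, the same underlying resource.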

AerialMantis commented 6 years ago

On yesterday's heterogeneous C++ telecon we decided:

In relation to the other points that were not discussed: