codeplaysoftware / standards-proposals

CP013: New motivation front matter for P1437 #93

Closed AerialMantis closed 4 years ago

AerialMantis commented 5 years ago

In the last call we discussed the direction to take for P1437: System topology discovery for heterogeneous & distributed computing, now that it has been split off from P0796. We looked at some of the use cases for a low-level affinity interface in C++ and at what we would like such an interface to look like. Based on the feedback from Kona, we decided that the first revision of P1437 should refocus the motivation and goals of the proposal on a low-level affinity interface.

The main points we discussed about a low-level affinity interface in C++ were:

We discussed that a standardized C++ interface for querying a system's topology, meaning its execution resources and the affinity relationships between them, would be highly beneficial for writing generic code that can target heterogeneous platforms. However, we also recognised that it is unrealistic to expect C++ to keep up with the rapidly changing architectures in heterogeneous computing and to support all of their unique features and capabilities. Instead, we would like C++ to provide a unified layer between future hardware standardization efforts like HMM and executor-based programming models such as thread pools, SYCL or Kokkos. This would give users and library implementors a middle layer to target in order to write more generic and potentially "performance portable" applications and programming models, whilst also giving hardware vendors a way to extend the interface to support the more unique features and capabilities of their architectures.

We discussed concerns that the current C++ abstract machine and the language around it are just not sufficient for describing heterogeneous systems. So while expecting the C++ abstract machine to be completely revamped to cover a range of different hardware features and capabilities is unrealistic, there will have to be some new language introduced to allow C++ to describe the system topologies that are being queried. We noted that this is something that is even becoming evident in P0443, the unified executors proposal, where it's proving difficult to express certain properties in the language the C++ abstract machine currently provides.

Closely related to this, we discussed the move towards a unified address space in heterogeneous systems via SVM and HMM. We made the point that this move actually makes the case for affinity in C++ stronger: while you have different address spaces, the distinction between different hardware memory regions and their capabilities is clear, but once you have a single unified address space, potentially with cache coherency, distinguishing different memory regions becomes much more subtle. It therefore becomes much more important to understand the various memory regions and their affinity relationships in order to achieve good performance on different hardware.

We also discussed one of the more controversial aspects of P0796: the current representation of the system topology, which is still largely hierarchical because it is closely based on Hwloc. While Hwloc is widely used in many domains, it no longer always represents existing machines accurately, because its structure is strictly hierarchical while many machines no longer have a simple hierarchical topology. To solve this we discussed a potential graph representation of the system topology, in which some node relationships represent the containment relationships of machines, sockets, CPUs, etc., and others represent network and memory region connections. The graph then becomes more of an opaque system representation that can be viewed from a number of different perspectives, depending on which relationships you are interested in.
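To make the graph idea a bit more concrete, here is a minimal sketch of what I have in mind. It is only an illustration: every name in it (`relation`, `edge`, `topology_graph`, `make_example`) is hypothetical and is not part of P1437, P0796 or Hwloc, and the example machine is made up. It shows a single set of nodes with typed edges, where picking an edge type gives you one "perspective" on the topology.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is NOT the P1437 interface.
// The kinds of relationship an edge in the topology graph can represent.
enum class relation { containment, memory, network };

struct edge {
    std::size_t from;
    std::size_t to;
    relation kind;
};

class topology_graph {
public:
    // Adds a resource node and returns its index.
    std::size_t add_node(std::string name) {
        names_.push_back(std::move(name));
        return names_.size() - 1;
    }

    // Adds a directed, typed relationship between two existing nodes.
    void add_edge(std::size_t from, std::size_t to, relation kind) {
        assert(from < names_.size() && to < names_.size());
        edges_.push_back({from, to, kind});
    }

    // Returns the names of the nodes reachable from `node` through one edge
    // of the given kind, i.e. one "perspective" on the graph.
    std::vector<std::string> neighbours(std::size_t node, relation kind) const {
        std::vector<std::string> result;
        for (const edge& e : edges_) {
            if (e.from == node && e.kind == kind) {
                result.push_back(names_[e.to]);
            }
        }
        return result;
    }

private:
    std::vector<std::string> names_;
    std::vector<edge> edges_;
};

// Builds a made-up two-socket machine. socket0 can reach both memory
// regions, which a strict containment tree cannot express.
inline topology_graph make_example() {
    topology_graph g;
    const std::size_t machine = g.add_node("machine");  // index 0
    const std::size_t socket0 = g.add_node("socket0");  // index 1
    const std::size_t socket1 = g.add_node("socket1");  // index 2
    const std::size_t numa0 = g.add_node("numa0");      // index 3
    const std::size_t numa1 = g.add_node("numa1");      // index 4
    g.add_edge(machine, socket0, relation::containment);
    g.add_edge(machine, socket1, relation::containment);
    g.add_edge(socket0, numa0, relation::memory);
    g.add_edge(socket0, numa1, relation::memory);
    g.add_edge(socket1, numa1, relation::memory);
    return g;
}
```

Asking for the `containment` neighbours of the machine node gives the familiar hierarchical view, while asking for the `memory` neighbours of a socket gives the memory view of the same graph. A real interface would presumably keep the representation opaque and expose richer queries; the flat edge list here is just the simplest thing that demonstrates the idea.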

Going forward, I think we should have some further discussion of the motivation and goals, and perhaps decide on some clear use cases. At some point I would then like to put together a merge request that updates the front matter of P1437 and perhaps takes out the proposed interface for now.

AerialMantis commented 5 years ago

I have written up a pull request to apply the changes discussed here and in the recent heterogeneous C++ telecon: https://github.com/codeplaysoftware/standards-proposals/pull/94

AerialMantis commented 4 years ago

These changes were merged into P1437r2 for the Belfast meeting, so this can be closed now.