Currently, when running DAPHNE with the --vec flag, the call to hwloc_topology_load() inside MTWrapperBase adds significant overhead to execution time because it is repeated for every single call to a _vectorizedPipeline__* kernel. As a result, we repeatedly read sysfs files, each of which needs multiple syscalls.
Possible solutions could be:
filter out unnecessary information with hwloc_topology_set_all_types_filter
generate the topology once with lstopo and load it via HWLOC_XMLFILE=/path/to/topology.xml and HWLOC_THISSYSTEM=1
ideally, move all hwloc queries out of the _vectorizedPipeline__* kernels so that this work is not repeated
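To illustrate the last option, here is a minimal sketch of a load-once cache. It deliberately does not link against hwloc: TopologyInfo, discoverTopology(), cachedTopology() and vectorizedPipelineCall() are hypothetical names invented for this example, and the hard-coded IDs are placeholder data. In DAPHNE, the body of discoverTopology() would presumably be where hwloc_topology_init(), hwloc_topology_load() and the follow-up queries go. How this fits into MTWrapperBase is an assumption, not the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for whatever MTWrapperBase extracts from the topology.
struct TopologyInfo {
    std::vector<int> physicalIds;
};

// Counts how often the expensive discovery actually runs (for illustration only).
int g_discoveryRuns = 0;

// Hypothetical placeholder for the expensive part. With hwloc this is where
// hwloc_topology_init() / hwloc_topology_load() and the sysfs reads would occur.
TopologyInfo discoverTopology() {
    ++g_discoveryRuns;
    return TopologyInfo{{0, 1, 2, 3}};  // placeholder data, not a real topology
}

// Returns the cached result. Initialization of a function-local static is
// thread-safe since C++11, so discovery runs at most once per process.
const TopologyInfo &cachedTopology() {
    static const TopologyInfo info = discoverTopology();
    return info;
}

// Hypothetical stand-in for one _vectorizedPipeline__* kernel invocation:
// it only reads the cached data instead of reloading the topology.
std::size_t vectorizedPipelineCall() {
    const TopologyInfo &topo = cachedTopology();
    assert(!topo.physicalIds.empty());
    return topo.physicalIds.size();
}
```

With this shape, the cost of the topology load is paid on the first kernel call only. The same idea would also work with an explicit initialization step at startup, which avoids the lazy first-call latency.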