Closed GoogleCodeExporter closed 9 years ago
I checked this issue for the current development version of LIKWID and found
multiple problems:
The hwloc backend only supplies data such as the CPU model number when CPU 0 is
contained in the cgroup's cpuset.
The cpuid backend works but reads the wrong number of HW threads. As a result,
the N affinity group in likwid-pin prints the active CPUs followed by unusable
IDs for the remaining CPUs.
My suggestion is to get the number of CPUs from /proc/self/status and to fall
back to cpuid if hwloc fails to retrieve data such as the CPU model number.
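As a rough illustration of the /proc/self/status idea, the sketch below reads the Cpus_allowed_list field and expands it into CPU IDs. It is written in Python for brevity and is not LIKWID code; the function names parse_cpu_list and allowed_cpus_from_status are made up for this example.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8,10-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range like "0-3" is inclusive on both ends.
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus_from_status(status_text):
    """Return the CPUs named in the Cpus_allowed_list line, or [] if it is absent."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    return []


if __name__ == "__main__":
    # Hypothetical usage: print the CPUs this process is allowed to run on.
    with open("/proc/self/status") as f:
        print(allowed_cpus_from_status(f.read()))
```

The length of the returned list would be the CPU count to use in place of the value the cpuid backend reports.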
I did not check version 3.1.3, but since it uses only the cpuid backend, which
is comparable to the one in the current development version, similar errors
are likely.
I attached the patch from the HPC UGent github repo.
Original comment by Thomas.R...@googlemail.com
on 9 Feb 2015 at 3:08
/proc/self/status would probably work, but can you not simply read the thread
affinity mask? That would work in all POSIX cases?
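For reference, reading the affinity mask could look like the sketch below. It is in Python for brevity and relies on os.sched_getaffinity, which wraps the Linux sched_getaffinity call and is not available on every platform, so the helper (a made-up name, current_affinity) returns None where the call is missing.

```python
import os


def current_affinity():
    """Return the set of CPU IDs the calling process may run on, or None if unsupported."""
    if not hasattr(os, "sched_getaffinity"):
        # os.sched_getaffinity is only provided on some platforms (e.g. Linux).
        return None
    # A pid of 0 refers to the calling process.
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(current_affinity())
```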
Original comment by wpoel...@gmail.com
on 9 Feb 2015 at 3:17
There are multiple places where we can get the affinity mask; that is not the
problem. But bigger changes are needed to support them. Currently LIKWID makes
some assumptions that are not met when using cpusets. A small example is the
topology code where we want to collect the topology info of the whole machine,
not only of the parts that are controlled by the CPUs in the cpuset. In other
cases we want the actual CPUs of the execution environment.
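To make the two views concrete, here is a small sketch (Python, not LIKWID code) that lists every CPU the kernel exposes under sysfs and splits that list by an allowed set. It assumes the cpuN directories under /sys/devices/system/cpu describe the whole machine, which may not hold if sysfs is virtualized; machine_cpus and split_by_cpuset are hypothetical names.

```python
import os
import re


def machine_cpus(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """List every CPU ID that has a cpuN directory, regardless of the caller's cpuset."""
    ids = []
    for name in os.listdir(sysfs_cpu_dir):
        # Match "cpu0", "cpu12", ... but not "cpufreq" or "cpuidle".
        m = re.fullmatch(r"cpu(\d+)", name)
        if m:
            ids.append(int(m.group(1)))
    return sorted(ids)


def split_by_cpuset(all_cpus, allowed):
    """Partition CPU IDs into those inside and those outside the allowed set."""
    inside = [c for c in all_cpus if c in allowed]
    outside = [c for c in all_cpus if c not in allowed]
    return inside, outside
```

The topology code would work on the full list, while pinning would only use the first half of the split.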
Original comment by Thomas.R...@googlemail.com
on 10 Feb 2015 at 12:07
This issue was closed by revision r482.
Original comment by Thomas.R...@googlemail.com
on 13 Feb 2015 at 3:26
I implemented better cgroup handling for LIKWID. The biggest problem was that
neither hwloc nor cpuid can read the system topology of the whole machine when
running inside a cpuset. Therefore I wrote a new interface that gets all
information from procfs/sysfs. For the affinity system, only the CPUs that are
part of the current cpuset are added to the domains, so some affinity domains
may now contain no CPUs. Since LIKWID now only knows the CPUs in the cpuset, no
changes to the pinning library are needed. The topology output code does not
mark the CPUs that are present in the cgroup, but this can easily be done by
appending a '*' or printing them in color.
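A minimal sketch of the two ideas above, restricting domains to the cpuset and marking cpuset members with a '*', might look like this. It is Python rather than LIKWID's actual code, and the function names and the example domain layout are invented for illustration.

```python
def restrict_domains(domains, allowed):
    """Keep only CPUs in `allowed` within each domain; a domain may end up empty."""
    return {name: [c for c in cpus if c in allowed] for name, cpus in domains.items()}


def format_cpu_row(all_cpus, allowed):
    """Render CPU IDs on one line, appending '*' to those that are part of the cpuset."""
    return " ".join(f"{c}*" if c in allowed else str(c) for c in all_cpus)


if __name__ == "__main__":
    # Illustrative layout: one node-wide domain and two socket domains.
    domains = {"N": [0, 1, 2, 3], "S0": [0, 1], "S1": [2, 3]}
    allowed = {2, 3}
    print(restrict_domains(domains, allowed))
    print(format_cpu_row([0, 1, 2, 3], allowed))
```

With a cpuset of CPUs 2 and 3, the S0 domain in this example becomes empty, which is the case any caller iterating over domains has to tolerate.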
Original comment by Thomas.R...@googlemail.com
on 13 Feb 2015 at 3:33
Original issue reported on code.google.com by
wpoel...@gmail.com
on 3 Feb 2015 at 10:00