chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Chapel hwloc finds "No useable cores" for restricted cpuset #24371

Closed: milthorpe closed this issue 7 months ago

milthorpe commented 9 months ago

When running Chapel with CHPL_TASKS=qthreads and CHPL_HWLOC=bundled (or any value other than none), hwloc discovers cores according to whether they are accessible, the kinds of processing units (PUs) available, and perhaps other criteria. However, on a system where PUs are restricted (e.g., by a cpuset) so that some are inaccessible to Chapel, any core that contains one of the restricted PUs becomes entirely inaccessible to Chapel. This has the unfortunate effect that on an HPC system that disables SMT through a default cpuset (e.g., NCI Gadi), Chapel fails to discover any cores and prints "No useable cores."

The reason is that the loop in hwloc.c that adds cores to the physically accessible set calls hwloc_get_next_obj_inside_cpuset_by_type to find the next object of type HWLOC_OBJ_CORE in logAccSet (which includes only the accessible PUs). However, this function only finds cores that are completely included in the provided cpuset, i.e., cores for which every PU is in the cpuset. In the Gadi example, logAccSet includes only one of the two hyperthreaded PUs for each core, so no core is completely included in logAccSet, hence the result of "No useable cores."

Steps to Reproduce

To reproduce this problem on a system with hyperthreading, run a Chapel program using a cpuset that includes only the first PU of each physical core. For example, on my 16-core Skylake system (where core i comprises PUs i and i+16):

BabelStream/src/chapel$ sudo cset set --cpu=0-15 --set=my_cpuset1
BabelStream/src/chapel$ sudo cset proc --set=my_cpuset1 --exec ./chapel-stream
cset: --> last message, executed args into cpuset "/my_cpuset1", new pid is: 6963
error: No useable cores.

Possible Fix

One possible fix (PR to come) is to iterate over the cores in the full logical set logAllSet and check, for each core, whether it intersects logAccSet, i.e., whether some PU of the core is accessible to Chapel, before adding it to physAccSet.

Configuration Information

bradcray commented 9 months ago

@milthorpe : Thanks for both reporting this and investigating enough to suggest a possible path forward.

@jhh67 : Would you be able to pick this up as the resident hwloc expert?

[edit: oops, I see you've already said you would on email... hadn't gotten there yet]

lydia-duncan commented 9 months ago

It looks like Josh opened a PR, too

bradcray commented 9 months ago

Thanks for pointing that out, Lydia! It's here: https://github.com/chapel-lang/chapel/pull/24372

jhh67 commented 9 months ago

At the moment I'm unable to reproduce this bug, as I can't successfully run cset on a machine. At best, the way cset manages cpusets differs from how slurm/srun binds CPUs to processes: according to this bug report, with cset the topology contains cores whose cpusets are supersets of the union of their PUs' cpusets, because some of the PUs are inaccessible. So, for example, the cpuset for a core has two bits set, but the core has only one PU child in the topology, with a single bit set; hence the call to hwloc_get_next_obj_inside_cpuset_by_type doesn't find the core. However, when I've used slurm/srun to test this functionality, the core cpuset is the union of its PUs' cpusets, so this bug does not occur. At worst, this difference indicates a fundamental misunderstanding on my part of how hwloc represents topologies, and if that's true I worry that there are other latent bugs.

I will continue to try to find a platform on which to reproduce this bug, and once I do I'll add a test that triggers it.

bradcray commented 8 months ago

@jhh67 @milthorpe : Was this resolved by https://github.com/chapel-lang/chapel/pull/24372 or is more required here?

jhh67 commented 7 months ago

It should be resolved by https://github.com/chapel-lang/chapel/pull/24372, although I was unable to replicate the error that @milthorpe reported.

milthorpe commented 7 months ago

This issue was resolved by #24372. I have confirmed that all accessible cores are usable by Chapel 2.0 on NCI Gadi even when a restricted cpuset is used.