Closed MrTomRod closed 9 months ago
Hi, Thomas!
Thank you for reporting this problem.
I think pgap.py should have a flag --no-assign-cpu-affinities to disable this.
The introduction of a call to taskset
was a result of trying to implement limits on the number of cores used by our docker image and the binaries inside. Removal of this call will result in the absence of this restriction and the application will use all available cores. If this is what you want, would you be open to considering the following workaround?
Currently the call to taskset
happens if this function returns non-zero value of cores requested (in command line, and recently, from the environment as well (PGAPX-1095)):
def get_cpus(self):
if 'SLURM_CPUS_PER_TASK' in os.environ:
return int(os.environ['SLURM_CPUS_PER_TASK'])
elif 'NSLOTS' in os.environ:
return int(os.environ['NSLOTS'])
elif self.params.args.cpus:
return self.params.args.cpus
else:
return 0
As you can see it is called specifically for the case of SLURM environment.
Here is a workaround: unsetting SLURM_CPUS_PER_TASK
for the sake of PGAP run will lead to the same result as the proposed flag I believe:
env -u SLURM_CPUS_PER_TASK ./pgap.py ...
What do you think?
Ok, that's fine. Thanks!
Describe the bug I have the same issue as on https://github.com/ncbi/pgap/issues/202 on our Rocky 8 SLURM cgroup v2 HPC. Some experimentation suggests that there is no need to do run
taskset
in pgap.py, at least under certain circumstances, see below.Suggestion I think
pgap.py
should have a flag--no-assign-cpu-affinities
to disable this. Otherwise, I have to check for updates manually and edit thepgap.py
file every time I forget to runpgap.py
without--no-self-update
.To Reproduce I can check the current CPU affinities like this:
These affinities are correctly propagated to apptainer containers:
Software versions (please complete the following information):
Additional info In case it's somehow useful, to get the assigned CPUs in Python, do this: