tlemane / kmdiff

Differential k-mer analysis
GNU Affero General Public License v3.0

hardware_concurrency returns number of all CPUs instead of available CPUs on Linux #9

Open EricDeveaud opened 6 months ago

EricDeveaud commented 6 months ago

By default, kmdiff uses std::thread::hardware_concurrency() (in cli.cpp) to get the number of available threads.

But, because there's a but ;-)

hardware_concurrency returns, when possible, the number of threads the underlying hardware can run concurrently, which might not correspond to the number of cores actually available to the process (because of taskset, a batch system like Slurm, etc.). As a consequence, kmdiff may start far more threads than it has cores to run them on.

For example, one of my users submitted a kmc job on a 96-core HPC node with a single-core Slurm allocation: more than 100 threads are now fighting over that one core.
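To make the mismatch concrete, here is a minimal Linux-only sketch that imitates a single-core allocation from inside a process. The helper names allowed_cpus and pin_to_one_cpu are hypothetical and are not part of kmdiff; the sketch assumes glibc, where g++ defines _GNU_SOURCE by default so the CPU_* macros are visible.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

// Hypothetical helper: number of CPUs the calling thread may run on,
// or 0 if the affinity mask could not be read.
unsigned allowed_cpus()
{
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0)
    return 0;
  return static_cast<unsigned>(CPU_COUNT(&set));
}

// Hypothetical helper: restrict the calling thread to the first CPU it is
// currently allowed to use, imitating a single-core batch allocation.
bool pin_to_one_cpu()
{
  cpu_set_t current;
  CPU_ZERO(&current);
  if (sched_getaffinity(0, sizeof(current), &current) != 0)
    return false;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &current)) {
      cpu_set_t one;
      CPU_ZERO(&one);
      CPU_SET(cpu, &one);
      return sched_setaffinity(0, sizeof(one), &one) == 0;
    }
  }
  return false;
}
```

After a successful pin_to_one_cpu(), allowed_cpus() reports 1. What std::thread::hardware_concurrency() reports at that point depends on the C and C++ library versions, so it is worth printing both values on the target system rather than assuming either.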

I would suggest switching to sched_getaffinity to compute the default thread count.

Something like this:

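#include <sched.h>  // sched_getaffinity, cpu_set_t, CPU_COUNT (Linux/glibc)

// Number of CPUs the calling process is allowed to run on.
int getCPUs()
{
  cpu_set_t cpu_set;
  // pid 0 means the caller; a non-zero return value signals failure.
  if (sched_getaffinity(0, sizeof(cpu_set), &cpu_set) != 0)
    return 1;  // conservative fallback instead of reading an unset mask
  return CPU_COUNT(&cpu_set);
}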

Regards, Eric

tlemane commented 6 months ago

Hello, thank you for the report. Indeed, sched_getaffinity seems better. I will change this in the next release.
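For illustration only, a sketch of how such a default could be computed while staying portable. The function name default_thread_count is hypothetical and this is not kmdiff's actual code. It assumes the affinity mask should be preferred on Linux, with std::thread::hardware_concurrency() kept as a fallback elsewhere or when the mask cannot be read.

```cpp
#include <thread>  // std::thread::hardware_concurrency
#if defined(__linux__)
#include <sched.h>  // sched_getaffinity, CPU_ZERO, CPU_COUNT
#endif

// Hypothetical helper, not taken from kmdiff: pick a default thread count.
unsigned default_thread_count()
{
#if defined(__linux__)
  cpu_set_t set;
  CPU_ZERO(&set);
  // Prefer the CPUs this process is actually allowed to use.
  if (sched_getaffinity(0, sizeof(set), &set) == 0) {
    const int n = CPU_COUNT(&set);
    if (n > 0)
      return static_cast<unsigned>(n);
  }
#endif
  // hardware_concurrency() may return 0 when the value is not computable.
  const unsigned hw = std::thread::hardware_concurrency();
  return hw > 0 ? hw : 1u;
}
```

The result would only serve as the default; a thread count passed explicitly on the command line should still take precedence.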