mhx / dwarfs

A fast high compression read-only file system for Linux, Windows and macOS
GNU General Public License v3.0
2.16k stars 58 forks source link

std thread hardware_concurrency() should not be used #130

Closed jbd closed 1 year ago

jbd commented 1 year ago

Hello,

std::thread::hardware_concurrency() returns, when possible, the underlying hardware capability to run threads, which might not corresponds to the actual number of cores available to the process (through the use of taskset, batch system like slurm, etc...). The consequence is that mkdwarfs might run in a non optimal way. For example, if I run taskset -c 1 mkdwarfs on my 20 cores machines, it will run 20 workers on only one core.

The immediate workaround is to use the -N option to set the number of workers, but I think a more sane behavior would be to use sched_getaffinity as in https://github.com/opencv/opencv/issues/16268. Gromacs did something similar (https://github.com/gromacs/gromacs/blob/1e6873fadf16d5f5be861e6f9ef5f9923a12e540/src/gromacs/hardware/hardwaretopology.cpp#L1221).

What do you think ?

Thank you.

jbd commented 1 year ago

In fact, the solution from folly/folly/system/HardwareConcurrency.cpp is great !

#include <folly/system/HardwareConcurrency.h>

#include <thread>

#include <folly/portability/Sched.h>

namespace folly {

unsigned int hardware_concurrency() noexcept {
#if defined(__linux__) && !defined(__ANDROID__)
  cpu_set_t cpuset;
  if (!sched_getaffinity(0, sizeof(cpuset), &cpuset)) {
    auto count = CPU_COUNT(&cpuset);
    if (count != 0) {
      return count;
    }
  }
#endif

  return std::thread::hardware_concurrency();
}

} // namespace folly
mhx commented 1 year ago

Hey, thanks for the report! Switching to the folly implementation shouldn't be too much trouble. I'll consider that for the next release.

mhx commented 1 year ago

Fixed in e4971b4.