PyWiFeS / pipeline

The Python data reduction pipeline for WiFeS

Improve estimate of available cores for multithreading. #36

Open timothyfrankdavies opened 7 months ago

timothyfrankdavies commented 7 months ago

Currently we use os.cpu_count() in wifes_wsol.py to decide how many batches of jobs to launch. It's also used in some debug output.

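That returns the number of logical CPUs on the machine, but job schedulers like Slurm can restrict each process to a subset of them, so the count can be higher than what we're actually allowed to use.

I tried using len(os.sched_getaffinity(0)), which worked on OzStar, but crashes on macOS, where os.sched_getaffinity isn't available (it raises AttributeError).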

There are a few changes that may be worth making:

  1. Use a try/except, falling back from len(os.sched_getaffinity(0)) to os.cpu_count()
  2. Check sys.platform, and only use os.sched_getaffinity on Linux (it isn't available on every Unix platform; macOS lacks it).
  3. Add a command line argument for the maximum number of cores to use (including hyperthreads).
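As a rough sketch of how options 1 and 3 could fit together (the helper name available_cpu_count and its max_cores parameter are made up for illustration, not existing pipeline code):

```python
import os


def available_cpu_count(max_cores=None):
    """Hypothetical helper: estimate how many CPUs this process may use.

    max_cores is an assumed, optional user-supplied cap (e.g. from a
    command line argument); None means "no cap".
    """
    try:
        # Respects CPU affinity masks, such as those set by Slurm on Linux.
        count = len(os.sched_getaffinity(0))
    except AttributeError:
        # os.sched_getaffinity does not exist on macOS or Windows.
        # os.cpu_count() can return None, so fall back to 1 in that case.
        count = os.cpu_count() or 1

    if max_cores is not None:
        # Never go below one core, even if the cap is zero or negative.
        count = min(count, max(1, int(max_cores)))

    return max(1, count)


if __name__ == "__main__":
    print("Usable CPUs:", available_cpu_count())
    print("Capped at 2:", available_cpu_count(max_cores=2))
```

Catching AttributeError covers the same ground as the sys.platform check in option 2 without having to list platforms, so probably only one of the two is needed.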

And at the same time, it might be worth letting multithread be set by command line instead of/as well as the json configs.
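One possible shape for that, assuming the JSON config exposes a boolean under a "multithread" key and that the flags are called --multithread / --no-multithread and --max-cores (all of these names are placeholders, and argparse.BooleanOptionalAction needs Python 3.9+):

```python
import argparse


def resolve_threading_options(json_config, argv=None):
    """Hypothetical helper: let command line flags override the JSON config.

    Returns (multithread, max_cores). The "multithread" key and both flag
    names are assumptions made for this sketch.
    """
    parser = argparse.ArgumentParser(add_help=False)
    # default=None lets us tell "flag not given" apart from an explicit choice.
    parser.add_argument(
        "--multithread",
        action=argparse.BooleanOptionalAction,
        default=None,
    )
    parser.add_argument("--max-cores", type=int, default=None)

    # parse_known_args ignores the pipeline's other arguments.
    args, _unknown = parser.parse_known_args(argv)

    if args.multithread is None:
        # Nothing on the command line, so keep whatever the JSON config says.
        multithread = bool(json_config.get("multithread", False))
    else:
        multithread = args.multithread

    return multithread, args.max_cores


if __name__ == "__main__":
    config = {"multithread": True}
    print(resolve_threading_options(config, ["--no-multithread"]))
    print(resolve_threading_options(config, ["--max-cores", "4"]))
```

Here the command line wins whenever a flag is given, and the JSON value is only used as the default. Whether that's the right precedence is one of the decisions to make.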

For each of these options there are some decisions to be made on where to put the helper function, and when to call it.

All in all not too much work, but too much for #32.