Closed (kosmitive closed 1 year ago)
I'm not sure I follow. What's the "expected number of cores"? If you mean checking whether the user is oversubscribing the system, this would be useful but seems hard to do reliably. One would need a separate thread counting child processes and comparing the count with n_jobs, but the user might legitimately want to start 10 jobs each using 4 cores, in order to use 40 vCPUs. Alternatively one could compare the number of child processes with num_available_cpus() or whatever, but then again, this does not protect against multithreaded parallelization inside worker processes, e.g. because of linear algebra libraries. In the end it seems better to educate users about the complexity of parallelising, with better and more thorough documentation, and lots of repetition in as many places as possible.
Exact checks might be difficult. Warnings for potential oversubscriptions:
All in all, I think we need to educate users better. Automagic problem finding feels like a rabbit hole.
We should test and spit out a warning if the expected number of cores exceeds the number of physical cores, e.g. because of multi-threaded multi-processing.
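For what it's worth, here is a rough sketch of what such a best-effort warning could look like. It is not an existing API of any library: `warn_if_oversubscribed` and `threads_from_env` are hypothetical names, and the estimate is simply n_jobs times the threads per worker. It assumes the per-worker thread count is either passed in or readable from `OMP_NUM_THREADS`, which is exactly the part that is hard to know reliably. Note also that `os.cpu_count()` reports logical CPUs, not physical cores.

```python
import os
import warnings


def threads_from_env(default=1):
    """Guess threads per worker from OMP_NUM_THREADS (hypothetical helper).

    Falls back to `default` when the variable is unset or not a plain
    integer. This is optimistic: many linear algebra libraries use all
    cores when the variable is unset.
    """
    value = os.environ.get("OMP_NUM_THREADS", "")
    try:
        return max(1, int(value))
    except ValueError:
        return default


def warn_if_oversubscribed(n_jobs, threads_per_worker=None, available=None):
    """Warn if n_jobs * threads_per_worker exceeds the available CPUs.

    Returns True if a warning was emitted, False otherwise.
    """
    if threads_per_worker is None:
        threads_per_worker = threads_from_env()
    if available is None:
        # Logical CPU count; may be None on some platforms.
        available = os.cpu_count() or 1

    expected = n_jobs * threads_per_worker
    if expected > available:
        warnings.warn(
            f"Possible oversubscription: {n_jobs} workers x "
            f"{threads_per_worker} threads = {expected} threads, "
            f"but only {available} CPUs are available.",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False
```

With this sketch, `warn_if_oversubscribed(10, 4, available=8)` warns because 40 > 8, while the legitimate "10 jobs each using 4 cores on 40 vCPUs" case, `warn_if_oversubscribed(10, 4, available=40)`, stays silent. It still cannot see threads spawned by libraries that ignore `OMP_NUM_THREADS`, so it would only ever be a hint, not an exact check.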