HenrikBengtsson / parallelly

R package: parallelly - Enhancing the 'parallel' Package
https://parallelly.futureverse.org

Linux: detectCores(logical = FALSE) is always equal to detectCores(logical = TRUE) #92

Open HenrikBengtsson opened 1 year ago

Issue

On Linux systems(*), parallel::detectCores() ignores the logical argument. For example, on my notebook, which has four physical CPU cores with two hardware threads ("processors") each, I get:

> parallel::detectCores(logical = FALSE)
[1] 8
> parallel::detectCores(logical = TRUE)  ## default
[1] 8

I think the former should really be 4.

FWIW, help("detectCores", package = "parallel") acknowledges this:

logical | Logical: if possible, use the number of physical CPUs/cores (if FALSE) or logical CPUs (if TRUE). Currently this is honoured only on macOS, Solaris and Windows.

(*) Here, a Linux system is one where R.version$os starts with "linux". For example, in R 4.2.2 on my Ubuntu 20.04 system, R.version$os == "linux-gnu".

Troubleshooting

In both cases, detectCores() uses the following system(..., intern = TRUE) call on Linux:

$ grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l
8

to get the number of cores, regardless of the value of logical.

If I look at lscpu (not installed on all systems), I get:

$ lscpu 
…
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
…

on my machine. That suggests there are 8 logical CPUs and 4 cores per socket * 1 socket = 4 physical cores. We can query the raw data for this from /proc/cpuinfo as:

$ cat /proc/cpuinfo | grep -E "^(processor|core id)"
processor   : 0
core id     : 0
processor   : 1
core id     : 1
processor   : 2
core id     : 2
processor   : 3
core id     : 3
processor   : 4
core id     : 0
processor   : 5
core id     : 1
processor   : 6
core id     : 2
processor   : 7
core id     : 3

which explains:

$ grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l
8

for counting the number of logical CPU cores. However, for the physical ones, I think we should count the number of unique core IDs, which on a single-socket machine like this one gives:

$ grep "^core id" /proc/cpuinfo 2>/dev/null | sort -u | wc -l
4
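
One caveat with counting "core id" lines alone: core IDs are only unique within a CPU package, so on a multi-socket machine the count above would come out too low. A minimal sketch that counts unique (physical id, core id) pairs instead is below. The helper name count_physical_cores is hypothetical, and falling back to the "processor" count when no "core id" field is present is my assumption (some architectures do not expose that field in /proc/cpuinfo); neither is something detectCores() does today.

```shell
#!/bin/sh
# Hypothetical helper (not part of R or parallelly): count physical cores
# as the number of unique (physical id, core id) pairs in a cpuinfo file.
# Usage: count_physical_cores [path]   (path defaults to /proc/cpuinfo)
count_physical_cores() {
    cpuinfo="${1:-/proc/cpuinfo}"
    awk -F': *' '
        # A new "processor" stanza starts: count it and reset the package id
        /^processor/   { nproc++; pkg = "" }
        # Socket (package) this logical CPU belongs to, if reported
        /^physical id/ { pkg = $2 }
        # Record the (package, core) pair; duplicates collapse in the array
        /^core id/     { seen[pkg ":" $2] = 1 }
        END {
            n = 0
            for (k in seen) n++
            # Assumed fallback: no "core id" fields, report logical count
            if (n == 0) n = nproc + 0
            print n
        }
    ' "$cpuinfo"
}
```

Applied to the processor/core id listing shown earlier, this gives 4, in agreement with lscpu. Doing the de-duplication inside awk avoids the sort -u | wc -l pipeline and keeps the two fields paired per logical CPU.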

Action

So, this raises the question: why doesn't parallel::detectCores() do this? I'm pretty sure this has been discussed somewhere before. The first task is to identify any discussions and rationales for the current implementation.

See also