tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

Cores and Logical Processors #119

Closed (aavanesy closed 1 year ago)

aavanesy commented 3 years ago

Hi,

I just discovered the package, and it is the solution I have been looking for over the last couple of years. Thank you very much for this!

My question is about cores vs. logical processors.

My computer has 4 cores and 8 logical processors, but the results are identical with 4 and 8 clusters. Is this normal, and will the package eventually be able to use all 8 processors?

When I use doSNOW for parallel computing, I use up to 20 clusters locally, and there is a significant difference between using 4 and 20.

I was hoping for something similar here.

hadley commented 1 year ago

Unfortunately this is mostly out of multidplyr's hands: it creates a bunch of processes, and it's up to the OS how they are spread across cores/processors.
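
For readers who land on this thread: a minimal sketch, not taken from the discussion above, of how one might check what the OS reports before choosing a worker count. It uses `parallel::detectCores()`, which is part of base R. The helper `pick_workers()` is a hypothetical name introduced here for illustration. Sizing the cluster to the physical core count is an assumption about a reasonable default, not a recommendation from the package authors.

```r
library(parallel)

# Physical cores vs. logical processors as reported by the OS.
# detectCores() can return NA on platforms where the count is unknown.
physical     <- detectCores(logical = FALSE)
logical_cpus <- detectCores(logical = TRUE)

cat("physical cores:    ", physical, "\n")
cat("logical processors:", logical_cpus, "\n")

# Hypothetical helper: turn a core count into a worker count,
# falling back to a single worker when the count is unknown.
pick_workers <- function(n_cores) {
  if (is.na(n_cores) || n_cores < 1) 1L else as.integer(n_cores)
}

n_workers <- pick_workers(physical)

# With multidplyr installed, a cluster of that size could then be created:
# cluster <- multidplyr::new_cluster(n_workers)
```

Whether going from the physical to the logical count helps depends on the workload and on how the OS schedules the worker processes, so timing both settings on your own data is the only reliable check.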