Open billytcl opened 1 year ago
data.table uses 50% of the available logical cores by default. You can raise this limit, e.g. by setting Sys.setenv(R_DATATABLE_NUM_PROCS_PERCENT="90") before data.table is first loaded.
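For instance, a minimal sketch (the variable is read when data.table loads, so it must be set first, in a fresh session):

```r
# Sketch: raise the thread cap to 90% of logical cores. This env var
# is only consulted when data.table is loaded, so set it beforehand.
Sys.setenv(R_DATATABLE_NUM_PROCS_PERCENT = "90")
library(data.table)
getDTthreads(verbose = TRUE)  # prints the thread count data.table will use
```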
There may be steps in fread that are not parallelized. For example, if your file has character columns, a lot of time will be spent single-threaded. I suggest running forder (or frollmean with algo="exact") in a loop and observing top.
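For example, a sketch of such a loop using frollmean (vector size, window, and iteration count are arbitrary; watch the process in top or htop while it runs):

```r
library(data.table)
setDTthreads(0L)                 # 0 = use all threads data.table is allowed
x <- rnorm(1e7)                  # arbitrary large numeric vector
for (i in 1:10) {
  # algo = "exact" forces the slower, parallel computation path,
  # so the threads stay busy long enough to observe CPU% in top
  y <- frollmean(x, n = 1000L, algo = "exact")
}
```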
From the output it looks like data.table is using 3 threads out of 6 on that cluster node, so I'm not sure this is a problem with data.table, and you may consider closing the issue. When using SLURM you can tell data.table to use all SLURM CPUs via
data.table::setDTthreads(as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE", "1")))
When using 3 threads you would get at best a 3x speedup relative to a single thread, and only in an ideal case. Related to #2687, we should add some docs to clarify how exactly OpenMP is used, so people can have realistic expectations of when speedups should happen.
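That ceiling is just Amdahl's law: if only a fraction p of the work is parallelizable, n threads give a speedup of 1 / ((1 - p) + p / n). A base-R sketch (the 90% parallel fraction is an invented value for illustration):

```r
# Amdahl's law: speedup with n threads when fraction p of the work
# runs in parallel. p = 0.9 is a made-up illustrative value.
amdahl <- function(p, n) 1 / ((1 - p) + p / n)

amdahl(1.0, 3)  # ideal case: exactly 3x with 3 threads
amdahl(0.9, 3)  # with 10% serial work: only 2.5x
```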
In fread.c the only instance of #pragma omp for I see is
#pragma omp for ordered schedule(dynamic) reduction(+:thRead,thPush)
for (int jump = jump0; jump < nJumps; jump++) {
but I am not an expert on fread, so I am not sure what exactly happens in this for loop, or whether using several threads in it should result in big speedups.
> When using SLURM you can tell data.table to use all SLURM CPUs via
> data.table::setDTthreads(as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE", "1")))
Note that SLURM_JOB_CPUS_PER_NODE may hold multi-host values, e.g. 4,8 or 10,2(x3), depending on what parallel resources the Slurm job requested. If you're interested in the number of CPUs allotted on the current machine, I think you want SLURM_CPUS_ON_NODE instead, which holds an integer scalar.
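To see the difference, a sketch with made-up example values mimicking a multi-node job (10 CPUs on one node, 2 CPUs on each of three others):

```r
# Invented example values for illustration; Slurm sets these for real jobs.
Sys.setenv(SLURM_JOB_CPUS_PER_NODE = "10,2(x3)",
           SLURM_CPUS_ON_NODE      = "10")

as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE"))  # NA with a warning: not a plain integer
as.integer(Sys.getenv("SLURM_CPUS_ON_NODE"))       # 10: safe to pass to setDTthreads()
```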
I'm using data.table on a SLURM cluster and for some reason it's having trouble using multiple cores on something as simple as fread, even though it's detecting them when loading the library. The file is a 46GB tab-delimited file in 4-column long format.
When I ssh into the node, it's not even using all of the CPUs:
I can verify that when I use it on our personal workstations it is using multiple threads. How should I go about troubleshooting this? My guess is that SLURM/R/data.table are having some kind of weird interaction that is not provisioning the CPUs properly.
Output of sessionInfo()