laleoarrow opened this issue 3 months ago
Hi @laleoarrow, I am not sure what is going on, but I suspect that some of the steps you are doing do not parallelize well due to non-exportable objects (see https://cran.r-project.org/web/packages/future/vignettes/future-4-non-exportable-objects.html).
Can you also try the future backend and see if that helps?
```r
library(future)
cl <- parallel::makeCluster(n)
plan(cluster, workers = cl)
r2 <- pblapply(..., cl = "future")
parallel::stopCluster(cl)
```
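A related tip from the vignette linked above: the `future.globals.onReference` option makes `future` inspect the globals it exports and raise an error when one of them carries a non-exportable reference (external pointer, connection, etc.). A minimal sketch:

```r
library(future)

# Make future error out if any exported global holds a
# non-exportable reference (default is "ignore").
options(future.globals.onReference = "error")

plan(multisession, workers = 2)
f <- future(1 + 1)   # would fail here if a global held a non-exportable reference
v <- value(f)
plan(sequential)     # shut the workers down again
```

If this errors on your real code, the message names the offending global, which is usually the fastest way to find what is blocking parallelization.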
@psolymos Thanks for your prompt response! I tried to replace the original code with the future backend as follows:
```r
plan(multisession, workers = 10)  # plan(sequential)
options(future.globals.maxSize = 10 * 1024^3)  # i.e., 10 GB; should take ~1 GB for my objects in theory
res_stage1s <- pblapply(length(gmb_files):1, cl = "future", FUN = function(j) {  # future parallelization for the outer loop
  #------------------ outer loop ------------------#
  ...
  gmb_file <- fread(gmb) %>% mutate_at("CHR", as.integer)  # %>% reduce_data(loci, win = 500)  # load a file in the outer loop
  #------------------ inner loop ------------------#
  res_one_gmbs <- pblapply(1:nrow(loci), function(i) {  # no parallelization for the inner loop
    ...
```
Although RAM usage does seem to rise more slowly than before, the issue unfortunately persists, and the whole process still gets stuck somewhere in the middle.
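One workaround worth trying when multisession workers keep accumulating memory: process the files in chunks and reset the plan between chunks, so the worker processes are restarted and their memory is returned to the OS. A sketch under stated assumptions (`n_files` and the `j^2` body are placeholders for `length(gmb_files)` and the real outer-loop work):

```r
library(future)
library(pbapply)

n_files <- 40                                       # stand-in for length(gmb_files)
idx <- n_files:1
chunks <- split(idx, ceiling(seq_along(idx) / 20))  # e.g. 20 files per chunk

res <- vector("list", n_files)
for (ch in chunks) {
  plan(multisession, workers = 2)                   # fresh workers for this chunk
  res[ch] <- pblapply(ch, cl = "future", FUN = function(j) {
    j^2                                             # placeholder for the real outer-loop body
  })
  plan(sequential)                                  # shuts the workers down, freeing their RAM
}
```

This trades some startup overhead per chunk for a hard upper bound on how long any single worker process lives.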
Hi, thanks for developing such a great tool! However, I ran into some problems when using pbapply to accelerate a large (kinda?) loop. I did add `gc()` and `rm()` calls in the loop, but it doesn't help, and it runs slower and slower as the iteration count grows. Each CPU core seems to occupy more and more RAM than one iteration should need: in the beginning each thread took ~9 GB of RAM, and then it grew without shrinking back.
Maybe I am misunderstanding pbapply somewhere, but I could not locate the problem, so any thoughts or suggestions would be appreciated! Here is the code I use. (The code is running on an Apple Silicon MacBook Pro (2023) with 128 GB of RAM.)