HenrikBengtsson / doFuture

:rocket: R package: doFuture - Use Foreach to Parallelize via Future Framework
https://doFuture.futureverse.org

Very slow scheduling #47

Open rimorob opened 4 years ago

rimorob commented 4 years ago

I'm sorry not to have a reproducible example yet. My code base is very large and was running just fine until the jobs became small; in my case each job takes roughly 10 seconds. The problem I'm seeing is that they don't get scheduled very quickly: at any given time only one Slurm job, or at most two, is running (the machine they run on can handle ~15 jobs given its RAM and CPU). I'm trying to run 60 chunks, and to mitigate this I set `scheduling` to 5, which did bump the number of running jobs to 2-3. The main problem, however, is that each chunk seems to take 10-15 seconds to launch, and I don't know what I changed; a few days ago, with larger jobs, this was not the case. So, my specific questions before I try to produce a small reproducible example:

  1. Is there a verbose/trace flag? I can't find one in the documentation.
  2. Is there anything to pay attention to beyond the `scheduling` option? Any hunch as to what may be causing this, so I know where to start when building an example?
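Regarding the verbose/trace question: the future framework itself has a global debug option, `future.debug` (an option of the future package, not doFuture-specific). A minimal sketch of turning it on for a small foreach loop:

```r
# Enable verbose diagnostics from the future framework; each future will
# log which globals are identified, how they are exported, and when it
# is launched and resolved.
options(future.debug = TRUE)

library(doFuture)
registerDoFuture()
plan(multisession, workers = 2)

# A tiny foreach loop; with future.debug = TRUE the console shows the
# per-chunk scheduling and export steps.
res <- foreach(i = 1:4) %dopar% sqrt(i)
```

The per-chunk timestamps in that output should make it visible where the 10-15 seconds per launch are being spent.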
rimorob commented 4 years ago

Just to add some detail: I think I've come a bit closer to the source of the problem and may have found an inefficiency in scheduling. First, the exported data set is rather large, possibly as large as 100 MB. Second, the drive holding the `.future` directory is small and went from 87% full before the job started to 100% when it crashed (that being the cause of the crash). There's a very good chance the slowdown has to do with the home folder filling up. Assuming that this was in fact the problem, I want to note a few things, with the caveat that I know little about the future.batchtools internals and how doFuture interacts with them. When I run a number of chunks that all share the exported data, there seem to be ~100 copies of the 100 MB object staged in the same folder before being pushed to the remote node. Would it be possible to write a single copy? Another possibility is that these stored files belong to separate (and many) consecutive foreach runs. If so, foreach doesn't seem to clean up after itself properly, and doFuture doesn't seem to expose a cleanup command (an equivalent of stopCluster()). Does any of this seem like a possible cause of the problem?
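One way to quantify how much data each chunk would stage is to measure the exported object directly; the `future.globals.maxSize` option caps the allowed size of globals per future. A sketch, where `bigData` is a hypothetical stand-in for the real exported object:

```r
library(future)

# Hypothetical large object that every chunk needs.
bigData <- matrix(rnorm(1e6), nrow = 1000)

# Approximate in-memory size of what each future would stage; with a
# batchtools backend each future gets its OWN serialized copy on disk.
print(object.size(bigData), units = "MB")

# Raise the per-future limit on exported globals (500 MB by default in
# recent versions of future). Note this does not deduplicate copies;
# it only controls when future refuses to export.
options(future.globals.maxSize = 1024^3)  # 1 GB
```

So with 60 chunks and a ~100 MB shared object, on the order of 6 GB of staged files is plausible, which matches the disk filling up.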

rimorob commented 4 years ago

Sorry to keep adding to this thread, but as I continue to research what's going on, other things come to light. I found this illuminating thread (thank you, Henrik): https://github.com/HenrikBengtsson/doFuture/issues/34. Apparently the default in recent versions of doFuture is to clean up after the job finishes (`future.delete = TRUE`), whereas earlier that was the only behavior. So I've set this flag to TRUE just in case the default was in fact FALSE (I can't find the flag in the documentation). This didn't help. Moreover, the folders in `.future` accumulate and grow, and the behavior of doFuture is very strange: the first foreach run schedules all jobs and runs quickly. The second and subsequent runs generate a large number of folders under the `.future` root as the iterations proceed, but only one or two batches run in parallel; the rest — the large number that I observed — just sit there waiting while the cluster runs the whole set one at a time. I will definitely need pointers on how to begin troubleshooting, since I don't understand why the runs slow down so dramatically.

rimorob commented 4 years ago

Further update: after a few iterations of foreach, saved objects accumulate and eventually fill the disk. In the early runs, which I monitored for an hour, all files were being deleted despite the slowness, but after a while the `.future` folder contains 12 full runs of my foreach loop (each run being 60 jobs, so roughly 720 jobs in total). There's some sort of cleanup issue that I would welcome help troubleshooting.
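Until the root cause is found, the leftover per-run folders can be inspected and removed by hand. A sketch (adjust the path to your setup, and only delete folders from runs that are no longer active):

```r
# List the per-run staging folders left under .future/ in the
# working directory.
stale <- list.dirs(".future", recursive = FALSE)
print(stale)

# Report how much disk they occupy before deleting anything.
sizes <- vapply(stale, function(d) {
  sum(file.size(list.files(d, recursive = TRUE, full.names = TRUE)),
      na.rm = TRUE)
}, numeric(1))
cat(sprintf("Total staged: %.1f MB\n", sum(sizes) / 1024^2))

# Remove them once you are sure no jobs are still using them.
unlink(stale, recursive = TRUE)
```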

rimorob commented 4 years ago

I have been able to reproduce the problem in small scripts, and also to narrow it down quite a lot. Specifically, the following order of operations works:

```r
future::plan(...)  # set the plan once, at the top level

for (i in 1:10) {
  foreach(j = 1:20) %dopar% {
    # run with metaparameter[i]
  }
}
```

The following orders of operations DO NOT work; they match the attached scripts 2.1, 2.2, and 2.3 respectively. testCluster.R works. clusterTest.zip

1.

```r
for (i in 1:10) {
  future::plan(...)
  foreach(j = 1:20) %dopar% {
    # run with metaparameter[i]
  }
}
```

2.

```r
foo <- function(i) {
  future::plan(...)
  foreach(j = 1:20) %dopar% {
    # run with metaparameter[i]
  }
}
for (i in 1:10) {
  foo(i)
}
```

3.

```r
foo <- function(i) {
  foreach(j = 1:20) %dopar% {
    # run with metaparameter[i]
  }
}
future::plan(...)
for (i in 1:10) {
  foo(i)
}
```

Chances are I need some variant of #3, but in that case some state set by plan() is not visible to doFuture, and everything runs locally in a single process.
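A way to check which strategy is actually in effect inside the function is to query the plan rather than set it — calling plan() with no arguments returns the current strategy without changing it. A sketch (the body of the loop is a hypothetical stand-in for the real work):

```r
library(future)
library(doFuture)
registerDoFuture()

plan(multisession, workers = 2)

foo <- function(i) {
  # plan() with no arguments returns the currently active strategy,
  # so we can verify that the multisession plan is visible in here.
  message("strategy in foo(): ", class(plan())[1])
  foreach(j = 1:3) %dopar% sqrt(j)
}

for (i in 1:2) foo(i)
```

If this prints `sequential` instead of `multisession`, the plan really is being lost between the top level and the function call.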


HenrikBengtsson commented 4 years ago

Thanks for this. Could you please update to make use of Markdown code blocks to make this a bit easier to read? See https://guides.github.com/features/mastering-markdown/ (= the 'M↓' icon in the lower right of every comment field here) - click 'code' panel under Section 'Examples'.

rimorob commented 4 years ago

Done. Any thoughts on the reason for the problem? In summary, main symptoms are slowdown (one job at a time) and failure to clean up files in .future folder.

rimorob commented 4 years ago

I have a (hackish) solution. My suspicion was that plan() stores state in a variable in the parent environment — and it does; I can see it in the %plan% function. Therefore, instead of invoking future::plan() directly, I call

```r
uberPlan <<- future::plan()
# ...
myFunction <- function() {
  plan(uberPlan)
  foreach(...) %dopar% {
    # ...
  }
}
for (i in 1:10) {
  myFunction()
}
```

and this solves the problem! So this is a variant of #3, but with a globally stored plan. What's weird is that if I try to create a new plan each time, jobs collide and fill up the hard drive. This looks like a real (and serious) bug; and my workaround of stashing the plan in a global variable is quite hackish and should probably be addressed inside the plan() function somehow. In any case, I think this narrows the scope of work for the package and provides a work-around in the meantime.

rimorob commented 4 years ago

I wrote too soon: my "fix" results in always-local execution.

rimorob commented 4 years ago

And now it works. The work-around is ugly: the first invocation of plan() returns the sequential strategy, so I have to call it twice to get the "latest and greatest" value of the stack variable from the package environment. It would really help to have an accessor method for the current stack after plan() is executed. That said, apparently my strategy of making the plan a global variable is exactly what's implemented under the hood... hmm...
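For what it's worth, recent versions of the future package do expose accessors along these lines — plan() with no arguments returns the active strategy without changing it, and plan("list") returns the whole stack (behavior may differ in older releases, so treat this as an assumption). A sketch:

```r
library(future)

# Set a nested plan: an outer strategy plus an inner one.
plan(list(multisession, sequential))

# Query without modifying: the currently active (outermost) strategy...
active <- plan()
message(class(active)[1])

# ...and the full stack of strategies.
stack <- plan("list")
message(length(stack))

# plan(new) returns the previous plan invisibly, so a plan can be
# captured and restored without a global variable.
old <- plan(sequential)
plan(old)
```

Capturing the return value of plan(new) and restoring it later is the non-hackish version of the global-variable workaround above.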