Open smilesun opened 7 years ago
An easy solution would be to save intermediate results following this example:
save.file = "~/mboState_run01.RData" # a file that can be accessed from all nodes on the cluster
ctrl = makeMBOControl(save.on.disk.at = 0L:50L, save.file.path = save.file)
ctrl = setMBOControlTermination(ctrl, iters = 50L)
or = mbo(f, control = ctrl)
# after the job has timed out, resume from the saved state
or = mboContinue(save.file)
# or, if you don't want to spend the remaining budget on further optimization, just call
or = mboFinalize(save.file)
Side question: Do you use batchtools or BatchJobs?
Thanks, I did not know that one could provide a file in "makeMBOControl". Yes, I am using batchtools for tuning ML hyperparameters. Now I need to find a way to organize maybe 100 files and continue them afterwards in another call.
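One possible way to handle many save files is sketched below. It assumes every run wrote its state to its own `.RData` file in one shared directory. The helper name `continue_all` and its `resume` argument are hypothetical and not part of mlrMBO; only `mboContinue` and `mboFinalize` come from the example above.

```r
# Hypothetical helper (not part of mlrMBO): resume every saved state in a directory.
# `resume` defaults to mlrMBO::mboContinue; pass mlrMBO::mboFinalize to stop early instead.
continue_all = function(state.dir, resume = mlrMBO::mboContinue) {
  # Collect all state files; anything not ending in .RData is ignored.
  files = list.files(state.dir, pattern = "\\.RData$", full.names = TRUE)
  results = lapply(files, function(f) {
    # Keep going if one state file is broken; the error object is stored instead.
    tryCatch(resume(f), error = function(e) e)
  })
  names(results) = basename(files)
  results
}
```

A call such as `results = continue_all("~/mbo_states")` would then return one entry per file, named after the file, so failed runs can be picked out with `inherits(results[[i]], "error")`.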
You should use the imputation mechanism of mbo. I can send you some example code tomorrow.
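Until that example code is posted, here is a minimal sketch of what the imputation mechanism might look like. It assumes the `impute.y.fun` argument of `makeMBOControl`, which, as far as I know, is called with `x`, `y` and `opt.path` when an evaluation fails or returns NA/NaN/Inf. The function name `impute_penalty` and the penalty value `1e6` are made up for illustration and assume a minimization problem.

```r
# Hypothetical imputation function: return a fixed penalty whenever the
# objective evaluation failed or produced NA/NaN/Inf.
# Assumption: the objective is minimized and 1e6 is worse than any real value.
impute_penalty = function(x, y, opt.path, ...) {
  1e6
}

# Only build the control object if mlrMBO is installed.
if (requireNamespace("mlrMBO", quietly = TRUE)) {
  ctrl = mlrMBO::makeMBOControl(impute.y.fun = impute_penalty)
}
```

Note that this only helps when the evaluation fails inside R. If the scheduler kills the whole R process, no imputation function gets a chance to run, so the save file approach above is still needed.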
The problem is deeper.
a) The solution from @jakob-r doesn't work that well: I want to run my point evaluation in a separate process. The continuation procedure is more of a last resort; I want something robust.
b) There is the runexec tool that we really should support very soon. It would solve our problem on all systems once and for all. https://github.com/sosy-lab/benchexec
c) @smilesun can you use batchtools to generate your points? On a cluster this would ensure that each evaluation runs in a separate process. Please post a MINIMAL example so we can look at that.
Notes for discussion
Solution 1: set the time limit directly
Solution 2: parallel xgboost (allocate multiple cores for one job)
Solution 3: multiple points (allocate multiple cores for one job)
Suppose one runs hyper-parameter optimization with mlrMBO on a cluster and, due to resource limits (a memory limit or time limit, for example), the scheduling system has to kill the process. In this case, is there an easy way to write an exception-handling code snippet that still returns the current best result?