metrumresearchgroup / bbr

R interface for model and project management
https://metrumresearchgroup.github.io/bbr/

Feat/batch submit #683

Closed: seth127 closed 2 months ago

seth127 commented 2 months ago

This adds functionality for submitting models in batches. Initially, it is implemented only for the new nm_boot_model class (bootstrap models); however, it may be added to submit_models() more generally at some point.

Here is some example code for ad-hoc testing, as this develops:

batch submission example code

```r
# ad-hoc testing for batch submission
#
# assumes the following:
# * you're in the bbr repo on feat/batch_submit branch
# * you have Expo1 cloned and the path specified below

Sys.setenv('BBR_DEV_LOAD_PATH' = here::here())
devtools::load_all()

#### create testing directory
TEST_DIR <- "~/metworx_testing/batch_test"
EXPO_DIR <- "~/example-projects/bbr-nonmem-poppk-foce/"

# copy in data and two models: one for single-CPU and one to parallelize
MODEL_DIR <- file.path(TEST_DIR, "model", "pk")
DATA_DIR <- file.path(TEST_DIR, "data", "derived")
if (!fs::dir_exists(MODEL_DIR)) {
  fs::dir_create(MODEL_DIR)
  fs::file_copy(
    file.path(EXPO_DIR, c("model/pk/bbi.yaml", "model/pk/106.ctl", "model/pk/106.yaml", "model/pk-parallel/200.ctl", "model/pk-parallel/200.yaml")),
    MODEL_DIR
  )
}
if (!fs::dir_exists(DATA_DIR)) {
  fs::dir_create(DATA_DIR)
  fs::file_copy(file.path(EXPO_DIR, "data/derived/pk.csv"), DATA_DIR)
}

#### Run quick bootstrap, threads=1 (per model)
mod1 <- read_model(file.path(MODEL_DIR, "106"))
.boot_run1 <- new_bootstrap_run(mod1, .overwrite = TRUE)
.boot_run1 <- setup_bootstrap_run(.boot_run1, n = 100, seed = 1234)
submit_model(.boot_run1, .overwrite = TRUE, .batch_size = 25)

# open the log file
file.edit(file.path(get_output_dir(.boot_run1), "OUTPUT"))

# check on total count of finished models
get_model_status(.boot_run1)

#### Run longer bootstrap, threads=4 (per model)
mod2 <- read_model(file.path(MODEL_DIR, "200"))
.boot_run2 <- new_bootstrap_run(mod2, .overwrite = TRUE)
.boot_run2 <- setup_bootstrap_run(.boot_run2, n = 100, seed = 1234)
submit_model(.boot_run2, .overwrite = TRUE, .batch_size = 25, .bbi_args = list(threads = 4, parallel = TRUE))

# open the log file
file.edit(file.path(get_output_dir(.boot_run2), "OUTPUT"))

# check on total count of finished models
get_model_status(.boot_run2)
```

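For intuition about what `n = 100` with `.batch_size = 25` implies, here is a minimal base-R sketch of partitioning model indices into batches. `split_into_batches()` is a hypothetical helper written for illustration only; it is not part of bbr and says nothing about how bbr schedules or polls batches internally. The peak-core figure assumes one full batch runs at a time with 4 threads per model.

```r
# Hypothetical helper (not part of bbr): partition model indices into
# consecutive batches of at most `batch_size` models each.
split_into_batches <- function(n_models, batch_size) {
  stopifnot(n_models >= 1, batch_size >= 1)
  idx <- seq_len(n_models)
  # ceiling(idx / batch_size) assigns batch number 1, 2, ... to each index
  unname(split(idx, ceiling(idx / batch_size)))
}

batches <- split_into_batches(n_models = 100, batch_size = 25)
length(batches)   # 4 batches
lengths(batches)  # 25 25 25 25

# Assumption: one full batch runs at a time, 4 threads per model
max(lengths(batches)) * 4  # 100 cores at peak
```

When the model count is not a multiple of the batch size, the last batch is simply shorter, e.g. `lengths(split_into_batches(7, 3))` gives `3 3 1`.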
seth127 commented 2 months ago

Note that I tested killing the parent process. The child process did complete, but the following shows up at the bottom of the log (OUTPUT) file:

...
100 model(s) have finished
0 model(s) are incomplete
Finished 100 models from batch submission
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning messages:
1: In file(file, mode) :
  cannot open file '/tmp/Rtmp2Iu8ww/callr-res-381f78d3d0c6': No such file or directory
2: In gzfile(file, mode) :
  cannot open compressed file '/tmp/Rtmp2Iu8ww/callr-res-381f78d3d0c6.error', probable reason 'No such file or directory'

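This is essentially the same error originally reported in https://github.com/r-lib/callr/issues/183, which makes sense: the callr result files named in the warnings live under an R temporary directory (/tmp/Rtmp2Iu8ww) that presumably belonged to the killed parent and no longer exists when the child finishes. We may want to look into whether there's an easy way to catch that, though, since it might look alarming to a user when in fact it's expected and not a problem.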
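One possible shape for catching it, sketched in base R only. `quiet_missing_result()` is a hypothetical wrapper, not something in bbr or callr, and where exactly the error is raised in the batch-submission flow has not been established here, so this only illustrates the pattern: downgrade the specific "cannot open the connection" error to a message and re-raise anything else. Note that matching on the message text assumes an English locale.

```r
# Hypothetical wrapper (not part of bbr or callr): evaluate `expr`, turning
# the specific "cannot open the connection" error into a message and
# re-raising every other error unchanged.
quiet_missing_result <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      # Assumption: error messages are in English (not translated)
      if (grepl("cannot open the connection", conditionMessage(e), fixed = TRUE)) {
        message("Result file is no longer available; ignoring.")
        invisible(NULL)
      } else {
        stop(e)
      }
    }
  )
}

# Demo: reading a file that does not exist triggers the same kind of error.
# suppressWarnings() only hides the accompanying "cannot open compressed file"
# warning in this demo.
missing_file <- file.path(tempdir(), "callr-res-does-not-exist")
res <- suppressWarnings(quiet_missing_result(readRDS(missing_file)))
is.null(res)
```

Matching on the message keeps unrelated failures visible, at the cost of being brittle if the wording changes.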

seth127 commented 2 months ago

After discussion with @barrettk, and some initial ad-hoc testing from @kyleam, we are going to merge this into feat/bootstrap. Official review of this feature (and associated test code) will happen as part of https://github.com/metrumresearchgroup/bbr/pull/671

A note on CI: CI on this branch is failing because the branch was created right before a test fix was made. Note that the PR run in CI is clean.